Solved

How to save a web page with its url as file name

Posted on 2011-02-22
Last Modified: 2012-05-11
Hi,

I want to save a web page using its URL as the file name. However, the URL contains many characters that neither Ubuntu nor Windows file systems accept in a file name. For example:

http://www.bestbuy.com/site/HP+-+Laptop+/+AMD+Phenom%26%23153%3B+II+Processor+/+15.6%22+Display+/+3GB+Memory+/+320GB+Hard+Drive+-+Biscotti/1945374.p?id=1218301987141&skuId=1945374

How can I convert such a URL to an acceptable file name?

Thanks
Question by:wsyy
10 Comments
 
LVL 47

Assisted Solution

by:for_yan
for_yan earned 248 total points
ID: 34958500
 
LVL 40

Accepted Solution

by:
Gurvinder Pal Singh earned 252 total points
ID: 34958526
 

Author Comment

by:wsyy
ID: 34958570
for_yan,

What if the URL contains Chinese characters, and how will encoding with "utf-8" affect the result?

Was "utf-8" picked arbitrarily, or should I detect the encoding of the URL first?

 
LVL 47

Expert Comment

by:for_yan
ID: 34958584
You don't need to use UTF:
This explanation is from the first link which gurvinder posted:


The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
The special characters ".", "-", "*", and "_" remain the same.
The space character " " is converted into a plus sign "+".
All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme to use is UTF-8. However, for compatibility reasons, if an encoding is not specified, then the default encoding of the platform is used.
For example using UTF-8 as the encoding scheme the string "The string ü@foo-bar" would get converted to "The+string+%C3%BC%40foo-bar" because in UTF-8 the character ü is encoded as two bytes C3 (hex) and BC (hex), and the character @ is encoded as one byte 40 (hex).
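The rules quoted above can be tried out with a short sketch like the one below. It assumes the goal is simply to percent-encode the whole URL with `java.net.URLEncoder` and use the result as the file name. The class and method names (`UrlToFileName`, `toFileName`) and the example.com URL are made up for illustration. The extra replacement of `*` is my own addition, since `URLEncoder` leaves `*` unchanged but Windows does not allow it in file names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class UrlToFileName {

    // Hypothetical helper: percent-encodes every character that URLEncoder
    // considers unsafe, so '/', ':', '?', '&' and '=' cannot appear in the result.
    static String toFileName(String url) {
        try {
            String encoded = URLEncoder.encode(url, "UTF-8");
            // URLEncoder keeps '*' as-is, but Windows rejects it in file names.
            return encoded.replace("*", "%2A");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00fc is the character ü from the quoted documentation example.
        System.out.println(toFileName("The string \u00fc@foo-bar"));
        // prints The+string+%C3%BC%40foo-bar

        System.out.println(toFileName("http://www.example.com/a b/page.p?id=1&skuId=2"));
        // prints http%3A%2F%2Fwww.example.com%2Fa+b%2Fpage.p%3Fid%3D1%26skuId%3D2
    }
}
```

The original URL can be recovered later with `URLDecoder.decode(name, "UTF-8")`, as long as the file name contained no `*`. Keep in mind that encoding makes the name longer, and common file systems limit a single file name to roughly 255 characters or bytes, so very long URLs may still need truncating or hashing.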



 

Author Comment

by:wsyy
ID: 34958668
for_yan, thanks for the additional input.

If the URL contains Chinese words, I do want to keep them in the file name. Do I need to use "gb2312" or "gb18030", or can I just keep using "utf-8"?

The reason I ask is that I don't know whether the URL (which comes as input from another application) is encoded in UTF-8 or not.
 
LVL 47

Expert Comment

by:for_yan
ID: 34958698
I'm not sure; you can give it a try. Are Chinese characters OK in the file names?
 

Author Comment

by:wsyy
ID: 34962294
Yes, it is OK to have Chinese characters in the file name.
 
LVL 47

Expert Comment

by:for_yan
ID: 34962321
Then just try both ways. I can't test it myself, since I don't have any Chinese characters available here.
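For anyone who wants to try this, here is a rough sketch. It assumes the URL has already reached Java as a correctly decoded `String`. In that case the charset passed to `URLEncoder.encode` only decides which bytes are written as `%xy`, and the Chinese characters are turned into percent escapes either way. To keep them readable in the file name, one alternative (my assumption, not something the replies above confirm) is to escape only the characters that Windows and Linux forbid. The names `ChineseFileName`, `percentEncoded` and `keepUnicode` are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;

class ChineseFileName {

    // Characters Windows rejects in file names; Linux only rejects '/' and NUL.
    private static final String FORBIDDEN = "\\/:*?\"<>|";

    // "Try both ways": percent-encode with whichever charset name is given.
    static String percentEncoded(String url, String charsetName) {
        try {
            return URLEncoder.encode(url, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    // Alternative: escape only forbidden characters, control characters and '%',
    // so Chinese (and other Unicode) characters stay readable in the file name.
    static String keepUnicode(String url) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (FORBIDDEN.indexOf(c) >= 0 || c == '%' || c < 0x20) {
                // '%' is escaped too, so the mapping stays unambiguous.
                sb.append('%').append(String.format("%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u4e2d\u6587 is the two-character word "Chinese (language)".
        String url = "http://example.com/\u4e2d\u6587?id=1";

        System.out.println(percentEncoded(url, "UTF-8"));
        // The GB18030 charset is not guaranteed to be present in every runtime.
        if (Charset.isSupported("GB18030")) {
            System.out.println(percentEncoded(url, "GB18030"));
        }
        System.out.println(keepUnicode(url));
    }
}
```

The two `percentEncoded` results differ only in the escaped bytes for the Chinese characters, while `keepUnicode` leaves those characters untouched. This sketch does not deal with file name length limits or with names ending in a dot or space, which Windows also dislikes.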
 
LVL 20

Expert Comment

by:Sathish David Kumar N
ID: 34964899
Use Big5.
 
LVL 20

Expert Comment

by:Sathish David Kumar N
ID: 34964909
Big5