Solved

How to save a web page with its URL as the file name

Posted on 2011-02-22
Last Modified: 2012-05-11
Hi,

I want to save a web page using its URL as the file name. However, the URL contains quite a few characters that neither Ubuntu nor Windows file systems accept. For example:

http://www.bestbuy.com/site/HP+-+Laptop+/+AMD+Phenom%26%23153%3B+II+Processor+/+15.6%22+Display+/+3GB+Memory+/+320GB+Hard+Drive+-+Biscotti/1945374.p?id=1218301987141&skuId=1945374

How can I convert such a URL into an acceptable file name?

Thanks
Question by:wsyy
10 Comments
 
LVL 47

Assisted Solution

by:for_yan
for_yan earned 62 total points
ID: 34958500
 
LVL 40

Accepted Solution

by:
gurvinder372 earned 63 total points
ID: 34958526
 

Author Comment

by:wsyy
ID: 34958570
for_yan,

What if the URL contains Chinese characters, and how will encoding with "utf-8" affect the result?

Was "utf-8" picked arbitrarily, or should I detect the encoding of the URL first?
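A minimal sketch of how the charset argument changes the output, assuming the suggestion was Java's java.net.URLEncoder (the rules quoted in this thread match its documentation). The class and method names are made up for illustration. Note that a Java String is already Unicode, so there is nothing to detect on the String itself; detection would only matter when the URL arrives as raw bytes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of any library.
class CharsetEffect {
    // Percent-encodes text using the named charset.
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Charset not available: " + charsetName, e);
        }
    }

    // Reverses encode(); the same charset must be used.
    static String decode(String text, String charsetName) {
        try {
            return URLDecoder.decode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Charset not available: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String zhong = "\u4E2D"; // the Chinese character 中

        // UTF-8 uses three bytes for this character.
        System.out.println(encode(zhong, "UTF-8")); // %E4%B8%AD

        // GB2312 is optional on some runtimes, so check before using it.
        if (Charset.isSupported("GB2312")) {
            System.out.println(encode(zhong, "GB2312")); // %D6%D0
        }

        // Decoding with the same charset restores the original text.
        System.out.println(decode(encode(zhong, "UTF-8"), "UTF-8").equals(zhong)); // true
    }
}
```

So the choice is not random: UTF-8 can represent every Unicode character, whereas a regional charset only covers its own repertoire. What matters is that the same charset is used if the file name is ever decoded back into a URL.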

 
LVL 47

Expert Comment

by:for_yan
ID: 34958584
You don't need to use UTF-8.
This explanation is from the first link that gurvinder posted:


The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
The special characters ".", "-", "*", and "_" remain the same.
The space character " " is converted into a plus sign "+".
All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme to use is UTF-8. However, for compatibility reasons, if an encoding is not specified, then the default encoding of the platform is used.
For example using UTF-8 as the encoding scheme the string "The string ü@foo-bar" would get converted to "The+string+%C3%BC%40foo-bar" because in UTF-8 the character ü is encoded as two bytes C3 (hex) and BC (hex), and the character @ is encoded as one byte 40 (hex).
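The rules quoted above can be tried out with a short sketch, assuming they describe Java's java.net.URLEncoder. The class name, the helper name and the shortened URL are hypothetical. One extra step is an assumption on my part: the encoder leaves "*" unchanged, but Windows does not allow "*" in file names, so the helper replaces it afterwards.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, not part of any library.
class UrlToFileName {
    // Percent-encodes a URL so that the result can serve as a file name.
    static String encode(String url) {
        try {
            // URLEncoder keeps '*', which Windows rejects, so encode it by hand.
            return URLEncoder.encode(url, "UTF-8").replace("*", "%2A");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The example from the quoted documentation (\u00FC is the character ü).
        System.out.println(encode("The string \u00FC@foo-bar"));
        // prints The+string+%C3%BC%40foo-bar

        // A shortened, made-up variant of the URL in the question.
        System.out.println(encode("http://www.bestbuy.com/site/1945374.p?id=1&skuId=1945374"));
        // prints http%3A%2F%2Fwww.bestbuy.com%2Fsite%2F1945374.p%3Fid%3D1%26skuId%3D1945374
    }
}
```

Encoding makes the name longer, and many file systems limit a single file name to roughly 255 bytes or characters, so very long URLs may still need truncating or hashing.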



 

Author Comment

by:wsyy
ID: 34958668
for_yan, thanks for the additional input.

If the URL contains Chinese words, I do want to keep them in the file name. Do I need to use "gb2312" or "gb18030", or can I just keep using "utf-8"?

The reason I ask is that I don't know whether the URL (an input from another application) is encoded in UTF-8.
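One way to keep the Chinese characters readable is to skip the charset question entirely and escape only the characters that file systems reject. The sketch below is an assumption-laden illustration, not the accepted solution: the class name, the method name and the example.com URL are hypothetical, and the list of forbidden characters is the Windows set (which also covers the "/" that Ubuntu forbids), plus "%" so the escaping stays unambiguous.

```java
// Hypothetical helper class, not part of any library.
class ReadableFileName {
    // Characters Windows forbids in file names, plus '%' as the escape character.
    private static final String ILLEGAL = "<>:\"/\\|?*%";

    // Escapes only forbidden characters; everything else, including Chinese, is kept.
    static String toFileName(String url) {
        StringBuilder sb = new StringBuilder(url.length());
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c < 0x20 || c == 0x7F || ILLEGAL.indexOf(c) >= 0) {
                // All escaped characters are ASCII, so two hex digits suffice.
                sb.append(String.format("%%%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u4E2D\u6587 is the word 中文; the URL itself is made up.
        System.out.println(toFileName("http://example.com/\u4E2D\u6587?id=1"));
        // prints http%3A%2F%2Fexample.com%2F中文%3Fid=1
    }
}
```

This does not handle every Windows rule (reserved names such as CON, trailing dots or spaces, and length limits), so treat it as a starting point.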
 
LVL 47

Expert Comment

by:for_yan
ID: 34958698
I'm not sure; you can give it a try. Are Chinese characters OK in the file names?
 

Author Comment

by:wsyy
ID: 34962294
Yes, it is OK to have Chinese characters in the file name.
 
LVL 47

Expert Comment

by:for_yan
ID: 34962321
Then just try both ways. I cannot try it myself because I don't have Chinese characters.
 
LVL 20

Expert Comment

by:Sathish David Kumar N
ID: 34964899
Use Big5.
 
LVL 20

Expert Comment

by:Sathish David Kumar N
ID: 34964909
Big5


Question has a verified solution.
