Solved

How to save a web page with its url as file name

Posted on 2011-02-22
10
209 Views
Last Modified: 2012-05-11
Hi,

I want to save a web page with its url being the file name. However, there are quite a lot of characters that can't be accepted by either Ubuntu or Windows file systems. Please see below:

http://www.bestbuy.com/site/HP+-+Laptop+/+AMD+Phenom%26%23153%3B+II+Processor+/+15.6%22+Display+/+3GB+Memory+/+320GB+Hard+Drive+-+Biscotti/1945374.p?id=1218301987141&skuId=1945374

I want to know how to convert such a url to an acceptable file name.

Thanks
0
Comment
Question by:wsyy
  • 4
  • 3
  • 2
  • +1
10 Comments
 
LVL 47

Assisted Solution

by:for_yan
for_yan earned 62 total points
ID: 34958500
0
 
LVL 40

Accepted Solution

by:
gurvinder372 earned 63 total points
ID: 34958526
0
 

Author Comment

by:wsyy
ID: 34958570
for_yan,

what about the url contains chinese character, and how the encoding with "utf-8" will affect the result?

is the "utf-8" picked by randomly? or should i detect the encoding of the url first?
0
 
LVL 47

Expert Comment

by:for_yan
ID: 34958584
You don't need to use UTF:
This explanation is from the first link which gurvinder posted:


The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
The special characters ".", "-", "*", and "_" remain the same.
The space character " " is converted into a plus sign "+".
All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme to use is UTF-8. However, for compatibility reasons, if an encoding is not specified, then the default encoding of the platform is used.
For example using UTF-8 as the encoding scheme the string "The string ü@foo-bar" would get converted to "The+string+%C3%BC%40foo-bar" because in UTF-8 the character ü is encoded as two bytes C3 (hex) and BC (hex), and the character @ is encoded as one byte 40 (hex).



0
 

Author Comment

by:wsyy
ID: 34958668
for_yan, thanks for more inputs.

if the url contains chinese words, i do want to keep the chinese words in the file name. do i need to use "gb2312" or "gb18030"? or I can just keep using "utf-8".

the reason I ask is that I don't know if the url (as an input from other application) is encoded in utf-8 or not.
0
Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

 
LVL 47

Expert Comment

by:for_yan
ID: 34958698
I'm not sure, you can give it a try. Are  chinese charcaters OK to be in the file names?
0
 

Author Comment

by:wsyy
ID: 34962294
yes. it is ok to have chinese characters in file name.
0
 
LVL 47

Expert Comment

by:for_yan
ID: 34962321
Then just try both ways - I cannot try myself - I don't have chinese characters
0
 
LVL 20

Expert Comment

by:Sathish David Kumar N
ID: 34964899
use big5
0
 
LVL 20

Expert Comment

by:Sathish David Kumar N
ID: 34964909
Big5
0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

An old method to applying the Singleton pattern in your Java code is to check if a static instance, defined in the same class that needs to be instantiated once and only once, is null and then create a new instance; otherwise, the pre-existing insta…
Are you developing a Java application and want to create Excel Spreadsheets? You have come to the right place, this article will describe how you can create Excel Spreadsheets from a Java Application. For the purposes of this article, I will be u…
This tutorial covers a practical example of lazy loading technique and early loading technique in a Singleton Design Pattern.
This video teaches viewers about errors in exception handling.

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now