Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

Excel URLDownloadToFile Imports HTML as Scientific Notation Instead of Text

Posted on 2013-11-27
5
579 Views
Last Modified: 2013-12-04
I have a VBA module that uses the URLDownloadtoFile function to grab some table data from a company website and paste it into excel.

The function works fine, but it converts some of my text to scientific. As I am using the function instead of copy and pasting the data, I cannot specify the column as text, etc.

Does anyone know of a potential solution?

(My workaround solution is to download the file to a word document and then copy the word table into excel. However, this is a less than optimal fix.)

Thanks!
0
Comment
Question by:ashleyna
  • 3
  • 2
5 Comments
 
LVL 42

Expert Comment

by:pcelba
ID: 39682295
URLDownloadtoFile function takes the HTML page "as is" and saves it to a disk file. So depends on the HTML page encoding it can appear as "Scientific Notation" to you...

The page encoding can be UTF8, Windows CP 1250, 1252, etc. You have to look inside and decode it to your target encoding (CP 1252 does not need any decoding because it is English).

If you download this page the you may see following tag:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
which says the page is encoded in UTF-8 which is quite readable but e.g. http://www.baiyujia.com/ having charset=gb2312 is almost unreadable...

MS Word can recognize the encoding so it interprets the page correctly.

Possible VBA implementation for CP conversion is described e.g. here: http://www.pcreview.co.uk/forums/convert-string-one-codepage-another-t3082087.html
or here: https://groups.google.com/forum/#!topic/vbahelp/TjfCsyb1P7s
0
 

Author Comment

by:ashleyna
ID: 39682709
The page downloads without tags when I use .xls extension. However, it converts data like 228E67 to scientific notation. The only output that preserves the text is when I export to a .doc file.

The HTML does not download with tags for any extension except .txt
0
 
LVL 42

Accepted Solution

by:
pcelba earned 500 total points
ID: 39682986
URLDownloadtoFile does not recognize any file type. So if you set "XLS" as the output file extension it simply sets this extension without regard to the actual file contents... Setting the file extension cannot remove HTML tags from the output file.

So if you are saving the file as XLS you should be sure the file is really an Excel sheet.

OTOH Excel or Word is attempting to recognize what is the right file contents and opens the file the best way it can.

If you really want to see what is inside the downloaded file then change its extension to TXT and open it in Notepad. (Suppose extensions are displayed in your Operating system.)

228E67 can be any Unicode or UTF-8 character or just character from the upper half  of ASCII table etc.
0
 

Author Comment

by:ashleyna
ID: 39696244
If anyone else comes across this question, one way I was able to convert scientific values back into the original text with the following formula:

=LEFT(TEXT(C495,"#"),5)&"E"&TEXT(LEN(TEXT(C495,"#"))-5,"0##")

The formula converts value like: 4.5336E+113 to 45336E109, which was the original text value.
0
 
LVL 42

Expert Comment

by:pcelba
ID: 39696335
Thanks. I know better what was the problem now but I still don't know what was the source file format imported to Excel.

Excel automatically converts "number like" text values into numbers. To suppress this conversion you have to make changes in the input file which is not so easy in some cases...

This discussion can tell more: http://stackoverflow.com/questions/2420931/stop-excel-from-changing-cell-contents-ever

Another hint useful when importing text from CSV file is to format the text as formula:
="123456"
0

Featured Post

The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Whether you’re a college noob or a soon-to-be pro, these tips are sure to help you in your journey to becoming a programming ninja and stand out from the crowd.
Although it can be difficult to imagine, someday your child will have a career of his or her own. He or she will likely start a family, buy a home and start having their own children. So, while being a kid is still extremely important, it’s also …
With the power of JIRA, there's an unlimited number of ways you can customize it, use it and benefit from it. With that in mind, there's bound to be things that I wasn't able to cover in this course. With this summary we'll look at some places to go…

839 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question