Solved

Excel URLDownloadToFile Imports HTML as Scientific Notation Instead of Text

Posted on 2013-11-27
Last Modified: 2013-12-04
I have a VBA module that uses the URLDownloadToFile function to grab some table data from a company website and paste it into Excel.

The function works fine, but it converts some of my text to scientific notation. Because I am using the function rather than copying and pasting the data, I cannot format the column as text beforehand.

Does anyone know of a potential solution?

(My workaround is to download the file as a Word document and then copy the Word table into Excel. However, this is a less than optimal fix.)

Thanks!
Question by:ashleyna
 
LVL 42

Expert Comment

by:pcelba
ID: 39682295
The URLDownloadToFile function takes the HTML page "as is" and saves it to a disk file. So, depending on the HTML page encoding, the content can appear as "scientific notation" to you...

The page encoding can be UTF-8, Windows CP 1250, CP 1252, etc. You have to look inside the file and decode it to your target encoding (CP 1252 does not need any decoding because it is the code page used for English).

If you download this page, then you may see the following tag:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
This says the page is encoded in UTF-8, which is quite readable, but a page such as http://www.baiyujia.com/, which has charset=gb2312, is almost unreadable...
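To illustrate the "look inside and decode" step, here is a minimal sketch. It is written in Python rather than VBA purely for illustration, and the helper names `detect_charset` and `decode_page` are hypothetical. It assumes the charset is declared in a meta tag near the top of the file and falls back to CP 1252 otherwise; a real page may declare its encoding elsewhere (for example in an HTTP header), and an unknown charset name would raise a LookupError.

```python
import re

def detect_charset(raw: bytes, default: str = "cp1252") -> str:
    """Return the charset named in the first 4 KB of the page, or the default."""
    match = re.search(rb'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', raw[:4096], re.IGNORECASE)
    return match.group(1).decode("ascii") if match else default

def decode_page(raw: bytes) -> str:
    """Decode the downloaded bytes using the declared charset."""
    return raw.decode(detect_charset(raw), errors="replace")

# Sample bytes standing in for a downloaded file (made up for this example).
sample = (
    '<meta http-equiv="content-type" content="text/html; charset=UTF-8" />'
    "<td>café</td>"
).encode("utf-8")

print(detect_charset(sample))
print(decode_page(sample))
```

Note that decoding only changes how the bytes are read as characters; it does not by itself stop Excel from treating number-like text as numbers.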

MS Word can recognize the encoding so it interprets the page correctly.

A possible VBA implementation of code page conversion is described, for example, here: http://www.pcreview.co.uk/forums/convert-string-one-codepage-another-t3082087.html
or here: https://groups.google.com/forum/#!topic/vbahelp/TjfCsyb1P7s
 

Author Comment

by:ashleyna
ID: 39682709
The page downloads without tags when I use the .xls extension. However, Excel converts data like 228E67 to scientific notation. The only output that preserves the text is when I export to a .doc file.

The HTML does not download with tags for any extension except .txt.
 
LVL 42

Accepted Solution

by:
pcelba earned 500 total points
ID: 39682986
URLDownloadToFile does not recognize any file type. So if you set "XLS" as the output file extension, it simply uses this extension without regard to the actual file contents... Setting the file extension cannot remove HTML tags from the output file.

So if you are saving the file as XLS, you should be sure the file really is an Excel sheet.

On the other hand, Excel and Word try to recognize the actual file contents and open the file as best they can.

If you really want to see what is inside the downloaded file, change its extension to TXT and open it in Notepad. (This assumes file extensions are displayed in your operating system.)

228E67 can be any Unicode or UTF-8 character, or just a character from the upper half of the ASCII table, etc.
 

Author Comment

by:ashleyna
ID: 39696244
If anyone else comes across this question, I was able to convert the scientific values back into the original text with the following formula:

=LEFT(TEXT(C495,"#"),5)&"E"&TEXT(LEN(TEXT(C495,"#"))-5,"0##")

The formula converts a value like 4.5336E+113 to 45336E109, which was the original text value.
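For anyone who wants to check the logic of that formula outside Excel, here is a rough sketch in Python. The function names are hypothetical, and `excel_text_digits` is only an approximation of TEXT(value,"#") for large positive numbers (it keeps 15 significant digits, as Excel does). The sketch also makes the formula's built-in assumption visible: the original label must have had exactly five characters before the "E".

```python
from decimal import Decimal

def excel_text_digits(value: float) -> str:
    """Rough stand-in for TEXT(value, "#"): the integer digits at 15 significant digits."""
    rounded = Decimal(f"{value:.14e}")  # 15 significant digits
    return str(int(rounded))            # drops any fractional part

def restore_label(value: float, mantissa_len: int = 5) -> str:
    """Mirror LEFT(...,5) & "E" & (LEN(...) - 5) from the formula."""
    digits = excel_text_digits(value)
    mantissa = digits[:mantissa_len]       # LEFT(TEXT(...), 5)
    exponent = len(digits) - mantissa_len  # LEN(TEXT(...)) - 5
    return f"{mantissa}E{exponent}"

print(restore_label(4.5336e113))  # 45336E109
print(restore_label(2.28e69))     # 22800E65
```

The second call shows the limitation: the text 228E67 from earlier in the thread comes back as 22800E65, which is the same number but not the same text, so the hard-coded 5 would need adjusting for labels with a shorter or longer prefix.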
 
LVL 42

Expert Comment

by:pcelba
ID: 39696335
Thanks. I understand the problem better now, but I still don't know what the format of the source file imported into Excel was.

Excel automatically converts "number-like" text values into numbers. To suppress this conversion, you have to make changes in the input file, which is not so easy in some cases...

This discussion can tell you more: http://stackoverflow.com/questions/2420931/stop-excel-from-changing-cell-contents-ever

Another useful hint when importing text from a CSV file is to write the text as a formula:
="123456"
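As a sketch of how that hint could be applied when you generate the CSV yourself (Python used only for illustration; `as_text_formula` and `write_rows` are hypothetical helper names), each value is wrapped as ="..." and the csv module takes care of quoting the embedded double quotes:

```python
import csv
import io

def as_text_formula(value: str) -> str:
    """Wrap a value as ="value" so a spreadsheet reads it as a text formula."""
    escaped = value.replace('"', '""')  # double any quotes inside the value
    return f'="{escaped}"'

def write_rows(rows, stream) -> None:
    """Write rows to a CSV stream with every cell wrapped as a text formula."""
    writer = csv.writer(stream, lineterminator="\n")
    for row in rows:
        writer.writerow([as_text_formula(cell) for cell in row])

buffer = io.StringIO()
write_rows([["228E67", "45336E109"]], buffer)
print(buffer.getvalue())
```

The trade-off is that the cells then contain formulas rather than plain text, which may matter if the sheet is processed further.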