Solved

Excel URLDownloadToFile Imports HTML as Scientific Notation Instead of Text

Posted on 2013-11-27
5
601 Views
Last Modified: 2013-12-04
I have a VBA module that uses the URLDownloadtoFile function to grab some table data from a company website and paste it into excel.

The function works fine, but it converts some of my text to scientific. As I am using the function instead of copy and pasting the data, I cannot specify the column as text, etc.

Does anyone know of a potential solution?

(My workaround solution is to download the file to a word document and then copy the word table into excel. However, this is a less than optimal fix.)

Thanks!
0
Comment
Question by:ashleyna
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
5 Comments
 
LVL 42

Expert Comment

by:pcelba
ID: 39682295
URLDownloadtoFile function takes the HTML page "as is" and saves it to a disk file. So depends on the HTML page encoding it can appear as "Scientific Notation" to you...

The page encoding can be UTF8, Windows CP 1250, 1252, etc. You have to look inside and decode it to your target encoding (CP 1252 does not need any decoding because it is English).

If you download this page the you may see following tag:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
which says the page is encoded in UTF-8 which is quite readable but e.g. http://www.baiyujia.com/ having charset=gb2312 is almost unreadable...

MS Word can recognize the encoding so it interprets the page correctly.

Possible VBA implementation for CP conversion is described e.g. here: http://www.pcreview.co.uk/forums/convert-string-one-codepage-another-t3082087.html
or here: https://groups.google.com/forum/#!topic/vbahelp/TjfCsyb1P7s
0
 

Author Comment

by:ashleyna
ID: 39682709
The page downloads without tags when I use .xls extension. However, it converts data like 228E67 to scientific notation. The only output that preserves the text is when I export to a .doc file.

The HTML does not download with tags for any extension except .txt
0
 
LVL 42

Accepted Solution

by:
pcelba earned 500 total points
ID: 39682986
URLDownloadtoFile does not recognize any file type. So if you set "XLS" as the output file extension it simply sets this extension without regard to the actual file contents... Setting the file extension cannot remove HTML tags from the output file.

So if you are saving the file as XLS you should be sure the file is really an Excel sheet.

OTOH Excel or Word is attempting to recognize what is the right file contents and opens the file the best way it can.

If you really want to see what is inside the downloaded file then change its extension to TXT and open it in Notepad. (Suppose extensions are displayed in your Operating system.)

228E67 can be any Unicode or UTF-8 character or just character from the upper half  of ASCII table etc.
0
 

Author Comment

by:ashleyna
ID: 39696244
If anyone else comes across this question, one way I was able to convert scientific values back into the original text with the following formula:

=LEFT(TEXT(C495,"#"),5)&"E"&TEXT(LEN(TEXT(C495,"#"))-5,"0##")

The formula converts value like: 4.5336E+113 to 45336E109, which was the original text value.
0
 
LVL 42

Expert Comment

by:pcelba
ID: 39696335
Thanks. I know better what was the problem now but I still don't know what was the source file format imported to Excel.

Excel automatically converts "number like" text values into numbers. To suppress this conversion you have to make changes in the input file which is not so easy in some cases...

This discussion can tell more: http://stackoverflow.com/questions/2420931/stop-excel-from-changing-cell-contents-ever

Another hint useful when importing text from CSV file is to format the text as formula:
="123456"
0

Featured Post

On Demand Webinar: Networking for the Cloud Era

Ready to improve network connectivity? Watch this webinar to learn how SD-WANs and a one-click instant connect tool can boost provisions, deployment, and management of your cloud connection.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Since upgrading to Office 2013 or higher installing the Smart Indenter addin will fail. This article will explain how to install it so it will work regardless of the Office version installed.
Computer science students often experience many of the same frustrations when going through their engineering courses. This article presents seven tips I found useful when completing a bachelors and masters degree in computing which I believe may he…
The viewer will learn how to create two correlated normally distributed random variables in Excel, use a normal distribution to simulate the return on different levels of investment in each of the two funds over a period of ten years, and, create a …
If you’ve ever visited a web page and noticed a cool font that you really liked the look of, but couldn’t figure out which font it was so that you could use it for your own work, then this video is for you! In this Micro Tutorial, you'll learn yo…

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question