?
Solved

Excel URLDownloadToFile Imports HTML as Scientific Notation Instead of Text

Posted on 2013-11-27
5
Medium Priority
?
618 Views
Last Modified: 2013-12-04
I have a VBA module that uses the URLDownloadtoFile function to grab some table data from a company website and paste it into excel.

The function works fine, but it converts some of my text to scientific. As I am using the function instead of copy and pasting the data, I cannot specify the column as text, etc.

Does anyone know of a potential solution?

(My workaround solution is to download the file to a word document and then copy the word table into excel. However, this is a less than optimal fix.)

Thanks!
0
Comment
Question by:ashleyna
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
5 Comments
 
LVL 42

Expert Comment

by:pcelba
ID: 39682295
URLDownloadtoFile function takes the HTML page "as is" and saves it to a disk file. So depends on the HTML page encoding it can appear as "Scientific Notation" to you...

The page encoding can be UTF8, Windows CP 1250, 1252, etc. You have to look inside and decode it to your target encoding (CP 1252 does not need any decoding because it is English).

If you download this page the you may see following tag:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
which says the page is encoded in UTF-8 which is quite readable but e.g. http://www.baiyujia.com/ having charset=gb2312 is almost unreadable...

MS Word can recognize the encoding so it interprets the page correctly.

Possible VBA implementation for CP conversion is described e.g. here: http://www.pcreview.co.uk/forums/convert-string-one-codepage-another-t3082087.html
or here: https://groups.google.com/forum/#!topic/vbahelp/TjfCsyb1P7s
0
 

Author Comment

by:ashleyna
ID: 39682709
The page downloads without tags when I use .xls extension. However, it converts data like 228E67 to scientific notation. The only output that preserves the text is when I export to a .doc file.

The HTML does not download with tags for any extension except .txt
0
 
LVL 42

Accepted Solution

by:
pcelba earned 1500 total points
ID: 39682986
URLDownloadtoFile does not recognize any file type. So if you set "XLS" as the output file extension it simply sets this extension without regard to the actual file contents... Setting the file extension cannot remove HTML tags from the output file.

So if you are saving the file as XLS you should be sure the file is really an Excel sheet.

OTOH Excel or Word is attempting to recognize what is the right file contents and opens the file the best way it can.

If you really want to see what is inside the downloaded file then change its extension to TXT and open it in Notepad. (Suppose extensions are displayed in your Operating system.)

228E67 can be any Unicode or UTF-8 character or just character from the upper half  of ASCII table etc.
0
 

Author Comment

by:ashleyna
ID: 39696244
If anyone else comes across this question, one way I was able to convert scientific values back into the original text with the following formula:

=LEFT(TEXT(C495,"#"),5)&"E"&TEXT(LEN(TEXT(C495,"#"))-5,"0##")

The formula converts value like: 4.5336E+113 to 45336E109, which was the original text value.
0
 
LVL 42

Expert Comment

by:pcelba
ID: 39696335
Thanks. I know better what was the problem now but I still don't know what was the source file format imported to Excel.

Excel automatically converts "number like" text values into numbers. To suppress this conversion you have to make changes in the input file which is not so easy in some cases...

This discussion can tell more: http://stackoverflow.com/questions/2420931/stop-excel-from-changing-cell-contents-ever

Another hint useful when importing text from CSV file is to format the text as formula:
="123456"
0

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Technology opened people to different means of presenting information, but PowerPoint remains to be above competition. Know why PPT still works today.
This article will show how Aten was able to supply easy management and control for Artear's video walls and wide range display configurations of their newsroom.
The viewer will learn how to create two correlated normally distributed random variables in Excel, use a normal distribution to simulate the return on different levels of investment in each of the two funds over a period of ten years, and, create a …
Progress
Suggested Courses

800 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question