Solved

MAXSCANROWS With Excel and ADO/OBDC

Posted on 2002-06-16
11
4,159 Views
Last Modified: 2013-11-20
Using VC6, SP5 with ADO and finding that in connecting to Excel the MaxScanRows=0 seems to be ignored [fails in ODBC as well and I assume the same driver is used].  A column containing data with many blank values is ignored when there is data in the rows below row 15.  The connection string is:

"Provider=MSDASQL.1;Persist Security Info=False;Extended Properties="DSN=Excel Files;DBQ=C:\Temp\test.xls;DefaultDir=C:\Temp;DriverId=790;MaxBufferSize=2048;PageTimeout=5;ColNameHeader=True;MaxScanRows=0;""

Using 25 also fails so the default of 7 is always used.  This is rather annoying and I am at loss as to why this behaviour occurs.  Can anyone offer any suggestions please?

Thanks in advance

David

0
Comment
Question by:Rupreck
  • 7
  • 4
11 Comments
 
LVL 49

Expert Comment

by:DanRollins
ID: 7083085
The ODBC driver for EXCEL will scan at most 16 rows in order to determine data type.  This is written is several documentation sources, including:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;q278973

http://support.microsoft.com/default.aspx?scid=kb;EN-US;q316934

(neither is exactly on point, but they are related) Also the online help says the same thing when you use the ODBC ControlPanel to configure a DSN for Excel. Alsom the ODBC Control panel indicates an error if you input >16  *or if you input 0*

True, I found a place or two where the dox mention that MaxScanRows=0 will force a scan of the entire table, but these are usually in conjunction with 'generic' handling of ODBC DSN configuration.  Driver-specific info will always rule.

One solution:  Make the first row contain some sample data that the driver will interpret correctly as numeric or string or date, or whatever.  Then discard the first row when you read the table.  You can also output the XLS grid to a CSV file and if you do that you can use the Text driver -- that lets you define the data type and heading names, etc of each column via the Schema.ini file.

Finally, the columns that are 'wrong' are probably coming in as text.  Just convert them to number or date or whatever.

-- Dan
0
 

Author Comment

by:Rupreck
ID: 7084344
Thanks Dan for this answer which is very clear.  Unfortunately this is in a program that allows the user to use the MS Datalink to make the connection dynamically to any number of files so would rather avoid adding data temporarily to it, to then remove it.  Perhaps I could work from a copy.  I will wait to see if anyone else comes up with a soln before giving you the points
0
 
LVL 49

Expert Comment

by:DanRollins
ID: 7084557
Thanks.  Is it tru that the 'untyped' columns come in as text?  And if so, what is bad about that?

-- Dan
0
 

Author Comment

by:Rupreck
ID: 7088485
I think they come in as text but what is happening is that in one example a column has nulls in the 1st 15 rows and then has data below.  The driver seems to assume that the column is entirely blank and so just doesn't import any data at all from that column.

This I think is dumb behaviour and even with MaxScanRows=16 it is ignored so maybe there is something else going on.  If I place data in any of cells 1-8 then it reads the whole column perfectly placing blanks where it should.  Even if I get it to work with 16 that is not good enough as it may be blank up to say row 200, then have data below.  [This problem can be easily replicated with just a single column with blanks in the first 8 rows]

If it defaulted to text and then read it all as that then I would be happy but in the case of blanks it just seems to ignore the column.
0
 

Author Comment

by:Rupreck
ID: 7089904
Further to that I now see that when it fails it is defaulting to text (m_nSQLType=12, m_nPrecision=255).  Ideally I would want it to resort to numeric (m_nSQLType=8, m_nPrecision=15).

In
 
m_rSheet->GetFieldValue(column, m_stempSql);

m_stempSql is always blank when the column has this abundunce of nulls in the 1st 8+ rows.  


[Aside] Also I noticed this in MSDN:

The following example uses the ::SQLConfigDataSource ODBC API function to create a new Excel data source called “New Excel Data Source”:

SQLConfigDataSource(NULL,ODBC_ADD_DSN, "Excel Files (*.xls)",
                   "DSN=New Excel Data Source\0"
                   "Description=New Excel Data Source\0"
                   "FileType=Excel\0"
                   "DataDirectory=C:\\EXCELDIR\0"
                   "MaxScanRows=20\0");

Note that the data source is actually a directory (C:\EXCELDIR); this directory must exist. The Excel driver uses directories as its data sources, and files as the individual tables (one table per .XLS file).

Here they use 20 and so perhaps it is possible with Excel to use more than 16.  At best their documentation team is confused...

0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 49

Expert Comment

by:DanRollins
ID: 7090555
I saw that same example and wondered about the 20 also; and anopther reference clearly indicates that 0 forces a full scan.  

It might vary by version of Excel and the Excel ODBC driver.  I suppose that you have downloaded and installed the latest MDAC...
-- Dan
0
 

Author Comment

by:Rupreck
ID: 7092095
Yes, I have the latest version.  The problem is not about the determination of datatype but that when it finds blanks in the first 7-16 rows it then just ignores the column entirely not even reading the contents (text or number) in later rows.

This must be a driver bug surely?
0
 

Author Comment

by:Rupreck
ID: 7092203
Yes, I have the latest version.  The problem is not about the determination of datatype but that when it finds blanks in the first 7-16 rows it then just ignores the column entirely not even reading the contents (text or number) in later rows.

This must be a driver bug surely?
0
 
LVL 49

Accepted Solution

by:
DanRollins earned 200 total points
ID: 7093117
O supose its worth trying to tweak the registry setting directly to see if it makes a difference.  I created a new DSN aand it puts its data into:

   HKEY_CURRENT_USER\Software\ODBC\ODBC.INI\zzzExcelTest\Engines\Excel

You could try different values in the MaxScanRows item.

-=-=-=-=-=-=-=-=-=-=-=-=
>>The problem is...it then just ignores the column entirely...
>>This must be a driver bug surely?

I also found reference to a regkey named
    TypeGuessRows [DWORD] n
    ImportMixedTypes [STRING] "Text"

it was for jet excel (Msexcl35.dll driver) but it shed some light on this.  If the driver guesses the data type incorrectly, but then discovers some data of a different type, it must return null.  Thus, I would say it is NOT a bug.  Imagine that it guessed data type of DATE, but later ran into a string value like "Yesterday."   A LOT of software would simply choke in trying to handle that as a usable DATE -- maybe it would come out as 1969 or something or just crash.  So it would be safest to return null since all software is expected to be able to handle that.

-- Dan
0
 

Author Comment

by:Rupreck
ID: 7097428
Unfortunately making the changes to the registry has not made any difference however I think your theory is correct - albeit annoying!  I am wondering whether there is a workaround eg a way of forcing the column to assume a particular type; so I will look around.
0
 

Author Comment

by:Rupreck
ID: 7137121
No solution found but a much better understanding of the issues here.
0

Featured Post

Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
if loop error 4 77
Need Help INsttalling wget on Mavericks OS X 3 83
sum13 challenge 24 77
Thin secure Windows 10 5 49
Here is how to use MFC's automatic Radio Button handling in your dialog boxes and forms.  Beginner programmers usually start with a OnClick handler for each radio button and that's just not the right way to go.  MFC has a very cool system for handli…
Introduction: Displaying information on the statusbar.   Continuing from the third article about sudoku.   Open the project in visual studio. Status bar – let’s display the timestamp there.  We need to get the timestamp from the document s…
This video will show you how to get GIT to work in Eclipse.   It will walk you through how to install the EGit plugin in eclipse and how to checkout an existing repository.
This video explains how to create simple products associated to Magento configurable product and offers fast way of their generation with Store Manager for Magento tool.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now