Link to home
Start Free TrialLog in
Avatar of ferrrar
ferrrar

asked on

ODBC bulk insert is slow with MSSQL

I have a C++ program, built with Visual C++ 2005, populating a table on a MS SQL server. It creates a table:

  create table main_table (loadorder int not null identity, primary key (loadorder), day_of_week int, hour_of_day int, messages_delivered int, messages_queued int)

and then populates it with a prepared query:

  insert into main_table (day_of_week, hour_of_day, messages_delivered, messages_queued) values (?, ?, ?, ?)

using SQLAllocStmt(), SQLPrepare(), SQLSetStmtAttr(), SQLBindParameter(). It loads 10,000 rows at a time. On one SQL Server, it loads it at about 1500 rows per second, which is (barely) acceptable; on another it loads it at about 100 rows per second, which is not acceptable. My question is: what is the difference between the two, and how can I speed up the slow one?

Other details:

  * SQL Server and application are both running on the same x64 Windows system, in both cases (local access to the SQL Server).

  * The application is 64-bit.

  * The server with fast queries is running Windows XP x64 Professional, Version 2003, SP2; the server with slow queries is running Windows Server 2003 Enterprise x64, SP2.

  * The server with fast queries is running Microsoft SQL Server Enterprise Edition (64-bit). The server with slow queries is running SQL Server Express 2005.

  * In both cases, I am using a DSN with all defaults, using the SQL Native Client 10 driver.

  * Both servers are physical hardware (not virtual).

  * The server with faster queries has 6GB RAM; the slower one has 8GB.

  * The server with faster queries has 2GHz processors; the slower one has 3GHz.

  * Both servers have similar hard drives--internal drives. The server with faster queries has slower drives.

  * Both databases have transaction logging set to SIMPLE.

  * The server with slower queries is running a German version of Windows; the fast one is English.

  * When I run the Activity Monitor within the SQL management console (Enterprise Manager?), it shows 950ms+ out of every second is waiting on Logging, on the server with slow queries; on the server with fast queries, logging is 11ms of every second. I imagine this is the issue, but can't figure out how to change it. I tried moving the transaction log to a different drive, but it does not affect performance. Just for kicks, I tried changing transaction logging to FULL, but it also does not affect performance. Again, logging is already SIMPLE for both databases.

  * Both databases are using the same charset / collation (SQL_Latin1_General_CP1_CI_AS)

I'm pretty well out of my depth with this one--I haven't configured or optimized MS SQL much. There must be some setting here that I just don't know to check. I know I could try BCP, but I've got this code working reasonably well on one server, and I'd just like it to work the same on the other. I'm not sure BCP would help anyway, if it's the transaction log that's the problem. It seems like it's a logging issue, but why doesn't affect both servers? Is it reasonable to expect 1000+ rows per second to a SQL Server Express, or is there some sort of built-in limit there? (I also saw 100rps to another Enterprise SQL Server, on a third box). Is there a charset conversion going on here because of the German?



Avatar of EugeneZ
EugeneZ
Flag of United States of America image

bcp is faster but may lock your table

1500 rows per second is not so bad: try to run it off hours it may run faster
Avatar of ferrrar
ferrrar

ASKER

EugeneZ, thanks for the info on bcp.

1500 rows per second wouldn't be too bad, but that's now what I'm getting on the server I want it on. I'm getting *100* rows per second on the server where I need to run this, and that's the problem. I'm getting 1500 rows per second on a *different* server, but it's not one we can use for the final installation. But it proves that 1500 rows per second is possible! So the question is: why can't I get that 1500rps on the server where we need to run this?!

"Off hours" won't help in this case--the SQL Server is completely unused, except for this application's own usage of it. So it's getting 100rps on a completely idle SQL Server.
then bcp -or dts\sis
Avatar of ferrrar

ASKER

EugeneZ, maybe bcp would help, but I'd rather use native ODBC if it can be made fast enough. It doesn't require bcp to be installed, and it doesn't require any additional development (it would be a fair amount of work to put together code to call bcp, and the ODBC code already exists). Now, if ODBC were a lost cause for this, and inherently too slow to handle this performance, then I'd have to be content with bcp, but I have a clear example where ODBC is fast enough, in an environment that seems very similar to the one where I want to run it. So I'm hoping that this is just a configuration issue, and that with some change in the operating system, or SQL Server, or the database, or maybe the application configuration, I can get the same performance (1500). So I'm looking for suggestions for configuration options, at one level or another, which might be responsible for for this slowdown.
bcp is on sql server: you can call it via t-sql   ->  exec master..xp_cmdshell   ' bcp...'
http://msdn.microsoft.com/en-us/library/ms175046.aspx


but you may like to use bulk insert instead of insert into  : if you are loading from file

http://msdn.microsoft.com/en-us/library/ms188365.aspx
Avatar of ferrrar

ASKER

EugeneZ, Thanks; that info about bulk insert is useful, and so is the info about t-sql, once I track down the docs and digest them.

And I'll happily give you credit for the solution if that's the way I eventually have to do it.

But still: WHY is it 100lps on one server, and 1500lps on the other, when they are similar? Imagine there's a world without these programmatic workarounds (bcp via t-sql, or bulk insert), and imagine that I can't use either. Do you have any ideas on why the one server is 15x slower than the other for doing the exact same task, and any ideas how I can fix that *without* changing the task it's doing? Is there some database setting I can change, which might fix the "95% of time spent logging" symptom I'm getting out of the database's Activity Monitor?
ASKER CERTIFIED SOLUTION
Avatar of EugeneZ
EugeneZ
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of ferrrar

ASKER

yeeeEEES!

Applying an update to MS SQL Express (SP3; I'm not sure what it was at before), increased the rows-per-second from 100rps to 11,500rps, for ODBC import. Great!

Furthermore, before I got that working, I also decided to spend the time implementing BULK INSERT as a possible workaround. It's annoying how the file has to be on the server--I wish it supported a local file--but it's certainly fast--I get 30,000 rows per second that way. Greater!

Oddly, updating the SQL Express server on the application server also solved the problem when uploading to the remote SQL Server! So maybe it wasn't the database server at all, but the driver? Or MDAC (though both versions of MDAC seemed to be the same). Anyway, inserting to the remote server with ODBC now gives me 10,000rps, which is also ample.

This completely and utterly solves my problem, twice over, and now I have lots of extra information in case I run into this again. Thanks a bunch!
Avatar of ferrrar

ASKER

The key was the SQL Server Service Pack--updating to SP3 increased performance 11,500%