Solved

Import data from text file to mdb file

Posted on 2004-08-15
18
488 Views
Last Modified: 2008-02-01
I have a text file with 20,000 records, 6 fields per record.
What is the fastest way (in execution time) to import the data into an existing table?
I tried this code, but it takes more than 2 minutes to add all the records.

Const TransactionSize As Integer = 1000

While MoreRecords
    dr = DataSet.Tables(0).NewRow()
    dr(0) = .....
    dr(1) = .....

    DataSet.Tables(0).Rows.Add(dr)
    RecordsCount = RecordsCount + 1

    If RecordsCount = TransactionSize Then
        DataAdapter.Update(DataSet)
        DataSet.AcceptChanges()
        RecordsCount = 0
    End If
End While

' Flush the final partial batch as well
DataAdapter.Update(DataSet)
Question by:EyalL
18 Comments
 
LVL 27

Expert Comment

by:Dabas
ID: 11808552
Hi EyalL:
Part of your problem is that you are using DataAdapter.Update(DataSet) INSIDE your loop.

You will see a large decrease in execution time if you move this line to after the End While.
You also will not need the DataSet.AcceptChanges() call at all.

Dabas
 

Author Comment

by:EyalL
ID: 11808845
The DataAdapter.Update runs once for every 1000 records, not for each record.
Pay attention to the If statement and the constant value.

When I run the Update outside the loop, it takes about 2 minutes to execute. Same as when I run it after every 1000 records.

Eyal.
 
LVL 27

Expert Comment

by:Dabas
ID: 11809051
EyalL:
You are right, I did not pay attention. Apologies.
What if you drop the DataAdapter altogether and use a command object with the corresponding INSERT command instead?
It would insert each row separately, but might save time by avoiding the unnecessary DataAdapter overhead.
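
A rough sketch of what I mean (it assumes an open OleDbConnection called DbConn; the table and column names are placeholders). Wrapping the whole batch in an explicit OleDbTransaction should also stop Jet from committing every single row on its own:

Imports System.Data.OleDb

' Sketch only: DbConn is assumed to be an open OleDbConnection,
' and MyTable / its column list stand in for your real names.
Dim tx As OleDbTransaction = DbConn.BeginTransaction()
Dim cmd As New OleDbCommand( _
    "INSERT INTO MyTable (a,b,c,d,e,f) VALUES (?,?,?,?,?,?)", DbConn, tx)

' Add the six parameters once, with OleDbTypes matching your columns
cmd.Parameters.Add("@p1", OleDbType.Integer)
' ... @p2 to @p6 ...

While MoreRecords
    cmd.Parameters(0).Value = Value1
    ' ... remaining parameter values ...
    cmd.ExecuteNonQuery()
End While

' One commit for the whole batch instead of one per row
tx.Commit()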


Dabas

Author Comment

by:EyalL
ID: 11815066
I have tried this:

SqlCmd = New OleDbCommand("INSERT INTO table (a,b,c,d,e,f) values (?,?,?,?,?,?)", DbConn)

SqlCmd.Parameters.Add("@p1", Type1)
SqlCmd.Parameters.Add("@p2", Type2)
SqlCmd.Parameters.Add("@p3", Type3)
SqlCmd.Parameters.Add("@p4", Type4)
SqlCmd.Parameters.Add("@p5", Type5)
SqlCmd.Parameters.Add("@p6", Type6)

While MoreRecords
    SqlCmd.Parameters(0).Value = Value1
    SqlCmd.Parameters(1).Value = Value2
    .
    .

    SqlCmd.ExecuteNonQuery()
    RecordsCount = RecordsCount+1
End While


Still takes the same time to execute.
 
LVL 27

Expert Comment

by:Dabas
ID: 11815626
EyalL,
OK.
I do not think we can improve anything further as far as the code is concerned.
There are other issues that can affect performance:
for example your connection, the CPU speed of both client and server, the network card and connection, etc.

Have you tried running the same routine from another computer, for example?

Dabas
 

Author Comment

by:EyalL
ID: 11818376
The mdb file is on the same computer that runs the program, so no network connection is involved.

I tried pasting the same number of records into the same table in the same mdb file using MS Access, and it goes much faster: about 15 seconds, which is reasonable for 20,000 records.

I don't see any reason why MS Access should do it 10 times faster than VB code.

Any ideas?
 
LVL 27

Expert Comment

by:Dabas
ID: 11826634
EyalL,
Sorry, no further ideas. Maybe a 20-point question in the VB Database area with a link to this question might attract a database expert to voice his opinion.

Dabas
 
LVL 18

Expert Comment

by:JR2003
ID: 11829906
It could be because there is an index on the table.

If you do the following it may be quicker:
1. Drop the indexes on the table you are inserting into.
2. Insert the records.
3. Recreate the indexes (see the sketch below).

How many records in total are there in the table you are inserting into?
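
If you want to do that from your VB code, the index DDL can go through the same OleDb connection. A rough sketch (the index, table, and field names here are invented, so substitute your own):

' DbConn is assumed to be an open OleDbConnection
Dim cmd As New OleDbCommand("DROP INDEX idxField1 ON MyTable", DbConn)
cmd.ExecuteNonQuery()

' ... insert all 20000 records here ...

cmd.CommandText = "CREATE INDEX idxField1 ON MyTable (Field1)"
cmd.ExecuteNonQuery()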
 
LVL 29

Expert Comment

by:leonstryker
ID: 11831735
I think the fastest way would be to let Access handle all the work.  Create an instance of Access and let it pull in the text data without using ADO.NET.

Take a look here to see how it is done directly in Access:

http://www.experts-exchange.com/Databases/MS_Access/Q_11873739.html

Leon
 

Author Comment

by:EyalL
ID: 11841973
JR2003,
Removing all relationships and indexes doesn't help.

leonstryker,
The customers' computers don't have MS Access installed...

I can't understand why it takes so much time to import 20,000 records. There must be a way to do it faster...
 
LVL 29

Accepted Solution

by:
leonstryker earned 250 total points
ID: 11842194
Ok, what about this:

http://support.microsoft.com/default.aspx?scid=http://support.microsoft.com:80/support/kb/articles/Q262/5/37.ASP&NoWebContent=1

The part which may be useful is this one:

cn.Execute "INSERT INTO Table1 SELECT * FROM " & _
             "[Text;Database=" & PathtoTextFile & ";HDR=YES].[TextFile.txt]"

Leon
 

Author Comment

by:EyalL
ID: 11842791
leonstryker,

It looks much better now, but my text file doesn't have any delimiters or commas. Only fixed-size fields.

The file has the following format on each line:

field1: 10-digit number
field2: 6 characters (not only ASCII codes)
field3: 4-digit number

Example:
0000000972 5■j╞94 1999

Is it possible to import the file without reordering the text file? Can I specify the structure of the text file in the SQL statement?

Thanks,
Eyal.
 
LVL 29

Expert Comment

by:leonstryker
ID: 11843409
Honestly, I do not know. If it is possible, you would probably need something like a schema.ini file:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetschema_ini_file.asp
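
For a fixed-length file like the one you describe, it could look something like the following. The column names and widths are only a guess from your sample line, and the two one-character Gap columns just cover the spaces between fields so they can be left out of the SELECT. The schema.ini has to sit in the same folder as the text file:

[TextFile.txt]
ColNameHeader=False
Format=FixedLength
Col1=Field1 Long Width 10
Col2=Gap1 Text Width 1
Col3=Field2 Text Width 6
Col4=Gap2 Text Width 1
Col5=Field3 Short Width 4

The INSERT would then name Field1, Field2, Field3 explicitly instead of using SELECT *. And if the 10-digit values can overflow a Long, declare Field1 as Text or Double instead.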

Leon
 
LVL 27

Expert Comment

by:Dabas
ID: 11857556
Eyal:

Maybe the problem is that using DataSets is what is slowing you down?

I just stumbled on the following article, which might suggest a different way of doing things:

http://www.fawcette.com/reports/vsliveor/2003/nilsson3/

Here is part of a relevant quote:

----Quote----
Assume you need to send some orders in a data container. If you choose a DataSet as the data container, it will—according to my tests—take approximately three times as long as if you had used a custom order collection as the data container. And that is end to end, meaning it includes the time to fetch the order from the database and the time to inspect each value in the order at the consumer side.

The biggest reason for this is probably that the DataSet is serialized as XML even when a binary formatter is used, so the DataSet will be approximately five times as large as the custom collection over the wire.
----UnQuote----

HTH

Dabas

 

Author Comment

by:EyalL
ID: 11857993
leonstryker,

I don't think that schema.ini can work with OleDB.

Anyway, I created a new text file based on the data from the original text file and imported the new one. It really works very fast (a few seconds for all the records), but...

In order to do it right, without creating a new text file on every import, I need to know how to import the original text file directly.

Dabas,
I'm not using DataSet...

thanks,
Eyal.
 
LVL 27

Expert Comment

by:Dabas
ID: 11858049
Eyal.

While MoreRecords
    dr = DataSet.Tables(0).NewRow()
    dr(0) = .....
    dr(1) = .....

This is your code as per your question.
What are you using then?

Dabas
 

Author Comment

by:EyalL
ID: 11858079
Dabas,

Please look at your second comment and my reply to that comment.
(You suggested dropping the DataAdapter and using the Command object.)
 
LVL 45

Expert Comment

by:aikimark
ID: 11862485
Eyal,

I'm interested in the amount of time required by your code to read and parse the text file.

I'd recommend reading the entire text file into an array, recording the start and stop time of this read/parse process.  20000 records isn't too much to store in RAM on most PCs.

Questions:
Q1. Are you reading the text file a line-at-a-time or all at once?
Q2. What statements are you using to parse the text file?

==================================
Suggestions:
S1. Read/parse the data and save it as a delimited or fixed-length-field file. Use standard database importing functions.
S2. Make sure you use native datatypes for your variables (avoid Variants when possible).
S3. As stated earlier, look at your indexes, triggers, and inter-table relationships as possible impediments to performance.
S4. Do the appending in two steps: insert into an empty table with no indexes, then execute an INSERT INTO ... SELECT ... (append SQL) statement to append the data into the big table (see the sketch below).
S5. Read the text file with one operation, rather than multiple I/O operations.
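
A rough sketch of S5 and S4 together (the file path and table names are placeholders, and DbConn is assumed to be an open OleDbConnection):

' S5: one read for the whole file instead of line-by-line I/O
Dim sr As New IO.StreamReader("C:\data\import.txt")
Dim lines() As String = sr.ReadToEnd().Split(ControlChars.Lf)
sr.Close()

' ... parse each line (trim the trailing CR) and insert it into an
' empty, unindexed StagingTable ...

' S4: append everything to the big table in one SQL statement
Dim cmd As New OleDbCommand( _
    "INSERT INTO BigTable SELECT * FROM StagingTable", DbConn)
cmd.ExecuteNonQuery()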


