Solved

Import data from text file to mdb file

Posted on 2004-08-15
18
483 Views
Last Modified: 2008-02-01
I have a text file with 20000 records, 6 fields each record.
What is the fastest way (in execution time) to import the data into an existing table?
I tried this code, but it takes more than 2 minutes to add all records.

Const TransactionSize As Integer = 1000

While MoreRecords
    dr = DataSet.Tables(0).NewRow()
    dr(0) = .....
    dr(1) = .....

    DataSet.Tables(0).Rows.Add(dr)
    RecordsCount = RecordsCount + 1

    If RecordsCount = TransactionSize Then
        DataAdapter.Update(DataSet)
        DataSet.AcceptChanges()
        RecordsCount = 0
    End If
End While
0
Comment
Question by:EyalL
18 Comments
 
LVL 27

Expert Comment

by:Dabas
Hi EyalL:
Part of your problem is that you are using DataAdapter.Update(DataSet) INSIDE your loop.

You will see a large decrease in execution time if you move this line to after the End While.
You also will not need the DataSet.AcceptChanges() at all.

Dabas
0
 

Author Comment

by:EyalL
The DataAdapter.Update runs for every 1000 records and not for each record.
Pay attention to the 'if' statement and the constant value.

When I run the Update outside the loop, it takes about 2 minutes to execute. Same as when I run it after every 1000 records.

Eyal.
0
 
LVL 27

Expert Comment

by:Dabas
EyalL:
You are right, I did not pay attention. Apologies.
What if you drop the DataAdapter altogether and use a command object with the corresponding INSERT command instead?
It would be inserting each row separately, but might save time by avoiding the unnecessary DataAdapter overhead.
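
A minimal sketch of that approach (identifiers such as DbConn and the column names are illustrative, not from your project). Wrapping the loop in an explicit OleDbTransaction may also help, since otherwise Jet commits every INSERT individually:

```vbnet
' Sketch: one parameterized INSERT reused for every row, inside one
' transaction so Jet does not commit each statement separately.
Dim tx As OleDbTransaction = DbConn.BeginTransaction()
Dim cmd As New OleDbCommand( _
    "INSERT INTO table (a,b,c,d,e,f) VALUES (?,?,?,?,?,?)", DbConn, tx)
cmd.Parameters.Add("@p1", OleDbType.Integer)
' ... add the remaining five parameters with their types ...

While MoreRecords
    cmd.Parameters(0).Value = Value1
    ' ... set the remaining parameter values ...
    cmd.ExecuteNonQuery()
End While
tx.Commit()
```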


Dabas
0
 

Author Comment

by:EyalL
I have tried this:

SqlCmd = New OleDbCommand("INSERT INTO table (a,b,c,d,e,f) values (?,?,?,?,?,?)", DbConn)

SqlCmd.Parameters.Add("@p1", Type1)
SqlCmd.Parameters.Add("@p2", Type2)
SqlCmd.Parameters.Add("@p3", Type3)
SqlCmd.Parameters.Add("@p4", Type4)
SqlCmd.Parameters.Add("@p5", Type5)
SqlCmd.Parameters.Add("@p6", Type6)

While MoreRecords
    SqlCmd.Parameters(0).Value = Value1
    SqlCmd.Parameters(1).Value = Value2
    .
    .

    SqlCmd.ExecuteNonQuery()
    RecordsCount = RecordsCount + 1
End While


Still takes the same time to execute.
0
 
LVL 27

Expert Comment

by:Dabas
EyalL,
OK.
I do not think we can improve anything as far as the code is concerned.
There are other issues that can affect performance:
for example, your connection, the CPU speed of both client and server, the network card and connection, etc.

Have you tried running the same routine from another computer, for example?

Dabas
0
 

Author Comment

by:EyalL
The mdb file is placed on the same computer that runs the program so no connection or network is involved.

I tried pasting the same number of records into the same table in the same mdb file using MS Access, and it goes much faster: about 15 seconds, which is reasonable for 20,000 records.

I don't see any reason why MS Access should do it 10 times faster than VB code.

Any ideas?
0
 
LVL 27

Expert Comment

by:Dabas
EyalL,
Sorry. No further ideas. Maybe a 20-point question in the VB Database area with a link to this question might attract a database expert to voice his opinion.

Dabas
0
 
LVL 18

Expert Comment

by:JR2003
It could be because there is an index on the table.

If you do the following it may be quicker:
1. Drop the indexes on the table you are inserting into.
2. Insert the records.
3. Recreate the indexes.

How many records in total are there in the table you are inserting into?
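
A sketch of steps 1 and 3 (index and table names are illustrative). Jet DDL can be run through an OleDbCommand on the same connection:

```vbnet
' Drop the index, do the bulk insert, then recreate the index.
Dim cmd As New OleDbCommand("DROP INDEX idxA ON table", DbConn)
cmd.ExecuteNonQuery()

' ... perform the inserts here ...

cmd.CommandText = "CREATE INDEX idxA ON table (a)"
cmd.ExecuteNonQuery()
```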
0
 
LVL 29

Expert Comment

by:leonstryker
I think the fastest way would be to let Access handle all the work.  Create an instance of Access and let it pull in the text data without using ADO.NET.

Take a look here to see how it is done directly in Access:

http://www.experts-exchange.com/Databases/MS_Access/Q_11873739.html

Leon
0

 

Author Comment

by:EyalL
JR2003,
Removing all relationships and indexes doesn't help.

leonstryker,
The customer computers don't have MS Access installed...

I can't understand why it takes so much time to import 20,000 records. There must be a way to do it faster...
0
 
LVL 29

Accepted Solution

by:
leonstryker earned 250 total points
Ok, what about this:

http://support.microsoft.com/default.aspx?scid=http://support.microsoft.com:80/support/kb/articles/Q262/5/37.ASP&NoWebContent=1

The part which may be useful is this one:

cn.Execute "INSERT INTO Table1 SELECT * FROM " & _
             "[Text;Database=" & PathtoTextFile & ";HDR=YES].[TextFile.txt]"
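
That snippet uses classic ADO (cn.Execute); in ADO.NET the same statement could be run through an OleDbCommand (the folder path here is illustrative):

```vbnet
' Let the Jet Text IISAM driver read the whole file in one statement.
Dim sql As String = _
    "INSERT INTO Table1 SELECT * FROM " & _
    "[Text;Database=C:\Data;HDR=YES].[TextFile.txt]"
Dim cmd As New OleDbCommand(sql, DbConn)
cmd.ExecuteNonQuery()
```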

Leon
0
 

Author Comment

by:EyalL
leonstryker,

It looks much better now, but my text file doesn't have any delimiters or commas. Only fixed size fields.

The file has the following format on each line:

field1: 10 digits number
field2: 6 characters (not only ASCII codes).
field3: 4 digits number

Example:
0000000972 5■j╞94 1999

Is it possible to import the file without re-ordering the text file? Can I specify the structure of the text file in the SQL statement?

Thanks,
Eyal.
0
 
LVL 29

Expert Comment

by:leonstryker
Honestly, I do not know. If it is possible, you would probably need something like schema.ini:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetschema_ini_file.asp
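
For a fixed-length file like the one described above, the schema.ini (placed in the same folder as the text file) might look something like this. The column names are illustrative, and the widths may need adjusting to account for the separating spaces between fields:

```ini
[TextFile.txt]
Format=FixedLength
ColNameHeader=False
Col1=Field1 Long Width 11
Col2=Field2 Text Width 7
Col3=Field3 Long Width 4
```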

Leon
0
 
LVL 27

Expert Comment

by:Dabas
Eyal:

Maybe the problem is that using DataSets is what is slowing you down?

I just stumbled on the following article, which might suggest a different way of doing things:

http://www.fawcette.com/reports/vsliveor/2003/nilsson3/

Here is part of a relevant quote:

----Quote----
Assume you need to send some orders in a data container. If you choose a DataSet as the data container, it will—according to my tests—take approximately three times as long as if you had used a custom order collection as the data container. And that is end to end, meaning it includes the time to fetch the order from the database and the time to inspect each value in the order at the consumer side.

The biggest reason for this is probably that the DataSet is serialized as XML even when a binary formatter is used, so the DataSet will be approximately five times as large as the custom collection over the wire.
----UnQuote----

HTH

Dabas

0
 

Author Comment

by:EyalL
leonstryker,

I don't think that schema.ini can work with OleDB.

Anyway, I created a new text file based on the data from the original text file and imported the new one. It really works very fast (a few seconds for all the records), but...

In order to do it right, without creating new text file every import, I need to know how to import the original text file directly.

Dabas,
I'm not using DataSet...

thanks,
Eyal.
0
 
LVL 27

Expert Comment

by:Dabas
Eyal.

while MoreRecords
    dr = DataSet.Tables(0).NewRow()
    dr(0) = .....
    dr(1) = .....

This is your code as per your question.
What are you using then?

Dabas
0
 

Author Comment

by:EyalL
Dabas,

Please look at your second comment and my reply to that comment.
(You have suggested to drop the DataAdapter and use the Command Object).
0
 
LVL 45

Expert Comment

by:aikimark
Eyal,

I'm interested in the amount of time required by your code to read and parse the text file.

I'd recommend reading the entire text file into an array, recording the start and stop time of this read/parse process.  20000 records isn't too much to store in RAM on most PCs.

Questions:
Q1. Are you reading the text file a line-at-a-time or all at once?
Q2. What statements are you using to parse the text file?

==================================
Suggestions:
S1. Read/parse the data and save it as a delimited or fixed-length field file.  Use standard database importing functions.
S2. Make sure you use native datatypes for your variables (avoid variants when possible).
S3. As stated earlier, look at your index, trigger, and inter-table relationships as being impediments to performance.
S4. Do the appending in two steps...Insert into an empty table with no indexes.  Then execute an Insert Into ... Select ... (append SQL) statement to append the data into the big table.
S5. Read the text file with one operation, rather than multiple I/O operations.
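
S5 and the parsing in S1 could be sketched like this (a sketch only; the file path is illustrative, and the field offsets are inferred from the example line posted earlier: a 10-digit number, a space, 6 characters, a space, and a 4-digit number):

```vbnet
' Read the whole file in a single I/O operation, then split into lines.
Dim sr As New StreamReader("C:\Data\TextFile.txt")
Dim lines() As String = sr.ReadToEnd().Split(ControlChars.Lf)
sr.Close()

' Parse each fixed-width line into its three fields.
Dim line As String
For Each line In lines
    If line.Length >= 22 Then
        Dim f1 As String = line.Substring(0, 10)   ' 10-digit number
        Dim f2 As String = line.Substring(11, 6)   ' 6 characters
        Dim f3 As String = line.Substring(18, 4)   ' 4-digit number
    End If
Next
```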


0
