Datatable processing strangely slow

Posted on 2016-09-01
Last Modified: 2016-09-09

I'm developing a VB.Net application, and within that app I have a daily process involving 2 datatables which is surprisingly slow, hence my request for help.

The first datatable is pulled out of an AS/400 DB2 table which has only about 5,000 rows but a lot of columns. I load the whole table, i.e. all the columns, although I need fewer than 20 of them. The Load over the data adapter is quick (less than 10 seconds). The source table has a unique key consisting of 2 small fixed-length strings (7 and 3 characters respectively). The data is about financial instruments, so each instrument has this unique key, about 20 columns of interest, and about 40 which I ignore.
1st question: is there a big performance hit AFTER the load, when processing the table, if the source table has, say, 60 columns and I only need 20?

The second datatable, which is also pulled from an AS/400 DB2 database, has about 100,000 rows and partially contains the same data as the first one. It features the same unique instrument key as the first datatable, but it has 1 row for each combination of key and field, i.e. about 20 rows per instrument for each single row of that instrument in the first datatable.

I.e. if instrument X (with key 1234567-001) in datatable 1 has fields A, B and C, then datatable 2 has 3 rows for instrument X:
- row 1 with key 1234567-001 and field name of field A and field value of field A
- row 2 with key 1234567-001 and field name of field B and field value of field B
- row 3 with key 1234567-001 and field name of field C and field value of field C

Both loads are done like this:
            Dim adapterN As New OdbcDataAdapter(selectCmd, connDB2)
            adapterN.Fill(dataset, "TableN")
where N is 1 or 2
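Regarding the 1st question: Fill copies every selected column into memory, so a SELECT that names only the needed columns reduces both the transfer time and the per-row footprint of the DataTable. A minimal sketch, where the library, table and column names (MYLIB.INSTRUMENTS, INSTKEY1, etc.) are placeholders, not the real DB2 schema:

```vbnet
' Sketch only: select just the ~20 needed columns instead of SELECT *.
' All schema names below are assumed placeholders.
Dim selectCmd As String =
    "SELECT INSTKEY1, INSTKEY2, NAME, ISSUER, ISSUEDATE, ISSUESIZE " &
    "FROM MYLIB.INSTRUMENTS"
Dim adapter1 As New OdbcDataAdapter(selectCmd, connDB2)
adapter1.Fill(dataset, "Table1")
```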

The processing is about comparing the data from datatable 1, which is always up to date, with the data in datatable 2, which has to be updated to be synchronised with datatable 1. I.e. the processing:
- reads each row from datatable 1
- loops on each of the 20 relevant fields of the row
- compares the field values in datatable 2 in each of the corresponding datatable 2 rows with the value in datatable 1, and if they differ:
  - updates the field value in datatable 2 if it finds an entry for that field in datatable 2
  - inserts a new row in datatable 2 if it does not find an entry for that field in datatable 2
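The per-row, per-field lookup in datatable 2 is the hot spot of the steps above. One way to keep everything in memory but avoid a linear scan per lookup is to index datatable 2 once in a Dictionary keyed by instrument key plus field name. This is a hedged sketch, not the original code: the column names (KEY, FLDNAME, FLDVALUE) and the relevantFields list are assumptions.

```vbnet
' One-time index of datatable 2: "instrument key|field name" -> row.
' KEY, FLDNAME, FLDVALUE and relevantFields are assumed names.
Dim index As New Dictionary(Of String, DataRow)()
For Each r As DataRow In dataset.Tables("Table2").Rows
    index(r.Field(Of String)("KEY") & "|" & r.Field(Of String)("FLDNAME")) = r
Next

For Each inst As DataRow In dataset.Tables("Table1").Rows
    Dim key As String = inst.Field(Of String)("KEY")
    For Each fld As String In relevantFields   ' the ~20 relevant field names
        Dim newValue As String = inst(fld).ToString()
        Dim existing As DataRow = Nothing
        If index.TryGetValue(key & "|" & fld, existing) Then
            ' Entry exists: update only if the value differs.
            If existing.Field(Of String)("FLDVALUE") <> newValue Then
                existing("FLDVALUE") = newValue
            End If
        Else
            ' No entry for that field: insert a new row.
            Dim nr As DataRow = dataset.Tables("Table2").NewRow()
            nr("KEY") = key : nr("FLDNAME") = fld : nr("FLDVALUE") = newValue
            dataset.Tables("Table2").Rows.Add(nr)
        End If
    Next
Next
```

Each lookup then costs a single hash probe instead of a scan over 100,000 rows.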

Both these cases (update and insert) are pretty rare, because the financial instrument data does not change much from one day to the next, as it is mainly "base" or static data such as name, issuer, issue date, size and so on. So I was expecting a pretty fast process, but it took HOURS when I ran that synchronisation the first time. It seems that just the compare process, which involves selecting the right entry in the second datatable based on the unique key and the field name from the first datatable, is very slow, and I can't really see why, as everything is happening in memory with datatables, so I'd hoped it would be lightning fast.

Can you see reasons why such a compare process should be slow? How can it be made faster?

Thanks for help, hints and tips...
Question by:bthouin
LVL 12

Accepted Solution

funwithdotnet earned 500 total points
ID: 41779898
If you do a DataTable.Select(myFilter) on a large DataTable, the behavior you describe is expected.

Load the first table and loop through the rows. On each loop, grab the row key and fill DataTable #2 with only the row-relevant data, then do your processing. That should be a lot faster.
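A sketch of that suggested approach, with placeholder schema names and a parameterized query so the WHERE clause pushes the filtering down to DB2:

```vbnet
' Hedged sketch of the per-key fill; MYLIB.INSTFIELDS, KEY, FLDNAME and
' FLDVALUE are assumed placeholder names, not the real schema.
Dim cmd As New OdbcCommand(
    "SELECT KEY, FLDNAME, FLDVALUE FROM MYLIB.INSTFIELDS WHERE KEY = ?",
    connDB2)
Dim keyParam As OdbcParameter = cmd.Parameters.Add("@key", OdbcType.VarChar, 11)
Dim adapter2 As New OdbcDataAdapter(cmd)

For Each inst As DataRow In dataset.Tables("Table1").Rows
    keyParam.Value = inst.Field(Of String)("KEY")
    Dim slice As New DataTable()
    adapter2.Fill(slice)   ' only this instrument's ~20 rows come back
    ' ... compare slice against the corresponding Table1 row here ...
Next
```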

Rule of thumb: never filter DataTables (except tiny ones) when you can have the database do it. The database is much better at it.

Author Comment

ID: 41780770
Very interesting, I never knew that DataTables were only really good when you don't have to filter them. Thanks, I will change the processing and see what happens! Let me get back to you.
