Solved

Datatable processing strangely slow

Posted on 2016-09-01
2
74 Views
Last Modified: 2016-09-09
Hi

I'm developing a VB.Net application, and within that app I have a daily process involving 2 DataTables which is surprisingly slow, hence my request for help.

The first DataTable is pulled from an AS/400 DB2 table which has only about 5,000 rows but a lot of columns. I load the whole table, i.e. all the columns, although I need fewer than 20 of them. The Load over the DataAdapter is quick (less than 10 seconds). The source table has a unique key consisting of 2 small fixed-length strings (7 and 3 characters respectively). The data is about financial instruments, so each instrument has this unique key, about 20 columns of interest, and about 40 which I ignore.
1st question: is there a big performance hit AFTER the load when processing the table if the source table has, say, 60 columns and I only need 20?

The second DataTable, which is also pulled from an AS/400 DB2 database, has about 100,000 rows and partially contains the same data as the first one: it features the same unique instrument key as the first DataTable, but it has 1 row for each combination of key and field name, i.e. about 20 rows per instrument, versus 1 row per instrument in the first DataTable.

I.e. if instrument X (with key 1234567-001) in DataTable 1 has fields A, B and C, then DataTable 2 has 3 rows for instrument X:
- row 1 with key 1234567-001, the name of field A and the value of field A
- row 2 with key 1234567-001, the name of field B and the value of field B
- row 3 with key 1234567-001, the name of field C and the value of field C

Both loads are done like this:
            ' selectCmd is already passed to the constructor, so setting
            ' SelectCommand.CommandText again would be redundant
            Dim adapterN As New OdbcDataAdapter(selectCmd, connDB2)
            adapterN.Fill(dataset, "TableN")
where N is 1 or 2

The processing consists of comparing the data from DataTable 1, which is always up to date, with the data in DataTable 2, which has to be updated to stay synchronised with DataTable 1. I.e. the processing:
- reads each row from datatable 1
- loops on each of the 20 relevant fields of the row
- compares the field values in datatable 2 in each of the corresponding datatable 2 rows with the value in datatable 1, and if they differ:
  - updates the field value in datatable 2 if it finds an entry for that field in datatable 2
  - inserts a new row in datatable 2 if it does not find an entry for that field in datatable 2
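
To make the question concrete, the inner loop is essentially shaped like the following sketch (column names SECKEY, SUBKEY, FLDNAME and FLDVALUE are invented for illustration, and the insert/update branches are simplified):

```vb
' Sketch of the comparison loop described above. Column names
' (SECKEY, SUBKEY, FLDNAME, FLDVALUE) are invented for illustration.
For Each row1 As DataRow In table1.Rows
    For Each fldName As String In relevantFields
        ' DataTable.Select scans all ~100,000 rows of table2 on every call
        Dim filter As String = String.Format(
            "SECKEY = '{0}' AND SUBKEY = '{1}' AND FLDNAME = '{2}'",
            row1("SECKEY"), row1("SUBKEY"), fldName)
        Dim matches() As DataRow = table2.Select(filter)
        If matches.Length = 0 Then
            ' no entry for this field -> insert a new row in table2
            Dim newRow As DataRow = table2.NewRow()
            newRow("SECKEY") = row1("SECKEY")
            newRow("SUBKEY") = row1("SUBKEY")
            newRow("FLDNAME") = fldName
            newRow("FLDVALUE") = row1(fldName)
            table2.Rows.Add(newRow)
        ElseIf Not Object.Equals(matches(0)("FLDVALUE"), row1(fldName)) Then
            ' value differs -> update the entry in table2
            matches(0)("FLDVALUE") = row1(fldName)
        End If
    Next
Next
```

Note the scale: 5,000 rows times 20 fields is 100,000 Select calls, and if each call scans the whole 100,000-row table, that is on the order of 10^10 row comparisons.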

Both these cases (update and insert) are pretty rare, because the financial instrument data does not change much from one day to the next, as it is mainly "base" or static data such as name, issuer, issue date, size and the like. So I was expecting a pretty fast process, but it took HOURS when I ran that synchronisation the first time. It seems that just the compare step, which involves selecting the right entry in the second DataTable based on the unique key and the field name from the first DataTable, is very slow, and I can't really see why, as everything is happening in memory with DataTables, so I'd hoped it would be lightning fast.

Can you see reasons why such a compare process should be slow? How can it be made faster?

Thanks for help, hints and tips...
Bernard
0
Comment
Question by:bthouin
2 Comments
 
LVL 12

Accepted Solution

by:
funwithdotnet earned 500 total points
ID: 41779898
If you do a DataTable.Select(myFilter) on a large DataTable, the behavior you describe is expected.

Load the first table and loop through the rows. On each loop, grab the row key and fill DataTable #2 with row-relevant data and do your processing. That should be a lot faster.
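
A sketch of that approach (table and column names such as MYLIB.TABLE2, SECKEY and SUBKEY are placeholders; the ODBC provider uses positional ? parameter markers):

```vb
' Sketch: refill a small per-instrument DataTable from DB2 on each loop.
' Table and column names are placeholders.
Dim cmd As New OdbcCommand(
    "SELECT FLDNAME, FLDVALUE FROM MYLIB.TABLE2 WHERE SECKEY = ? AND SUBKEY = ?",
    connDB2)
cmd.Parameters.Add("k1", OdbcType.Char, 7)
cmd.Parameters.Add("k2", OdbcType.Char, 3)
Dim adapter2 As New OdbcDataAdapter(cmd)

For Each row1 As DataRow In table1.Rows
    cmd.Parameters(0).Value = row1("SECKEY")
    cmd.Parameters(1).Value = row1("SUBKEY")
    Dim perInstrument As New DataTable()
    adapter2.Fill(perInstrument)   ' only ~20 rows come back per instrument
    ' compare/update/insert against the ~20 rows in perInstrument
Next
```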

Rule of thumb: never filter DataTables (except tiny ones) when you can have the database do it. The database is much better at it.
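
For completeness, if the job must stay entirely in memory, the usual alternative is to give the DataTable a PrimaryKey and look rows up with Rows.Find, which uses an index instead of the linear scan that Select performs. A sketch, with the same invented column names as above:

```vb
' Sketch: indexed in-memory lookup instead of DataTable.Select.
' Column names (SECKEY, SUBKEY, FLDNAME) are invented for illustration.
table2.PrimaryKey = New DataColumn() {
    table2.Columns("SECKEY"),
    table2.Columns("SUBKEY"),
    table2.Columns("FLDNAME")}

' Rows.Find uses the primary-key index: one fast lookup per field
Dim found As DataRow = table2.Rows.Find(
    New Object() {"1234567", "001", "A"})
If found Is Nothing Then
    ' no entry for this field -> insert
End If
```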
0
 
LVL 1

Author Comment

by:bthouin
ID: 41780770
Very interesting, I never knew that DataTables were only really good when you don't have to filter. Thanks, I will change the processing and see what happens! Let me get back to you.
0



