Solved

How can I update a database table with a .Net DataTable and ignore existing records?

Posted on 2014-12-09
19
17 Views
Last Modified: 2016-06-16
I have a .Net DataTable that contains records, all of which are "added" records. The corresponding table in the database may contain millions of rows. If I attempt to simply call the "Update" method on my SqlDataAdapter, any existing records cause an exception to be raised due to a violation of the primary key constraint. I considered loading all of the physical table's records into a second DataTable instance, merging the two, and then calling the Update method on the second DataTable. This actually works exactly like I want. However, my concern is that if there are 30 billion records in the physical table, loading all of that data into a DataTable in memory could be an issue.

I considered selecting a sub-set of data from the physical table and proceeding as described above, but the construction of the sub-query has proved to be very involved and very tedious. You see, I am not working with a single known table. I am working with a DataSet that contains several hundred DataTables. Each of the DataTables maps to its own physical table. The name and schema of the tables are not known at compile time. This has to all be done at run time.

I have played with the SqlBulkCopy class but have the same issue - duplicate records raise an exception.

I don't want to have to dynamically construct queries for each table at run time. If that is the only way, so be it, but I just can't help but think that there must be a simpler solution using what Ado.Net provides.
0
Comment
Question by:Cory Miller
  • 7
  • 5
  • 4
  • +1
19 Comments
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40490622
I would first retrieve those records from the physical table that match your datatable into a (temp) datatable.
The records of the datatable that don't match the records of the temp datatable are new, and these you can append to the physical table.
With potentially 30 bil. records, I see no other method.

/gustav
0
 
LVL 40

Expert Comment

by:Vadim Rapp
ID: 40491221
Update command should never produce primary key violations because it does not insert new records, it updates the existing ones. The only scenario where it's possible is if the table has unique key (in addition to the primary key), and updated value in your existing record violates it.

In any case, in order to bypass existing records, you need to modify Update command the adapter is using, for example:

instead of

update mytable set col1=newvalue where pk=mykey
you use
if not exists(select 1 from mytable where col1=newvalue) update mytable set col1=newvalue where pk=mykey
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40491279
Can you post the table structure and tell us which field or fields are the primary key?
For a large table I would define an IDENTITY column as primary key so SQL Server engine will take care of not repeating values and you avoid the error without any extra work.
0
 

Author Comment

by:Cory Miller
ID: 40491631
I appreciate everyone's quick responses. Thank you. Please see my responses to each individual comment below:

Gustav Brock - I cannot retrieve the physical table's records into a temp DataTable because I could potentially have in excess of 30 billion records. The DataTable is a .NET in-memory data structure. That was to be my original solution until I realized the amount of data that I could potentially be dealing with.

vadimrapp1 - When I talked about "Update", I was referring to the "SqlDataAdapter.Update()" method. This method will perform INSERT, UPDATE, and DELETE operations on its corresponding physical table depending on the state of the data in the DataTable. Additionally, business requirements dictate that existing physical records not be updated. I am only allowed to insert new records.

Vitor Montalvão - Unfortunately I do not have the latitude to change the structure of the physical table. An IDENTITY column is not an option for me.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40491638
What about my first question?
Can you post the table structure and tell us which field or fields are the primary key?
0
 
LVL 40

Expert Comment

by:Vadim Rapp
ID: 40491682
> I was referring to the "SqlDataAdapter.Update()" method.

"Update" method runs Update command defined in the Properties of the adapter. Review that command.

>  Additionally, business requirements dictate that existing physical records not be updated. I am only allowed to insert new records.

While updating, the only way to hit  "violation of the primary key constraint" is if you actually try to change primary key.
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40491843
> I cannot retrieve the physical table's records into a temp DataTable because I could
> potentially have in excess of 30 billion records.

That's not what I wrote:

> I would first retrieve those records from the physical table that match your datatable
> into a (temp) datatable.

Could be done with a: select id from table where id in (<list of ids>)

/gustav
0
 

Author Comment

by:Cory Miller
ID: 40492270
Gustav Brock - Thank you for clarifying your response. However, I had already considered a similar solution. But I really want to avoid going that direction.

From my original post:

I considered selecting a sub-set of data from the physical table and proceeding as described above, but the construction of the sub-query has proved to be very involved and very tedious. You see, I am not working with a single known table. I am working with a DataSet that contains several hundred DataTables. Each of the DataTables maps to its own physical table. The name and schema of the tables are not known at compile time. This has to all be done at run time.

I would have to dynamically construct queries for hundreds of tables each with compound primary keys - something like:

SELECT * FROM table WHERE Id01 in (<list of Id01's> AND Id02 in (<list of Id02's> AND Id03 in (<list of Id03's> AND... 

Open in new window


While not impossible to do, it is not exactly in line with my original question, that is:

I don't want to have to dynamically construct queries for each table at run time. If that is the only way, so be it, but I just can't help but think that there must be a simpler solution using what Ado.Net provides.
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40492292
Oh, I see. I thought you could identify records by a single unique value.

Then I think your best option would be to copy your datatable to a temp table on the server and then run an update query on the server via a pass-through query or a stored procedure.

/gustav
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 40

Expert Comment

by:Vadim Rapp
ID: 40493938
Usually, the attempts to resolve the problem without understanding what's causing it, are doomed.
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40494014
Nevertheless, sometimes such attempts may help you understand what's causing the problem.

/gustav
0
 

Author Comment

by:Cory Miller
ID: 40494294
Vitor Montalvão - I don't believe that it would necessarily be helpful to post the table structure. I am working with several hundred tables with compound primary keys that may be 1 to many columns. I have to dynamically handle each table individually at run time. My code will not know the table structure at compile time.

vadimrapp1 -
"Update" method runs Update command defined in the Properties of the adapter. Review that command.
That's not exactly what that command does. I have reviewed it exhaustively and found no solution. That is why I elected to post my original question. The "Update" command on a SqlDataAdapter will attempt to execute a SQL "INSERT" if the record has been added to the source DataTable. Every single record in my DataTable objects are added records. Therefore, the SqlDataAdapter attempts to execute an INSERT for every record in the DataTable. If the record already exists in the physical table, an exception is raised.
While updating, the only way to hit  "violation of the primary key constraint" is if you actually try to change primary key.
The SqlDataAdapter is NOT doing UPDATEs. It is attempting to do INSERTs as I described above.
0
 
LVL 40

Expert Comment

by:Vadim Rapp
ID: 40494366
So that's your answer then: modify INSERT command of the adapter so it does not do anything. For example

INSERT INTO MYTABLE(MYCOLUMN) SELECT 1 WHERE 'USA'='LAND OF THE FREE'

or even

SELECT 1
0
 

Author Comment

by:Cory Miller
ID: 40499413
I appreciate everyone's input and suggestions. Unfortunately, it appears that there really is not answer to my issue (using built in ADO.NET library functions to conditionally insert a record into a physical table). Because I must move on and get this project wrapped up, I am implementing my original solution - iterate the DataRow objects in my DataTable instance, check to see if a corresponding physical record exists, and skip it if it does, otherwise go ahead and insert it. I am implementing this by extending the DbDataAdapter class and overriding the "Update" method to perform the logic I outlined above.
0
 
LVL 40

Expert Comment

by:Vadim Rapp
ID: 40499652
Please explain, why modifying InsertCommand used by the adapter so it does not really insert, is not a solution. I actually tested it by creating a small project, and it worked as expected. I was indeed wrong in the previous comment when I said that Update method only updates - you are right, it also inserts missing and deletes extra; but still, it's doing it by launching respectively specified InsertCommand and DeleteCommand, which are adapter's properties. If you change those commands from what the wizard has created, to something else, like the examples I gave above, then it will launch that command and the command will not do anything.
0
 

Author Comment

by:Cory Miller
ID: 40499735
Vadim Rapp,

If I understand correctly, you are saying that one can set the "InsertCommand" property on the DataAdapter to something that does not perform any INSERTs. That would work if I did not want to INSERT any records at all. But how do I create an "InsertCommand" that INSERTs only when there is no existing physical record. The InsertCommand must insert brand new records but must skip records that are already in the physical table. Maybe I am not understanding your suggestion. Can you attach your small project that you created for me so I can take a look? Thanks!
0
 
LVL 40

Expert Comment

by:Vadim Rapp
ID: 40499763
Will attach tomorrow morning.
0
 
LVL 40

Accepted Solution

by:
Vadim Rapp earned 500 total points
ID: 40500236
https://filedb.experts-exchange.com/incoming/ee-stuff/8398-WindowsApplication1.rar

this is not the one I created back then, but I re-created now just the adapter so you can see how InsertCommand is specified:

insert into mytable(pk) select @p1 where not exists(select 1 from mytable where pk=@p1)

Another way to accomplish the same is to create instead-of trigger on the table:

create trigger mytable_i on mytable instead of insert as
insert mytable (pk,col1,col2) select pk,col1,col2 from inserted
where not exists(select 1 from mytable where mytable.pk=inserted.pk)
0

Featured Post

Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now