[Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

SQL Update statement "apparently" duplicating rows

Posted on 2012-09-13
5
Medium Priority
?
455 Views
Last Modified: 2012-09-14
This has had me scratching my head for a while.

I have two tables ~50 columns x 350k+ rows in MSSQL Server 2000.

The two tables are almost identical in structure.

There are 3 important columns: The primary key column (“ABC_ID”), an “id” column (“id”) created by “others” and a datetime field with a getdate() default (“Created_On”). The remaining columns are a mix of char, decimal, text, datetime etc.

Table A contains “correct” data and Table B needs to be made a copy of Table A.

First I perform an Insert for all rows in Table A that are not in Table B – This works correctly.
Then I perform an Update for rows where the “id” columns are equal.

The behaviour that I am seeing is:

1.      After the Insert there are no duplicate rows with the same values for “id”
2.      After the Update there ARE duplicate rows with the same “id” but unique ABC_IDs
3.      The duplicate rows have the SAME datetime value in “Created_On”

So, can anyone suggest what’s going on here? Naturally, I don’t want these duplications. The duplicate rows look identical with the exception of the PK.

For completeness I should mention that I also perform a Delete for rows in Table B but missing from Table A. Although I have excluded this operation from having any influence on the observed behaviour.

If I have missed some vital info I apologise.

Thanks in advance!
0
Comment
Question by:Austrian_Des
  • 2
  • 2
5 Comments
 
LVL 12

Expert Comment

by:Jared_S
ID: 38395777
It kind of sounds like your update statement is updating where ABC_IDs match in the two tables when it should be matching on ABC_IDs and "id".
0
 

Author Comment

by:Austrian_Des
ID: 38396143
Hi Jared,

Thanks for your reply.

The match is on the "id" for both tables. The ABC_ID only exists in Table B (I created it to help me track "when" the rows were being created. Also the "Created_On" field for a similar purpose.

Could you clarify one part of your answer: Are you saying that it is possible for an incorrect join in an update statement to result in the creation of new rows? I'm interested to understand this as my understanding is that this wouldn't happen - but perhaps that's why I'm not an expert ;-)

Des
0
 
LVL 12

Assisted Solution

by:Jared_S
Jared_S earned 1000 total points
ID: 38396297
How are your row counts for Table B affected by the update?

Can you post your update statement?
0
 
LVL 71

Accepted Solution

by:
Qlemo earned 1000 total points
ID: 38396932
You can't insert new rows by using an UPDATE, so you probably update more than one row per ID - though I have no idea why, without seeing your SQL.

Your sequence (INSERT missing rows, UPDATE all rows with same ID) is ok if you only insert basic data (like the key columns). Otherwise you update too many rows, though nothing has changed, as you will update the new records in addition.
0
 

Author Closing Comment

by:Austrian_Des
ID: 38397901
Jared: Your question about row counts at first seemed irrelevant, but as I had hit a wall, I figured “Don’t ask a question and then ignore the help!” So, I started digging into this and the answers were very revealing. I truncated Table B and then ran the Insert code. The row count was 352992. I then checked for duplicate “id”s and there were none. I then ran the Update code and the row count was 352992 (the same) but now I had 500+ duplicate “id”s … How could this be?

To determine duplicates I used the following code:

SELECT     id, COUNT(id), field_x
FROM         Table B
GROUP BY id, field_x
HAVING      (COUNT(id) > 1)

For the sake of this example let’s say field_x is the invoice number. So my code was designed to show any invoice that contained duplicate “id”s. After the truncate, Insert process the above code returned Zero rows. After the update it showed 500+.

What I had failed to realise was that in Table A there were 500+ duplicate “id”s but with DIFFERENT invoice numbers. After the Update the invoice numbers (field_x) had been modified to show the same number.

Starting from scratch, I ran the truncate, Insert cycle and then used:

SELECT     id, COUNT(id)
FROM         Table B
GROUP BY id
HAVING      (COUNT(id) > 1)

And sure enough, up came the duplicates!

Qlemo: I’m glad that you were able to confirm that I can’t insert new rows using an Update. Also, I will modify my code to execute in the following sequence: Delete, Update, Insert – A good point about unnecessary updates! Finally, your comment, “you probably update more than one row per ID” was spot on, I just couldn’t see it!

Hope you are both OK with a split of the points?

Many thanks,

Des
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article shows gives you an overview on SQL Server 2016 row level security. You will also get to know the usages of row-level-security and how it works
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Viewers will learn how the fundamental information of how to create a table.
Suggested Courses

868 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question