Curtis Long

asked on

SQL 2008 R2 Duplications

I have a table in SQL with the following columns:

Column Name                     Data Type         Allow Nulls
DATE                            date              Checked
APPLICANT                       varchar(MAX)      Checked
REMARKS                         varchar(MAX)      Checked
[CARRIER TYPE]                  varchar(MAX)      Checked
QUANTITY                        numeric(18, 2)    Checked
COUNT                           numeric(18, 2)    Checked
CONTAINER                       varchar(MAX)      Checked
GRADE                           varchar(50)       Checked
[CERTIFICATE NUMBER]            numeric(18, 0)    Checked
[QUANTITY THOUSAND WEIGHT]                        Checked
[QUANITIY HUNDRED WEIGHT]                         Checked
TempID                          int               Unchecked

I have about 700 duplicate rows that appeared today.  I need to list them and delete them.

How can I go about doing this??
Jim Horn

To identify the duplicates, assuming we're talking about all columns, give the below T-SQL a whirl, replacing the obvious placeholders:
SELECT column1, column2, etc, columnN, COUNT(ColumnN)
FROM YourTable
GROUP BY column1, column2, etc, columnN
HAVING COUNT(ColumnN) > 1


To delete them, you'll need to spell out the logic behind which one to delete.  Is there an identity field somewhere you can use to decide which copy to keep and which to delete?
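As a sketch only (not part of the answer above): if TempID is a unique value on every row, one common way to delete duplicates while keeping a single copy of each is a ROW_NUMBER() CTE like the one below. The table name and the PARTITION BY list are placeholders to adjust, and keeping the lowest TempID per group is just one possible rule.

-- Sketch: delete duplicate rows, keeping the row with the lowest TempID in each group.
-- Replace the PARTITION BY list with whichever columns define a "duplicate".
;WITH Ranked AS
(
    SELECT TempID,
           ROW_NUMBER() OVER (PARTITION BY column1, column2, column3, column4
                              ORDER BY TempID) AS rn
    FROM YourTable
)
DELETE FROM Ranked
WHERE rn > 1;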
Curtis Long (ASKER)

Is there a way to select these columns only if column 1, column 2, column 3, and column 4 are the same??
Yes, in my column1..columnN example, replace it with 1..4
SELECT column1, column2, column3, column4, COUNT(Column4)
FROM YourTable
GROUP BY column1, column2, column3, column4
HAVING COUNT(column4) > 1


So I typed the following for my query:

SELECT   DATE, APPLICANT, REMARKS, [CARRIER TYPE], QUANTITY, COUNT, CONTAINER, GRADE, [CERTIFICATE NUMBER], [QUANTITY THOUSAND WEIGHT],
                      [QUANITIY HUNDRED WEIGHT], TempID, COUNT ([CERTIFICATE NUMBER]..COUNT)
FROM         INSPECTION_v2
GROUP BY DATE, APPLICANT, REMARKS, [CARRIER TYPE], QUANTITY, COUNT, CONTAINER, GRADE, [CERTIFICATE NUMBER], [QUANTITY THOUSAND WEIGHT],
                      [QUANITIY HUNDRED WEIGHT], TempID

having COUNT ([CERTIFICATE NUMBER]..COUNT) > 1

I get this error:

Msg 4104, Level 16, State 1, Line 7
The multi-part identifier "CERTIFICATE NUMBER..COUNT" could not be bound.
Msg 4104, Level 16, State 1, Line 2
The multi-part identifier "CERTIFICATE NUMBER..COUNT" could not be bound.

And yes, I am a noob!  ;-)
>having COUNT ([CERTIFICATE NUMBER]..COUNT) > 1
Lose the dot dot COUNT.  COUNT() takes a single column, so COUNT([CERTIFICATE NUMBER]) is all you need there.
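Putting that fix into the full query would look something like the sketch below (my reading of the fix, not necessarily the exact code used). One assumption worth noting: if TempID is a unique identity value, it also has to come out of the SELECT and GROUP BY lists, otherwise every group holds exactly one row and nothing is ever flagged as a duplicate.

SELECT   [DATE], APPLICANT, REMARKS, [CARRIER TYPE], QUANTITY, [COUNT], CONTAINER, GRADE,
         [CERTIFICATE NUMBER], [QUANTITY THOUSAND WEIGHT], [QUANITIY HUNDRED WEIGHT],
         COUNT([CERTIFICATE NUMBER]) AS DupeCount
FROM     INSPECTION_v2
GROUP BY [DATE], APPLICANT, REMARKS, [CARRIER TYPE], QUANTITY, [COUNT], CONTAINER, GRADE,
         [CERTIFICATE NUMBER], [QUANTITY THOUSAND WEIGHT], [QUANITIY HUNDRED WEIGHT]
HAVING   COUNT([CERTIFICATE NUMBER]) > 1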
>And yes, I am a noob!  ;-)
That's okay, I'm a sucker for points.

By the way, I have an article out there on SQL Server GROUP BY expressions that deals with COUNT and HAVING; it may be a good read.
I will definitely read that.

Now I am really confusing myself, so thanks for your patience!!

If I want to list all the columns whenever the values in QUANTITY and [CERTIFICATE NUMBER] are duplicated, what would I replace in your sample query??


ASKER CERTIFIED SOLUTION
Sharath S

This solution is only available to Experts Exchange members.
That works PERFECTLY!!

Actually I went from typing a big long "I'm so confused" message to looking a couple of things up and learning something.  :-)

One thing I AM confused about.

I tried this out on my test database.

The query to display duplicates returned 772 items.

When I ran the delete query it deleted around 2100 rows.

Why would I get different numbers??

Anyway, thanks so much for your help!!
772 distinct combinations of QUANTITY and [CERTIFICATE NUMBER] have duplicates, and there are 2100 duplicate records in total.

For example,
QUANTITY   [CERTIFICATE NUMBER]
100        A
100        A
200        B
100        A
200        B
300        C

In this example, there are 2 distinct duplicated combinations, whereas there are 5 duplicate records.
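To make the difference concrete, here is a throwaway sketch (my own illustration, not taken from the accepted answer) that reproduces the 2-versus-5 split on the sample data above:

-- Illustration only: 2 duplicated combinations vs. 5 records belonging to a duplicated combination.
CREATE TABLE #t (QUANTITY numeric(18, 2), [CERTIFICATE NUMBER] varchar(10));
INSERT INTO #t VALUES (100, 'A'), (100, 'A'), (200, 'B'), (100, 'A'), (200, 'B'), (300, 'C');

-- Distinct duplicated combinations: returns 2 rows (100/A and 200/B)
SELECT QUANTITY, [CERTIFICATE NUMBER], COUNT(*) AS cnt
FROM #t
GROUP BY QUANTITY, [CERTIFICATE NUMBER]
HAVING COUNT(*) > 1;

-- Records that belong to a duplicated combination: returns 5
SELECT COUNT(*) AS duplicate_records
FROM #t AS t
WHERE EXISTS (SELECT 1
              FROM #t AS d
              WHERE d.QUANTITY = t.QUANTITY
                AND d.[CERTIFICATE NUMBER] = t.[CERTIFICATE NUMBER]
              HAVING COUNT(*) > 1);

DROP TABLE #t;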
So sorry!!  I thought I had awarded points yesterday.

Thanks for everything!!