Link to home
Start Free TrialLog in
Avatar of Jay Roy
Jay RoyFlag for United States of America

asked on

sql server delete sql

hi guys

I have my Table customer with data like
count   ID      VERSION       TYPE       
1        126      4           SHOE
3        126      4           SHOE
4        127      6           SHOE
5        127      6           SHOE
6        128      7           SHOE
7        128      7           SHOE
9        128      7           SHOE

I am trying to come up with a sql which will check if ID and VERSION for two or more rows are same for the TYPE = SHOE, then delete the row(s)
with higher count and keep only one row with lower count.
So finally my result set should look like
count   ID        VERSION      TYPE       
1        126      4           SHOE
4        127      6           SHOE
6        128      7           SHOE

trying to come up with the sql but stuck..

delete from customer where TYPE = 'SHOE' and
..

any help will be greatly appreciated.
Thanks.
ASKER CERTIFIED SOLUTION
Avatar of chaau
chaau
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Jay Roy

ASKER

Thanks
Don't I have to use VERSION along with ID in partition clause?
You are right, I have missed it. Here is the correct query:
;with dup as (
SELECT *,
  ROW_NUMBER() OVER (PARTITION BY [TYPE], [ID], [VERSION] ORDER BY [count]) rn
  FROM Table1
  WHERE [TYPE] = 'SHOE')
DELETE FROM dup where rn > 1;

Open in new window

Avatar of Jay Roy

ASKER

I think I got it. You are assigning a rank and deleting ranks >1
Is therr any other way.
Thanks.
Yes, there are other ways too, but this one is the easiest (and fastest in my opinion). Do you wish to explore them?
Avatar of Jay Roy

ASKER

yeah, I was curious to know if we can do it using one single delete statement  instead of a Select ..Delete

Thanks.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@Thandava Vallepalli: Your statement will not work for the cases with duplicate count values. I.e. if there are two or more records with count=4 for ID=126, they all will be left and not deleted
There is no way in my opinion to delete is using a single delete. There must be a select. However, you will be able to insert records to a temp table using this statement:
select [ID], [version], [type], min(Cnt) Cnt
INTO #Temp
from TabVer
group by [ID], [version], [type]

Open in new window

then truncate the original table and re-insert the records from the temp table.
TRUNCATE TabVer
INSERT INTO TabVer ([ID], [version], [type], Cnt)
SELECT [ID], [version], [type], Cnt
FROM #Temp

Open in new window

@Chaau,

Generally most of the tables in RDMBS contain primary key.  I think, [Cnt] is a primary key.
If [Cnt] is not primary key, here is another single delete statement which takes care of duplicate [Cnt] as well.

=========================================
delete t1
from TabVer t1
            inner join (
                                    select row_number() over (partition by [ID], [version], [type] order by [Cnt]) as row_id
                                                ,[Cnt]
                                                ,[ID]
                                                ,[Version]
                                                ,[Type]
                                    from TabVer t2
                                    where t2.[Type] = 'SHOE'
                              ) t3
                        on t1.cnt = t3.cnt
                              and t1.[id] = t3.[id]
                              and t1.[version] = t3.[version]
                              and t1.[type] = t3.[type]
where t1.[Type] = 'SHOE'
            and row_id > 1;
======================================

--itsvtk
@Chaau,

Quick advice....Never ever use truncate table in production database.  
In case of mistakes in queires, we will loose entire data and can't rollback it.


--itsvtk
Hi,

You can try this ..

; WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY id,version ORDER BY counts,id) CTN FROM  [dbo].[T3323]
) 
--SELECT *  FROM CTE WHERE CTN > 1
DELETE FROM CTE WHERE CTN > 1

Open in new window


And we can't write direct delete without select since
Windowed functions can only appear in the SELECT or ORDER BY clauses
@Vikas,

You can write single delete statement (without using window funciton) to delete duplicates from a table, if you have a primary key in that table (I think, [Cnt] is primary key).  Please check my first answer above for single delete statement.

If you don't have primary key in a table, you can still delete duplicate records using window function in a single delete statement. Please check my second answer above.


--itsvtk
@Thandava,

Thanks for you valuable comment but in your second example is the Window function is used with Select Statement
(
select row_number() over (partition by [ID], [version], [type] order by [Cnt]) as row_id
                                                ,[Cnt]
)

and that is what my comment says that without Select you can't use Windows Function..

Thanks again......
@Vikas,

The select statement with window function is required only if a table doesn't have a primary key.  
Most of the cases, in RDBDS, every table contains a primary key / composite primary key.

I assumed that [Cnt] is the primary key column in customer table. If my assumption is correct, then no need of window function.

Here is one statement without any window function to delete duplicates.

==============================================================
DELETE t1
FROM TabVer t1
WHERE Cnt > ( SELECT min(Cnt)
                        FROM TabVer t2
                        WHERE t1.[ID] = t2.[ID]
                                    AND t1.[version] = t2.[version]
                                    AND t1.[type] = t2.[type]
                  )
            AND t1.[Type] = 'SHOE';
=======================================================



If [Cnt] is not a primary key, then have to use window function. But still it is a single statement which deletes duplicates, bit complicated though.
=======================================================
DELETE t1
FROM TabVer t1
            INNER JOIN (
                                    SELECT ROW_NUMBER() OVER (PARTITION BY [ID], [version], [type] ORDER BY [Cnt]) AS row_id
                                                ,[Cnt]
                                                ,[ID]
                                                ,[Version]
                                                ,[Type]
                                    FROM TabVer t2
                                    WHERE t2.[Type] = 'SHOE'
                              ) t3
                        ON t1.cnt = t3.cnt
                              AND t1.[id] = t3.[id]
                              AND t1.[version] = t3.[version]
                              AND t1.[type] = t3.[type]
WHERE t1.[Type] = 'SHOE'
            and row_id > 1;
=======================================================


--itsvtk
If count can have duplicates, you must use a Windowing function to do this with any efficiency at all.


Btw:
>> Never ever use truncate table in production database.  
In case of mistakes in queries, we will lose entire data and can't rollback it. <<

FALSE: Truncate CAN be rolled back.
Thanks for the clarification @ScottPletcher

--itsvtk
Avatar of Jay Roy

ASKER

thanks all for the answers.