Howard Bash
asked on
SQL CLEARING DUPES
I have a table with lots of duplications and I need to clean out the dupes . I can determine the sets of dupe rows within using SQL like the following:
select key1, key2, key3, sum(1) as RecCount
from myTable
group by key1, key2, key3
having sum(1) > 1
Giving back something like:
A11, B12, C13, 2
A21, B22, C23, 14
A31, B32, C33, 50
…
…
And I hoped that I could kill the rows using something like the following:
Delete myTable T
Where T.key1, T.key2, T.key3 in
(
select key1, key2, key3, sum(1) as RecCount
from myTable
group By key1, key2, key3
having sum(1) > 1
)
Unfortunately this nor any variation I have come up with will allow me to delete the dupes.
If you think about it, the first query identifies all the rows that are dupes and groups them with their group counts. I need to get all the counts to 1.
select key1, key2, key3, sum(1) as RecCount
from myTable
group by key1, key2, key3
having sum(1) > 1
Giving back something like:
A11, B12, C13, 2
A21, B22, C23, 14
A31, B32, C33, 50
…
…
And I hoped that I could kill the rows using something like the following:
Delete myTable T
Where T.key1, T.key2, T.key3 in
(
select key1, key2, key3, sum(1) as RecCount
from myTable
group By key1, key2, key3
having sum(1) > 1
)
Unfortunately this nor any variation I have come up with will allow me to delete the dupes.
If you think about it, the first query identifies all the rows that are dupes and groups them with their group counts. I need to get all the counts to 1.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
The table has no primary key.
ASKER
I am thinking that this CTE technique and the related SQL cannot be run in a query window but as a stored proc. Is that the case?
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I have found an neater technique for deleting dupes which I will paste for your information:
DELETE FROM DATA_PLAN
WHERE rowid NOT IN ( SELECT min(rowid)
FROM DATA_PLAN
GROUP BY
PLAN ,
CUR_EFFECTIVE_DATE ,
CUR_VISIT_REASON ,
CUR_TRANS_ID ,
CHANGED_TIME ,
PICKUP_TIME ,
BATCH_STATUS
);
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I suppose you could copy all the data to a temp table do the delete, and then truncate the old table and insert the newly cleaned temp rows. Or depending on the situation, during a maintainence window, have a DBA run the SQL when the database is not available.
But, I understand your point.
But, I understand your point.
ASKER
Regarding the first link, the sql copied:
SELECT col1, col2, col3=count(*)
INTO holdkey
FROM t1
GROUP BY col1, col2
HAVING count(*) > 1
must be a typo and should be:
SELECT col1, col2, col3, count(*)
INTO holdkey
FROM t1
GROUP BY col1, col2
HAVING count(*) > 1