We'd like to run a query like the one below to find duplicate entries in our recordset, save those records to a separate table, and update the source table with a flag to mark the record as found.
Currently, we're doing all 3 outside access and one at a time:
Query 1) SELECT * FROM MyTable AS mt WHERE (((Exists (SELECT ColumnA, ColumnB, ColumnC FROM MyTable AS a WHERE a.ColumnA = mt.ColumnA AND ABS(a.ColumnB) = ABS(mt.ColumnB) AND a.ColumnC = mt.Columnc GROUP BY a.ColumnA, a.ColumnB, a.ColumnC HAVING Count(*)>1))<>False)) ORDER BY mt.ColumnA, mt.ColumnB, mt.ColumnC;
Query 2) loop through all records returned from Query 1 and insert those records in a new table, but process Query 3 below before moving on to next record from Query 1. There will be some blank/null values for ColumnG so I'm not sure what if/then language is available/needed. INSERT INTO dup1b_report ( MyTableID, ColumnA, ColumnB, ColumnC, ColumnD, ColumnE, ColumnF <if Len(ColumnG) GTE 1>, ColumnG</if>) VALUES ( QUERY1.ID Value, QUERY1.ColumnA Value, QUERY1.ColumnB Value, QUERY1.ColumnC Value, ' QUERY1.ColumnD TextValue', QUERY1.ColumnE DateValue, QUERY1.ColumnF Value <if Len(ColumnG) GTE 1>, ColumnG Value</if>)
Query 3) Record that this loops record was identified in the original source table, then go on to the next record returned by Query 1. UPDATE MyTable SET MyTable.dup = 1 WHERE MyTable.ID = Query1.ID Value
Can MySQL do this all by itself? If so how? Can it handle running on large record sets? (10's and 100's of millions of rows) We also have a full copy of SQL Server 2005 standard if that is a better platform for handling this type of issue but we REALLY don't know that software.