Link to home
Start Free TrialLog in
Avatar of pda4me
pda4me

asked on

MySQL remove duplicate based on columns

I have a table with multiple duplicate records.  Each has a column called f95 that can have either BEAR, FMFL or NAPLES as the data.

How do I craft a query to compare the following fields:

STREET_NUMBER, STREET_NAME, STREET_TYPE, STREET_DIRECTION, UNIT_NUMBER, CITY, ZIP_CODE

and then remove the records based on if the value is FMFL I always keep and remove the others, if not FMFL then keep NAPLES and remove other duplicates

Please show me the correct SQL query to do this.
Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Hi Pda,

Is there a primary key on the rows?  Perhaps an auto_increment column?


Kent
Avatar of pda4me
pda4me

ASKER

yes, its called bID and is INT 12 auto_increment
I'm not 100% sure how to do the delete, but here is a query that will get you the information you need.

If it were me, I would loop through this query and do individual deletes via PHP or an alternate language. It's possible it could be done via a Stored Procedure. I'm not confident that MySQL will accept a GROUP BY query in a DELETE statement...


The attached query will get you all duplicates in the table and will have a 2 in the f95_test column when FMLA is present in the duplicates or a 1 when NAPLES is, but not FMLA. 0 when neither appears.
STREET_NUMBER, STREET_NAME, STREET_TYPE, STREET_DIRECTION, UNIT_NUMBER, CITY, ZIP_CODE, MAX(CASE WHEN f95 = "FMLA" THEN 2 ELSE CASE WHEN f95 = "NAPLES" then 1 ELSE 0 END END) as f95_test

FROM YOUR_TABLE

GROUP BY STREET_NUMBER, STREET_NAME, STREET_TYPE, STREET_DIRECTION, UNIT_NUMBER, CITY, ZIP_CODE

HAVING count(bID) > 1

Open in new window

Hi pda,

One more question.  About how many rows are in the table?  The simplest query to write/understand isn't very efficient, but if the data set is relatively small, that won't be an issue.


Kent
Avatar of pda4me

ASKER

< 50K records

Ok.  This something like this should work.  So should a variation of what Dirk posted.


Kent

INSERT INTO newtable
SELECT * FROM oldtable
WHERE fID in 
(
  SELECT
    coalesce (f95_fmfl, f95_naples, f95_bear)
  FROM
  (
    SELECT DISTINCT 
      STREET_NUMBER,
      STREET_NAME,
      STREET_TYPE,
      STREET_DIRECTION,
      UNIT_NUMBER,
      CITY,
      ZIP_CODE,
      (
        SELECT id 
        FROM mytable t1 
        WHERE t0.STREET_NUMBER = t1.STREET_NUMBER
          AND t0.STREET_NAME = t1.STREET_NAME
          AND t0.STREET_TYPE = t1.STREET_TYPE
          AND t0.STREET_DIRECTION = t1. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t1.UNIT_NUMBER
          AND t0.CITY = T1.CITY
          AND t0.ZIP_CODE = t1.ZIP_CODE
          AND t1.f95 = 'FMFL'
      ) as f95_fmfl,
      (
        SELECT id 
        FROM mytable t2 
        WHERE t0.STREET_NUMBER = t2.STREET_NUMBER
          AND t0.STREET_NAME = t2.STREET_NAME
          AND t0.STREET_TYPE = t2.STREET_TYPE
          AND t0.STREET_DIRECTION = t2. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t2.UNIT_NUMBER
          AND t0.CITY = T2.CITY
          AND t0.ZIP_CODE = t2.ZIP_CODE
          AND t1.f95 = 'NAPLES'
      ) as f95_naples,
      (
        SELECT id 
        FROM mytable t1 
        WHERE t0.STREET_NUMBER = t3.STREET_NUMBER
          AND t0.STREET_NAME = t3.STREET_NAME
          AND t0.STREET_TYPE = t3.STREET_TYPE
          AND t0.STREET_DIRECTION = t3. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t3.UNIT_NUMBER
          AND t0.CITY = T3.CITY
          AND t0.ZIP_CODE = t3.ZIP_CODE
          AND t1.f95 = 'BEAR'
      ) as f95_bear
  ) t10
)

Open in new window

Avatar of pda4me

ASKER

Weird, I am getting a 1054 - Unknown column 'STREET_NUMBER' in 'field list'

I verfied that this is a valid column in the table?

Oops.  :)

After line 52 add this:

  FROM mytable t0



  :)

Kent
Avatar of pda4me

ASKER

Close, here is the script with database names in it...I am now getting a

1054 - Unknown column 'T1.CITY' in 'where clause'  

Query:
INSERT INTO idx_Bear_FMFL_Naples_Final
SELECT * FROM idx_Bear_FMFL_Naples
WHERE MLS_LISTING_ID in 
(
  SELECT
    coalesce (f95_fmfl, f95_naples, f95_bear)
  FROM
  (
    SELECT DISTINCT 
      STREET_NUMBER,
      STREET_NAME,
      STREET_TYPE,
      STREET_DIRECTION,
      UNIT_NUMBER,
      CITY,
      ZIP_CODE,
      (
        SELECT MLS_LISTING_ID
        FROM  idx_Bear_FMFL_Naples t1 
        WHERE t0.STREET_NUMBER = t1.STREET_NUMBER
          AND t0.STREET_NAME = t1.STREET_NAME
          AND t0.STREET_TYPE = t1.STREET_TYPE
          AND t0.STREET_DIRECTION = t1. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t1.UNIT_NUMBER
          AND t0.CITY = T1.CITY
          AND t0.ZIP_CODE = t1.ZIP_CODE
          AND t1.f95 = 'FMFL'
      ) as f95_fmfl,
      (
        SELECT MLS_LISTING_ID
        FROM idx_Bear_FMFL_Naples t2 
        WHERE t0.STREET_NUMBER = t2.STREET_NUMBER
          AND t0.STREET_NAME = t2.STREET_NAME
          AND t0.STREET_TYPE = t2.STREET_TYPE
          AND t0.STREET_DIRECTION = t2. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t2.UNIT_NUMBER
          AND t0.CITY = T2.CITY
          AND t0.ZIP_CODE = t2.ZIP_CODE
          AND t1.f95 = 'NAPLES'
      ) as f95_naples,
      (
        SELECT MLS_LISTING_ID
        FROM idx_Bear_FMFL_Naples t1 
        WHERE t0.STREET_NUMBER = t3.STREET_NUMBER
          AND t0.STREET_NAME = t3.STREET_NAME
          AND t0.STREET_TYPE = t3.STREET_TYPE
          AND t0.STREET_DIRECTION = t3. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t3.UNIT_NUMBER
          AND t0.CITY = T3.CITY
          AND t0.ZIP_CODE = t3.ZIP_CODE
          AND t1.f95 = 'BEAR'
      ) as f95_bear
FROM idx_Bear_FMFL_Naples t0
  ) t10
)

Open in new window

SOLUTION
Avatar of dirknibleck
dirknibleck

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial

Line 43.  The alias should be T3, not T1.


Avatar of pda4me

ASKER

closer, still getting this error:

1054 - Unknown column 't1.f95' in 'where clause'

Updated Code:

INSERT INTO idx_Bear_FMFL_Naples_Final
SELECT * FROM idx_Bear_FMFL_Naples
WHERE MLS_LISTING_ID in 
(
  SELECT
    coalesce (f95_FMFL, f95_Naples, f95_BEAR)
  FROM
  (
    SELECT DISTINCT 
      STREET_NUMBER,
      STREET_NAME,
      STREET_TYPE,
      STREET_DIRECTION,
      UNIT_NUMBER,
      CITY,
      ZIP_CODE,
      (
        SELECT MLS_LISTING_ID
        FROM  idx_Bear_FMFL_Naples t1 
        WHERE t0.STREET_NUMBER = t1.STREET_NUMBER
          AND t0.STREET_NAME = t1.STREET_NAME
          AND t0.STREET_TYPE = t1.STREET_TYPE
          AND t0.STREET_DIRECTION = t1. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t1.UNIT_NUMBER
          AND t0.CITY = t1.CITY
          AND t0.ZIP_CODE = t1.ZIP_CODE
          AND t1.f95 = 'FMFL'
      ) as f95_FMFL,
      (
        SELECT MLS_LISTING_ID
        FROM idx_Bear_FMFL_Naples t2 
        WHERE t0.STREET_NUMBER = t2.STREET_NUMBER
          AND t0.STREET_NAME = t2.STREET_NAME
          AND t0.STREET_TYPE = t2.STREET_TYPE
          AND t0.STREET_DIRECTION = t2. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t2.UNIT_NUMBER
          AND t0.CITY = t2.CITY
          AND t0.ZIP_CODE = t2.ZIP_CODE
          AND t1.f95 = 'Naples'
      ) as f95_Naples,
      (
        SELECT MLS_LISTING_ID
        FROM idx_Bear_FMFL_Naples t3 
        WHERE t0.STREET_NUMBER = t3.STREET_NUMBER
          AND t0.STREET_NAME = t3.STREET_NAME
          AND t0.STREET_TYPE = t3.STREET_TYPE
          AND t0.STREET_DIRECTION = t3. STREET_DIRECTION
          AND t0.UNIT_NUMBER = t3.UNIT_NUMBER
          AND t0.CITY = t3.CITY
          AND t0.ZIP_CODE = t3.ZIP_CODE
          AND t1.f95 = 'BEAR'
      ) as f95_BEAR
FROM idx_Bear_FMFL_Naples t0
  ) t10
)

Open in new window

ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of pda4me

ASKER

Thanks!
>> The T in T1.CITY is capital (also T2.CITY and T3.CITY) it should be lowercase: t1.CITY, t2.CITY, t3.CITY

SQL is NOT case sensitive.  Upper case, lower case, and mixed case should all work equally well.



Kent