asked on

Finding duplicates sql only displaying one

I have this query that finds duplicate records. What I need it to do is display ONE of the records. Example, I have two records with PATNUM of 12345 and SURGEON of SMITH, I want it to only display ONE time in my results.

SELECT * FROM patientinfo1 a WHERE exists(select 1 from patientinfo1 where ID=a.ID group by PATNUM,SURGEON having count(*)>1)

hielo

Get rid of the "* and use the sub query only. Let's say that you also have a field named `nurse` then try.

select PATNUM,SURGEON, Max(`nurse`) from patientinfo1 group by PATNUM,SURGEON having count(*)>1

Open in new window

If you have other 'non-group' fields (similar to `nurse` above) then just add another Max() clause for each of them.

BHUC

ASKER

hielo, I need the * because there are like 20 other fields I need to pull from each row to display. Would I have to use MAX on each one?

hielo

So, if you have:
ID PATNUM SURGEON NURSE
1 12345 SMITH Joyce
2 12345 SMITH Jane
3 12345 SMITH Joyce
4 12345 SMITH Sally
5 12345 SMITH Jane

What's your expected output?

BHUC

ASKER

Only ONE of the rows to show up. Doesn't matter which one.

BHUC

ASKER

More clarification:
ID PATNUM SURGEON NURSE
1 12345 TEST Jane
2 12345 TEST Jill
3 54321 Smith Sam
4 54321 Johnson Sally
5 12345 Johnson Jane

I would expect results of:
either row 1 or 2 to appear (Same PATNUM and SURGEON)
Row 3, 4 (Same PATNUM, different SURGEON)
and 5 (SAME PATNUM as 1 and 2, Different Surgeon)

hielo

OK, then use the query I posted earlier. You will need to use Max() for each of the other fields that do not form part of the GROUP BY clause.
select PATNUM,SURGEON, Max(`NURSE`),MAX(ID) from patientinfo1 group by PATNUM,SURGEON having count(*)>1

On another note, regarding:
>> ...Doesn't matter which one.
Then why even include those other fields? If it doesn't matter which one, then it seems to me that the only data that is useful/reliable are PATNUM and SURGEON. So why not just use:
select PATNUM,SURGEON from patientinfo1 group by PATNUM,SURGEON having count(*)>1

Or if you need the other fields, you could even use GROUP_CONCAT()
select PATNUM, SURGEON, GROUP_CONCAT(CAST(NURSE AS CHAR)) as NURSE from patientinfo1 group by PATNUM, SURGEON having count(*)>1

but you still need to list the other fields if you choose to use GROUP_CONCAT()

BHUC

ASKER

The other fields are like schedule_time, Surgery, Task, PA, etc. If it is the same patient number and same surgeon, then a few fields may be different, but it is the same surgeon so it doesn't matter which one comes up. Surgery time, arrival time will all be the same. Only thing that will be different is what surgery is being done.

if it is same patient, different surgeon, then surgery time for each surgeon will be different, along with other notes, consulting with family for each surgeon, etc.

BHUC

ASKER

Hielo,

Here is data in my database:
ID PATNUM AGE SEX SCHDT SCHTM SURGEON TASK SERVTYPE
1 12345 55 F 80315 13:45 SMITH Left Wrist OP
2 12345 55 F 80315 14:15 SMITH Left Foot OP
3 54321 15 M 80315 15:00 SMITH Rt Shoulder OP
4 54321 15 M 80315 16:00 JOHNS Tonsils OP

I used this sql
select PATNUM,SURGEON, Max(`TASK`), Max('AGE'), MAX('SERVTYPE'),MAX('SCHTM') from patientinfo1 group by PATNUM,SURGEON having count(*)>1

I get this result

PATNUM SURGEON Max('TASK') Max ('AGE') MAX('SERVTYPE') MAX('SCHTM')
12345 SMITH Left Wrist AGE SERVTYPE SCHTM

I need this result

PATNUM AGE SURGEON TASK SERVTYPE SCHTM
12345 55 SMITH Left Wrist OP 13:45
54321 15 SMITH Rt Shoulder OP 15:00
54321 15 JOHNS Tonsils OP 16:00

I wanted to test on a few fields before I typed it all in.

ASKER CERTIFIED SOLUTION

hielo

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

BHUC

ASKER

OK.. that got me one result - showing one of the duplicated ones but none of the others.

Result is:

PATNUM AGE SURGEON TASK SERVTYPE SCHTM
12345 55 SMITH Left Wrist OP 13:45

Missing the two with PATNUM of 54321

BHUC

ASKER

Never mind Hielo - you answered my question.. I can use group by for the other. Thank you.

hielo

>> OK.. that got me one result - showing one of the duplicated ones but none of the others.
OK, then omit the HAVING clause

select `PATNUM`,`SURGEON`, Max(`TASK`) as `TASK`, Max(`AGE`) as `AGE`, MAX(`SERVTYPE`) as `SERVTYPE`, MAX(`SCHTM`) as `SCHTM` from `patientinfo1` group by `PATNUM`, `SURGEON`

Open in new window