BHUC
asked on
Finding duplicates sql only displaying one
I have this query that finds duplicate records. What I need it to do is display ONE of the records. Example, I have two records with PATNUM of 12345 and SURGEON of SMITH, I want it to only display ONE time in my results.
SELECT * FROM patientinfo1 a WHERE exists(select 1 from patientinfo1 where ID=a.ID group by PATNUM,SURGEON having count(*)>1)
SELECT * FROM patientinfo1 a WHERE exists(select 1 from patientinfo1 where ID=a.ID group by PATNUM,SURGEON having count(*)>1)
ASKER
hielo, I need the * because there are like 20 other fields I need to pull from each row to display. Would I have to use MAX on each one?
So, if you have:
ID PATNUM SURGEON NURSE
1 12345 SMITH Joyce
2 12345 SMITH Jane
3 12345 SMITH Joyce
4 12345 SMITH Sally
5 12345 SMITH Jane
What's your expected output?
ID PATNUM SURGEON NURSE
1 12345 SMITH Joyce
2 12345 SMITH Jane
3 12345 SMITH Joyce
4 12345 SMITH Sally
5 12345 SMITH Jane
What's your expected output?
ASKER
Only ONE of the rows to show up. Doesn't matter which one.
ASKER
More clarification:
ID PATNUM SURGEON NURSE
1 12345 TEST Jane
2 12345 TEST Jill
3 54321 Smith Sam
4 54321 Johnson Sally
5 12345 Johnson Jane
I would expect results of:
either row 1 or 2 to appear (Same PATNUM and SURGEON)
Row 3, 4 (Same PATNUM, different SURGEON)
and 5 (SAME PATNUM as 1 and 2, Different Surgeon)
ID PATNUM SURGEON NURSE
1 12345 TEST Jane
2 12345 TEST Jill
3 54321 Smith Sam
4 54321 Johnson Sally
5 12345 Johnson Jane
I would expect results of:
either row 1 or 2 to appear (Same PATNUM and SURGEON)
Row 3, 4 (Same PATNUM, different SURGEON)
and 5 (SAME PATNUM as 1 and 2, Different Surgeon)
OK, then use the query I posted earlier. You will need to use Max() for each of the other fields that do not form part of the GROUP BY clause.
select PATNUM,SURGEON, Max(`NURSE`),MAX(ID) from patientinfo1 group by PATNUM,SURGEON having count(*)>1
On another note, regarding:
>> ...Doesn't matter which one.
Then why even include those other fields? If it doesn't matter which one, then it seems to me that the only data that is useful/reliable are PATNUM and SURGEON. So why not just use:
select PATNUM,SURGEON from patientinfo1 group by PATNUM,SURGEON having count(*)>1
Or if you need the other fields, you could even use GROUP_CONCAT()
select PATNUM, SURGEON, GROUP_CONCAT(CAST(NURSE AS CHAR)) as NURSE from patientinfo1 group by PATNUM, SURGEON having count(*)>1
but you still need to list the other fields if you choose to use GROUP_CONCAT()
select PATNUM,SURGEON, Max(`NURSE`),MAX(ID) from patientinfo1 group by PATNUM,SURGEON having count(*)>1
On another note, regarding:
>> ...Doesn't matter which one.
Then why even include those other fields? If it doesn't matter which one, then it seems to me that the only data that is useful/reliable are PATNUM and SURGEON. So why not just use:
select PATNUM,SURGEON from patientinfo1 group by PATNUM,SURGEON having count(*)>1
Or if you need the other fields, you could even use GROUP_CONCAT()
select PATNUM, SURGEON, GROUP_CONCAT(CAST(NURSE AS CHAR)) as NURSE from patientinfo1 group by PATNUM, SURGEON having count(*)>1
but you still need to list the other fields if you choose to use GROUP_CONCAT()
ASKER
The other fields are like schedule_time, Surgery, Task, PA, etc. If it is the same patient number and same surgeon, then a few fields may be different, but it is the same surgeon so it doesn't matter which one comes up. Surgery time, arrival time will all be the same. Only thing that will be different is what surgery is being done.
if it is same patient, different surgeon, then surgery time for each surgeon will be different, along with other notes, consulting with family for each surgeon, etc.
if it is same patient, different surgeon, then surgery time for each surgeon will be different, along with other notes, consulting with family for each surgeon, etc.
ASKER
Hielo,
Here is data in my database:
ID PATNUM AGE SEX SCHDT SCHTM SURGEON TASK SERVTYPE
1 12345 55 F 80315 13:45 SMITH Left Wrist OP
2 12345 55 F 80315 14:15 SMITH Left Foot OP
3 54321 15 M 80315 15:00 SMITH Rt Shoulder OP
4 54321 15 M 80315 16:00 JOHNS Tonsils OP
I used this sql
select PATNUM,SURGEON, Max(`TASK`), Max('AGE'), MAX('SERVTYPE'),MAX('SCHTM ') from patientinfo1 group by PATNUM,SURGEON having count(*)>1
I get this result
PATNUM SURGEON Max('TASK') Max ('AGE') MAX('SERVTYPE') MAX('SCHTM')
12345 SMITH Left Wrist AGE SERVTYPE SCHTM
I need this result
PATNUM AGE SURGEON TASK SERVTYPE SCHTM
12345 55 SMITH Left Wrist OP 13:45
54321 15 SMITH Rt Shoulder OP 15:00
54321 15 JOHNS Tonsils OP 16:00
I wanted to test on a few fields before I typed it all in.
Here is data in my database:
ID PATNUM AGE SEX SCHDT SCHTM SURGEON TASK SERVTYPE
1 12345 55 F 80315 13:45 SMITH Left Wrist OP
2 12345 55 F 80315 14:15 SMITH Left Foot OP
3 54321 15 M 80315 15:00 SMITH Rt Shoulder OP
4 54321 15 M 80315 16:00 JOHNS Tonsils OP
I used this sql
select PATNUM,SURGEON, Max(`TASK`), Max('AGE'), MAX('SERVTYPE'),MAX('SCHTM
I get this result
PATNUM SURGEON Max('TASK') Max ('AGE') MAX('SERVTYPE') MAX('SCHTM')
12345 SMITH Left Wrist AGE SERVTYPE SCHTM
I need this result
PATNUM AGE SURGEON TASK SERVTYPE SCHTM
12345 55 SMITH Left Wrist OP 13:45
54321 15 SMITH Rt Shoulder OP 15:00
54321 15 JOHNS Tonsils OP 16:00
I wanted to test on a few fields before I typed it all in.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
OK.. that got me one result - showing one of the duplicated ones but none of the others.
Result is:
PATNUM AGE SURGEON TASK SERVTYPE SCHTM
12345 55 SMITH Left Wrist OP 13:45
Missing the two with PATNUM of 54321
Result is:
PATNUM AGE SURGEON TASK SERVTYPE SCHTM
12345 55 SMITH Left Wrist OP 13:45
Missing the two with PATNUM of 54321
ASKER
Never mind Hielo - you answered my question.. I can use group by for the other. Thank you.
>> OK.. that got me one result - showing one of the duplicated ones but none of the others.
OK, then omit the HAVING clause
OK, then omit the HAVING clause
select `PATNUM`,`SURGEON`, Max(`TASK`) as `TASK`, Max(`AGE`) as `AGE`, MAX(`SERVTYPE`) as `SERVTYPE`, MAX(`SCHTM`) as `SCHTM` from `patientinfo1` group by `PATNUM`, `SURGEON`
Open in new window
If you have other 'non-group' fields (similar to `nurse` above) then just add another Max() clause for each of them.