Solved

UNION but avoid duplicated data

Posted on 2010-09-10
Last Modified: 2012-05-10
Hi,

I have two tables : patients and patients_guest.

I would like to select the patients whose lastname starts with X (a parameter) from both tables. I think that's a simple UNION with: WHERE P.lastname LIKE "' . $x . '%"
The problem is that a patient may be in both tables. I need to make sure that only one row is returned per lastname-firstname-birthDate combination (note that two different patients may share the same lastname and firstname, which is why they must also have the same date of birth to be considered duplicates).

The fields to return are :
id
lastname
firstname
date_of_birth
is_guest (this must be generated: true if the record is from patients_guest, false otherwise)

Thank you for any help
Question by:matthew016
9 Comments
 
LVL 58

Expert Comment

by:cyberkiwi
ID: 33650835
Just change UNION to UNION ALL
It will remove the duplicates
 
LVL 3

Assisted Solution

by:packratt_jk
packratt_jk earned 100 total points
ID: 33650850
select distinct id, lastname, firstname, date_of_birth from (
select distinct id, lastname, firstname, date_of_birth from table1
union
select distinct id, lastname, firstname, date_of_birth from table2
) t
 
LVL 58

Accepted Solution

by:
cyberkiwi earned 400 total points
ID: 33650853
This will give you no duplicates, but when there are duplicates, it will RANDOMLY/ARBITRARILY show one of the two IDs and one of the is_guest values.

select lastname,firstname,date_of_birth, id, is_guest
from
(
select id,lastname,firstname,date_of_birth,'false' is_guest
from patients
union
select id,lastname,firstname,date_of_birth,'true' is_guest
from patients_guest
) combined
group by lastname,firstname,date_of_birth
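
A minimal sketch of why the GROUP BY is needed on top of the UNION, using Python's built-in sqlite3 module as a stand-in for the real database. The sample rows are hypothetical, and a COUNT(*) column replaces id and is_guest so the output does not depend on which duplicate the engine happens to pick:

```python
import sqlite3

# Hypothetical sample data; an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (id INTEGER, lastname TEXT, firstname TEXT, date_of_birth TEXT);
    CREATE TABLE patients_guest (id INTEGER, lastname TEXT, firstname TEXT, date_of_birth TEXT);
    INSERT INTO patients VALUES (1, 'Smith', 'Anna', '1980-01-01'), (2, 'Stone', 'Bob', '1975-05-05');
    INSERT INTO patients_guest VALUES (7, 'Smith', 'Anna', '1980-01-01');
""")

combined = """
    SELECT id, lastname, firstname, date_of_birth, 'false' AS is_guest FROM patients
    UNION
    SELECT id, lastname, firstname, date_of_birth, 'true' AS is_guest FROM patients_guest
"""

# UNION alone keeps both Anna Smith rows, because id and is_guest differ between them.
union_rows = conn.execute(combined).fetchall()

# Grouping on the three identifying columns collapses them to one row per person.
grouped_rows = conn.execute(
    "SELECT lastname, firstname, date_of_birth, COUNT(*) AS copies "
    f"FROM ({combined}) AS combined "
    "GROUP BY lastname, firstname, date_of_birth "
    "ORDER BY lastname, firstname"
).fetchall()

print(len(union_rows))
print(grouped_rows)
```

The UNION returns three rows here, while the grouped query returns two, with a copies value of 2 for the patient present in both tables.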
 
LVL 9

Author Comment

by:matthew016
ID: 33650854
How will UNION ALL know that duplicated data means rows with the same lastname, firstname and date_of_birth (the id may be different)?
 
LVL 9

Author Comment

by:matthew016
ID: 33650862
@cyberkiwi

Thanks, is it possible to have the record from table "patients" when there is a duplicate, and not from "patients_guest" ?

Maybe an AND NOT EXISTS or something similar should be added to the second part of the union?
 
LVL 58

Expert Comment

by:cyberkiwi
ID: 33650864
Because the UNION is a "virtual" dataset, this rather complicated query is needed to return just one row for each set of duplicates.
Even this is not perfect: if firstname, lastname, date_of_birth AND id are all duplicated in both tables, the query will return both rows, since it cannot distinguish between the two.

select B.*
from
(
select min(id) min_id, lastname,firstname,date_of_birth
from
(
select id,lastname,firstname,date_of_birth,'false' is_guest
from patients
union
select id,lastname,firstname,date_of_birth,'true' is_guest
from patients_guest
) combined
group by lastname,firstname,date_of_birth
) A
inner join
(
select id,lastname,firstname,date_of_birth,'false' is_guest
from patients
union
select id,lastname,firstname,date_of_birth,'true' is_guest
from patients_guest
) B on A.lastname=B.lastname and A.date_of_birth=B.date_of_birth and A.firstname=B.firstname and A.min_id=B.id

 
LVL 58

Expert Comment

by:cyberkiwi
ID: 33650879
> Thanks, is it possible to have the record from table "patients" when there is a duplicate, and not from "patients_guest" ?
Brilliant - that makes it easier if there is a preference.

FYI - I just woke up - UNION is the one that removes duplicates and UNION ALL does not.
-- all patients
-- patients_guest only when not already in patients by (lastname,firstname,date_of_birth)
select id,lastname,firstname,date_of_birth,'false' is_guest
from patients
union all
select A.id,A.lastname,A.firstname,A.date_of_birth,'true' is_guest
from patients_guest A
left join patients B on A.lastname=B.lastname and A.date_of_birth=B.date_of_birth and A.firstname=B.firstname
where B.id is null
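
The same idea can be written with the AND NOT EXISTS suggested in the question, combined with the lastname prefix filter. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database, `find_patients` and the sample rows are hypothetical, and is_guest is returned as 0/1 rather than as a string. The prefix is passed as a bound parameter instead of being concatenated into the SQL text.

```python
import sqlite3

def find_patients(conn, prefix):
    """Return rows whose lastname starts with prefix, preferring patients over patients_guest."""
    sql = """
        SELECT id, lastname, firstname, date_of_birth, 0 AS is_guest
        FROM patients
        WHERE lastname LIKE ?
        UNION ALL
        SELECT g.id, g.lastname, g.firstname, g.date_of_birth, 1 AS is_guest
        FROM patients_guest AS g
        WHERE g.lastname LIKE ?
          AND NOT EXISTS (
              SELECT 1 FROM patients AS p
              WHERE p.lastname = g.lastname
                AND p.firstname = g.firstname
                AND p.date_of_birth = g.date_of_birth
          )
        ORDER BY lastname, firstname, date_of_birth
    """
    # The pattern is built here and bound as a parameter, never spliced into the SQL.
    # Note: '%' or '_' inside prefix would still act as LIKE wildcards.
    pattern = prefix + "%"
    return conn.execute(sql, (pattern, pattern)).fetchall()

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (id INTEGER, lastname TEXT, firstname TEXT, date_of_birth TEXT);
    CREATE TABLE patients_guest (id INTEGER, lastname TEXT, firstname TEXT, date_of_birth TEXT);
    INSERT INTO patients VALUES
        (1, 'Smith', 'Anna', '1980-01-01'),
        (2, 'Stone', 'Bob',  '1975-05-05'),
        (3, 'Jones', 'Carl', '1990-09-09');
    INSERT INTO patients_guest VALUES
        (7, 'Smith', 'Anna', '1980-01-01'),
        (8, 'Smith', 'Anna', '1992-03-03'),
        (9, 'Stone', 'Dana', '1988-08-08');
""")

rows = find_patients(conn, "S")
for row in rows:
    print(row)
```

Guest 7 is dropped because the same lastname, firstname and date_of_birth exist in patients, while guest 8 is kept because only the name matches. Rows with a NULL date_of_birth never compare equal, so they would not be treated as duplicates by this comparison.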

 
LVL 9

Author Comment

by:matthew016
ID: 33651086
Thanks,

So what should I use in my case, UNION or UNION ALL ?

I'll post a second question about a problem with the data returned, this is enough for now :)
 
LVL 58

Expert Comment

by:cyberkiwi
ID: 33653137
Normally, if you want duplicate data (defined as being equal on every single column) removed, use UNION.
However, it has a cost, since it performs a DISTINCT filter over the combined result.

In the case of the last query presented above, we already know (by design) that the two parts cannot overlap, so UNION ALL is slightly faster because it skips the additional duplicate-filtering step.
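
A small sketch of the difference, again using Python's built-in sqlite3 module as a stand-in; the tables `a` and `b` and their rows are hypothetical:

```python
import sqlite3

# Two tiny hypothetical tables that share one identical row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (lastname TEXT, firstname TEXT);
    CREATE TABLE b (lastname TEXT, firstname TEXT);
    INSERT INTO a VALUES ('Smith', 'Anna'), ('Stone', 'Bob');
    INSERT INTO b VALUES ('Smith', 'Anna'), ('Young', 'Eve');
""")

# UNION removes rows that are equal on every selected column.
union_rows = conn.execute(
    "SELECT lastname, firstname FROM a UNION SELECT lastname, firstname FROM b"
).fetchall()

# UNION ALL simply concatenates both result sets, duplicates included.
union_all_rows = conn.execute(
    "SELECT lastname, firstname FROM a UNION ALL SELECT lastname, firstname FROM b"
).fetchall()

print(len(union_rows), len(union_all_rows))
```

With this data, UNION yields three rows and UNION ALL yields four, since the shared Anna Smith row appears once in the first result and twice in the second.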