Solved

Remove duplicate records from this SQL Query

Posted on 2007-12-05
20
309 Views
Last Modified: 2010-03-20
I'm performing a query using the Timberline (Simba) ODBC.  My last join started to pull duplicate records.  The lease, or tenant.name field should be unique.  

Can anyone help me make this query remove duplicate records?

select
        lease_pml_lease.lease,tenant_pmn_tenant.name,lease_pml_lease.property,
        lease_pml_lease.primary_unit,lease_pml_lease.status,
        lease_pml_lease_recurring_charge.last_amount,
        tenant_pmn_tenant_address.contact_phone,tenant_pmn_tenant_address.Contact_fax,
        lease_pml_lease.Original_Start,lease_pml_lease.actual_start,
        lease_pml_lease.Original_Lease_end,lease_pml_lease.Actual_lease_end,
        lease_pml_lease.Actual_Movein_date,lease_pml_lease.actual_moveout_date
from lease_pml_lease
inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant
inner join TENANT_PMN_TENANT_ADDRESS on primary_tenant = TENANT_PMN_TENANT_ADDRESS.tenant
inner join lease_pml_lease_recurring_charge on  lease_pml_lease.lease = lease_pml_lease_recurring_charge.lease
where status='Current' and lease_pml_lease_recurring_charge.charge_type='Rent'

Thanks,
0
Comment
Question by:looknow12
  • 9
  • 9
20 Comments
 
LVL 3

Expert Comment

by:randomjames
ID: 20411741
It looks like there is a flaw in the logic if there are multiple records in lease_pml_lease_recurring_charge.

Is that correct? If so, what's the difference in those records?

I ask because I would think that if all the duplicates have the same values for all fields then you could just SELECT DISTINCT,

but if they don't (and you need to describe which fields of your duplicate records are the same and which ones are different) then you may have to group and summarize.
0
 

Author Comment

by:looknow12
ID: 20411823
The lease field from the primary table should be unique... lease_pml_lease.lease
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20411848
So are your duplicates identical on all other fields?
0
 

Author Comment

by:looknow12
ID: 20412058
Not always, but they are definitely duplicates in the lease field.  As long as I get unique records based on this lease field I'm happy.
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412188
I would expect that the field lease_pml_lease_recurring_charge.last_amount needs to be summarized in some way, or you really need to join to the results of a subquery instead of lease_pml_lease_recurring_charge that selects one single last_amount based on the MAX(date) for each lease.

Try this, I think it will give you no duplicates but I doubt it's the actual value for last_amount that you want to use...
select

	lease_pml_lease.lease,

	tenant_pmn_tenant.name,

	lease_pml_lease.property,

	lease_pml_lease.primary_unit,

	lease_pml_lease.status,

	MAX(lease_pml_lease_recurring_charge.last_amount),

	tenant_pmn_tenant_address.contact_phone,

	tenant_pmn_tenant_address.Contact_fax,

	lease_pml_lease.Original_Start,

	lease_pml_lease.actual_start,

	lease_pml_lease.Original_Lease_end,

	lease_pml_lease.Actual_lease_end,

	lease_pml_lease.Actual_Movein_date,

	lease_pml_lease.actual_moveout_date

from 

	lease_pml_lease

	inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant

	inner join TENANT_PMN_TENANT_ADDRESS on primary_tenant = TENANT_PMN_TENANT_ADDRESS.tenant

	inner join lease_pml_lease_recurring_charge on  lease_pml_lease.lease = lease_pml_lease_recurring_charge.lease

where 

	status='Current' 

	and lease_pml_lease_recurring_charge.charge_type='Rent'

group by

	lease_pml_lease.lease,

	tenant_pmn_tenant.name,

	lease_pml_lease.property,

	lease_pml_lease.primary_unit,

	lease_pml_lease.status,

	tenant_pmn_tenant_address.contact_phone,

	tenant_pmn_tenant_address.Contact_fax,

	lease_pml_lease.Original_Start,

	lease_pml_lease.actual_start,

	lease_pml_lease.Original_Lease_end,

	lease_pml_lease.Actual_lease_end,

	lease_pml_lease.Actual_Movein_date,

	lease_pml_lease.actual_moveout_date

Open in new window

0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412268
I guess I mean something like this (best guess not knowing your tables or any Simba SQL quirks);
select

	lease_pml_lease.lease,

	tenant_pmn_tenant.name,

	lease_pml_lease.property,

	lease_pml_lease.primary_unit,

	lease_pml_lease.status,

	Get_last_amount_table.last_amount,

	tenant_pmn_tenant_address.contact_phone,

	tenant_pmn_tenant_address.Contact_fax,

	lease_pml_lease.Original_Start,

	lease_pml_lease.actual_start,

	lease_pml_lease.Original_Lease_end,

	lease_pml_lease.Actual_lease_end,

	lease_pml_lease.Actual_Movein_date,

	lease_pml_lease.actual_moveout_date

from 

	lease_pml_lease

	inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant

	inner join TENANT_PMN_TENANT_ADDRESS on primary_tenant = TENANT_PMN_TENANT_ADDRESS.tenant

	inner join 

		(select

			lease_pml_lease_recurring_charge.lease,

			lease_pml_lease_recurring_charge.last_amount,

			max(lease_pml_lease_recurring_charge.{perhaps the date of the recurring charge?}),

		from

			lease_pml_lease_recurring_charge

		where

			lease_pml_lease_recurring_charge.charge_type='Rent'

		group by

			lease_pml_lease_recurring_charge.lease,

			lease_pml_lease_recurring_charge.last_amount

		) as Get_last_amount_table on lease_pml_lease.lease = Get_last_amount_table.lease

where 

	status='Current' 

Open in new window

0
 

Author Comment

by:looknow12
ID: 20412318
To the first query, it is better, but there are still duplicates based on the lease field.

The second query comes up with a "General Error"
0
 

Author Comment

by:looknow12
ID: 20412360
if I try to restrict the group by to just the lease or maybe the lease and tenant name, I receive this error.

>[Error] Script lines: 1-27 -------------------------
 [TimberlineODBC][TimberlineODBCEngine ODBC Driver]Column invalid. Must be a group by column: property in SELECT LIST. 
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412379
I'm sure the second one would error out if you didn't change out where I put this;

{perhaps the date of the recurring charge?}

I don't know your tables...I can't fill that in. It was really just supposed to be an example.

I have to ask you again, this is important if you are still getting duplicates; what fields are different in the duplicates? Those, obviously, are the cause of the duplicate records...
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 
LVL 3

Expert Comment

by:randomjames
ID: 20412404
No, what you tried would not help. Every field that isn't summarized in some way (with a max, sum, count, min, etc) needs to be in the group by section.

Besides, that first example I gave you would give you this highest "last_amount" which may or may not be what you are looking for.
0
 

Author Comment

by:looknow12
ID: 20412449
Ahh in the address table, the contact_phone.

Any way to place the first duplicate record's contact field in one column and the next duplicate field in another column.  
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412609
Here's a way. Notice this will return just one fax number, so if there are two you need to modify this as well. Also, you would still need to fix the line with {perhaps the date of the recurring charge?};
select

        lease_pml_lease.lease,

        tenant_pmn_tenant.name,

        lease_pml_lease.property,

        lease_pml_lease.primary_unit,

        lease_pml_lease.status,

        Get_last_amount_table.last_amount,

        Get_both_phones.contact_phone1,

        Get_both_phones.contact_phone2,

        Get_both_phones.Contact_fax,

        lease_pml_lease.Original_Start,

        lease_pml_lease.actual_start,

        lease_pml_lease.Original_Lease_end,

        lease_pml_lease.Actual_lease_end,

        lease_pml_lease.Actual_Movein_date,

        lease_pml_lease.actual_moveout_date

from 

        lease_pml_lease

        inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant

        inner join 

		(select

			tenant_pmn_tenant_address.tenant,

			max(tenant_pmn_tenant_address.Contact_fax) as contact_fax,

			max(tenant_pmn_tenant_address.contact_phone) as contact_phone1,

			min(tenant_pmn_tenant_address.contact_phone) as contact_phone2

		from

			tenant_pmn_tenant_address

		group by

			tenant_pmn_tenant_address.tenant

		) as Get_both_phones on on primary_tenant = Get_both_phones.tenant

        inner join 

                (select

                        lease_pml_lease_recurring_charge.lease,

                        lease_pml_lease_recurring_charge.last_amount,

                        max(lease_pml_lease_recurring_charge.{perhaps the date of the recurring charge?}),

                from

                        lease_pml_lease_recurring_charge

                where

                        lease_pml_lease_recurring_charge.charge_type='Rent'

                group by

                        lease_pml_lease_recurring_charge.lease,

                        lease_pml_lease_recurring_charge.last_amount

                ) as Get_last_amount_table on lease_pml_lease.lease = Get_last_amount_table.lease

where 

        status='Current' 

Open in new window

0
 

Author Comment

by:looknow12
ID: 20412705
General Error :-( Is there any punctuation missing?
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412774
Did you replace this;

    {perhaps the date of the recurring charge?}

with the appropriate date field?

I do see a punctuation error...this line needs to have the comma removed as well;

                        max(lease_pml_lease_recurring_charge.{perhaps the date of the recurring charge?}),
0
 

Author Comment

by:looknow12
ID: 20412975
I'm running this.  BTW, I don't care about the date field.  Also, ease_pml_lease_recurring_charge.charge_type can be "Rent"
0
 

Author Comment

by:looknow12
ID: 20413049
ohh, I see you have Rent as the charge type.  I've been looking at this through remote desktop and just now adjusted my resolution so I can see more of the query on the screen.  It was tough to look at the entire query.
0
 

Author Comment

by:looknow12
ID: 20413158
The other field that should be unique (not sure if this matters, but) and tenant.name, lease.property.

But I am good with the lease or the name only.  If those are unique it is fine.
0
 
LVL 3

Accepted Solution

by:
randomjames earned 500 total points
ID: 20413433
If they are not unique then of course they will result in duplicates so it's important that all fields are unique.

If other fields are not unique then they will need to be handled in a similar way or you can just add grouping to the main query.

This is not optimized SQL and I would expect that it could be streamlined. I was just hoping to convey general principles in my examples.
0
 
LVL 1

Expert Comment

by:Computer101
ID: 20962568
Forced accept.

Computer101
Community Support Moderator
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

If you find yourself in this situation “I have used SELECT DISTINCT but I’m getting duplicates” then I'm sorry to say you are using the wrong SQL technique as it only does one thing which is: produces whole rows that are unique. If the results you a…
Confronted with some SQL you don't know can be a daunting task. It can be even more daunting if that SQL carries some of the old secret codes used in the Ye Olde query syntax, such as: (+)     as used in Oracle;     *=     =*    as used in Sybase …
Illustrator's Shape Builder tool will let you combine shapes visually and interactively. This video shows the Mac version, but the tool works the same way in Windows. To follow along with this video, you can draw your own shapes or download the file…
In this tutorial you'll learn about bandwidth monitoring with flows and packet sniffing with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're interested in additional methods for monitoring bandwidt…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now