Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

Remove duplicate records from this SQL Query

Posted on 2007-12-05
20
Medium Priority
?
318 Views
Last Modified: 2010-03-20
I'm performing a query using the Timberline (Simba) ODBC.  My last join started to pull duplicate records.  The lease, or tenant.name field should be unique.  

Can anyone help me make this query remove duplicate records?

select
        lease_pml_lease.lease,tenant_pmn_tenant.name,lease_pml_lease.property,
        lease_pml_lease.primary_unit,lease_pml_lease.status,
        lease_pml_lease_recurring_charge.last_amount,
        tenant_pmn_tenant_address.contact_phone,tenant_pmn_tenant_address.Contact_fax,
        lease_pml_lease.Original_Start,lease_pml_lease.actual_start,
        lease_pml_lease.Original_Lease_end,lease_pml_lease.Actual_lease_end,
        lease_pml_lease.Actual_Movein_date,lease_pml_lease.actual_moveout_date
from lease_pml_lease
inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant
inner join TENANT_PMN_TENANT_ADDRESS on primary_tenant = TENANT_PMN_TENANT_ADDRESS.tenant
inner join lease_pml_lease_recurring_charge on  lease_pml_lease.lease = lease_pml_lease_recurring_charge.lease
where status='Current' and lease_pml_lease_recurring_charge.charge_type='Rent'

Thanks,
0
Comment
Question by:looknow12
  • 9
  • 9
19 Comments
 
LVL 3

Expert Comment

by:randomjames
ID: 20411741
It looks like there is a flaw in the logic if there are multiple records in lease_pml_lease_recurring_charge.

Is that correct? If so, what's the difference in those records?

I ask because I would think that if all the duplicates have the same values for all fields then you could just SELECT DISTINCT,

but if they don't (and you need to describe which fields of your duplicate records are the same and which ones are different) then you may have to group and summarize.
0
 

Author Comment

by:looknow12
ID: 20411823
The lease field from the primary table should be unique... lease_pml_lease.lease
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20411848
So are your duplicates identical on all other fields?
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 

Author Comment

by:looknow12
ID: 20412058
Not always, but they are definitely duplicates in the lease field.  As long as I get unique records based on this lease field I'm happy.
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412188
I would expect that the field lease_pml_lease_recurring_charge.last_amount needs to be summarized in some way, or you really need to join to the results of a subquery instead of lease_pml_lease_recurring_charge that selects one single last_amount based on the MAX(date) for each lease.

Try this, I think it will give you no duplicates but I doubt it's the actual value for last_amount that you want to use...
select
	lease_pml_lease.lease,
	tenant_pmn_tenant.name,
	lease_pml_lease.property,
	lease_pml_lease.primary_unit,
	lease_pml_lease.status,
	MAX(lease_pml_lease_recurring_charge.last_amount),
	tenant_pmn_tenant_address.contact_phone,
	tenant_pmn_tenant_address.Contact_fax,
	lease_pml_lease.Original_Start,
	lease_pml_lease.actual_start,
	lease_pml_lease.Original_Lease_end,
	lease_pml_lease.Actual_lease_end,
	lease_pml_lease.Actual_Movein_date,
	lease_pml_lease.actual_moveout_date
from 
	lease_pml_lease
	inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant
	inner join TENANT_PMN_TENANT_ADDRESS on primary_tenant = TENANT_PMN_TENANT_ADDRESS.tenant
	inner join lease_pml_lease_recurring_charge on  lease_pml_lease.lease = lease_pml_lease_recurring_charge.lease
where 
	status='Current' 
	and lease_pml_lease_recurring_charge.charge_type='Rent'
group by
	lease_pml_lease.lease,
	tenant_pmn_tenant.name,
	lease_pml_lease.property,
	lease_pml_lease.primary_unit,
	lease_pml_lease.status,
	tenant_pmn_tenant_address.contact_phone,
	tenant_pmn_tenant_address.Contact_fax,
	lease_pml_lease.Original_Start,
	lease_pml_lease.actual_start,
	lease_pml_lease.Original_Lease_end,
	lease_pml_lease.Actual_lease_end,
	lease_pml_lease.Actual_Movein_date,
	lease_pml_lease.actual_moveout_date

Open in new window

0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412268
I guess I mean something like this (best guess not knowing your tables or any Simba SQL quirks);
select
	lease_pml_lease.lease,
	tenant_pmn_tenant.name,
	lease_pml_lease.property,
	lease_pml_lease.primary_unit,
	lease_pml_lease.status,
	Get_last_amount_table.last_amount,
	tenant_pmn_tenant_address.contact_phone,
	tenant_pmn_tenant_address.Contact_fax,
	lease_pml_lease.Original_Start,
	lease_pml_lease.actual_start,
	lease_pml_lease.Original_Lease_end,
	lease_pml_lease.Actual_lease_end,
	lease_pml_lease.Actual_Movein_date,
	lease_pml_lease.actual_moveout_date
from 
	lease_pml_lease
	inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant
	inner join TENANT_PMN_TENANT_ADDRESS on primary_tenant = TENANT_PMN_TENANT_ADDRESS.tenant
	inner join 
		(select
			lease_pml_lease_recurring_charge.lease,
			lease_pml_lease_recurring_charge.last_amount,
			max(lease_pml_lease_recurring_charge.{perhaps the date of the recurring charge?}),
		from
			lease_pml_lease_recurring_charge
		where
			lease_pml_lease_recurring_charge.charge_type='Rent'
		group by
			lease_pml_lease_recurring_charge.lease,
			lease_pml_lease_recurring_charge.last_amount
		) as Get_last_amount_table on lease_pml_lease.lease = Get_last_amount_table.lease
where 
	status='Current' 

Open in new window

0
 

Author Comment

by:looknow12
ID: 20412318
To the first query, it is better, but there are still duplicates based on the lease field.

The second query comes up with a "General Error"
0
 

Author Comment

by:looknow12
ID: 20412360
if I try to restrict the group by to just the lease or maybe the lease and tenant name, I receive this error.

>[Error] Script lines: 1-27 -------------------------
 [TimberlineODBC][TimberlineODBCEngine ODBC Driver]Column invalid. Must be a group by column: property in SELECT LIST. 
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412379
I'm sure the second one would error out if you didn't change out where I put this;

{perhaps the date of the recurring charge?}

I don't know your tables...I can't fill that in. It was really just supposed to be an example.

I have to ask you again, this is important if you are still getting duplicates; what fields are different in the duplicates? Those, obviously, are the cause of the duplicate records...
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412404
No, what you tried would not help. Every field that isn't summarized in some way (with a max, sum, count, min, etc) needs to be in the group by section.

Besides, that first example I gave you would give you this highest "last_amount" which may or may not be what you are looking for.
0
 

Author Comment

by:looknow12
ID: 20412449
Ahh in the address table, the contact_phone.

Any way to place the first duplicate record's contact field in one column and the next duplicate field in another column.  
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412609
Here's a way. Notice this will return just one fax number, so if there are two you need to modify this as well. Also, you would still need to fix the line with {perhaps the date of the recurring charge?};
select
        lease_pml_lease.lease,
        tenant_pmn_tenant.name,
        lease_pml_lease.property,
        lease_pml_lease.primary_unit,
        lease_pml_lease.status,
        Get_last_amount_table.last_amount,
        Get_both_phones.contact_phone1,
        Get_both_phones.contact_phone2,
        Get_both_phones.Contact_fax,
        lease_pml_lease.Original_Start,
        lease_pml_lease.actual_start,
        lease_pml_lease.Original_Lease_end,
        lease_pml_lease.Actual_lease_end,
        lease_pml_lease.Actual_Movein_date,
        lease_pml_lease.actual_moveout_date
from 
        lease_pml_lease
        inner join tenant_pmn_tenant on  primary_tenant = tenant_pmn_tenant.tenant
        inner join 
		(select
			tenant_pmn_tenant_address.tenant,
			max(tenant_pmn_tenant_address.Contact_fax) as contact_fax,
			max(tenant_pmn_tenant_address.contact_phone) as contact_phone1,
			min(tenant_pmn_tenant_address.contact_phone) as contact_phone2
		from
			tenant_pmn_tenant_address
		group by
			tenant_pmn_tenant_address.tenant
		) as Get_both_phones on on primary_tenant = Get_both_phones.tenant
        inner join 
                (select
                        lease_pml_lease_recurring_charge.lease,
                        lease_pml_lease_recurring_charge.last_amount,
                        max(lease_pml_lease_recurring_charge.{perhaps the date of the recurring charge?}),
                from
                        lease_pml_lease_recurring_charge
                where
                        lease_pml_lease_recurring_charge.charge_type='Rent'
                group by
                        lease_pml_lease_recurring_charge.lease,
                        lease_pml_lease_recurring_charge.last_amount
                ) as Get_last_amount_table on lease_pml_lease.lease = Get_last_amount_table.lease
where 
        status='Current' 

Open in new window

0
 

Author Comment

by:looknow12
ID: 20412705
General Error :-( Is there any punctuation missing?
0
 
LVL 3

Expert Comment

by:randomjames
ID: 20412774
Did you replace this;

    {perhaps the date of the recurring charge?}

with the appropriate date field?

I do see a punctuation error...this line needs to have the comma removed as well;

                        max(lease_pml_lease_recurring_charge.{perhaps the date of the recurring charge?}),
0
 

Author Comment

by:looknow12
ID: 20412975
I'm running this.  BTW, I don't care about the date field.  Also, ease_pml_lease_recurring_charge.charge_type can be "Rent"
0
 

Author Comment

by:looknow12
ID: 20413049
ohh, I see you have Rent as the charge type.  I've been looking at this through remote desktop and just now adjusted my resolution so I can see more of the query on the screen.  It was tough to look at the entire query.
0
 

Author Comment

by:looknow12
ID: 20413158
The other field that should be unique (not sure if this matters, but) and tenant.name, lease.property.

But I am good with the lease or the name only.  If those are unique it is fine.
0
 
LVL 3

Accepted Solution

by:
randomjames earned 2000 total points
ID: 20413433
If they are not unique then of course they will result in duplicates so it's important that all fields are unique.

If other fields are not unique then they will need to be handled in a similar way or you can just add grouping to the main query.

This is not optimized SQL and I would expect that it could be streamlined. I was just hoping to convey general principles in my examples.
0
 
LVL 1

Expert Comment

by:Computer101
ID: 20962568
Forced accept.

Computer101
Community Support Moderator
0

Featured Post

Important Lessons on Recovering from Petya

In their most recent webinar, Skyport Systems explores ways to isolate and protect critical databases to keep the core of your company safe from harm.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction Hopefully the following mnemonic and, ultimately, the acronym it represents is common place to all those reading: Please Excuse My Dear Aunt Sally (PEMDAS). Briefly, though, PEMDAS is used to signify the order of operations (http://en.…
Recursive SQL is one of the most fascinating and powerful and yet dangerous feature offered in many modern databases today using a Common Table Expression (CTE) first introduced in the ANSI SQL 99 standard. The first implementations of CTE began ap…
Are you ready to place your question in front of subject-matter experts for more timely responses? With the release of Priority Question, Premium Members, Team Accounts and Qualified Experts can now identify the emergent level of their issue, signal…
Is your data getting by on basic protection measures? In today’s climate of debilitating malware and ransomware—like WannaCry—that may not be enough. You need to establish more than basics, like a recovery plan that protects both data and endpoints.…

783 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question