Solved

Remove Duplicates

Posted on 2015-02-17
15
67 Views
Last Modified: 2015-03-23
Experts, I have a query and there are duplicates.  Is there a simple way to remove the duplicates from within the query itself?
0
Comment
Question by:pdvsa
  • 7
  • 3
  • 3
  • +2
15 Comments
 
LVL 18

Expert Comment

by:Simon
ID: 40614682
use the DISTINCT keyword as in
SELECT DISTINCT col1,col2 FROM table

Open in new window


but usually preferable to work out why there are duplicates. Often the query can be re-written to avoid them without using DISTINCT.

Can you post the query itself?
0
 
LVL 14

Expert Comment

by:ThomasMcA2
ID: 40614768
Simon is right, the reason why there are duplicates is often important. If you are joining multiple tables, duplicates can happen if the join criteria is incomplete. For example, you joined the tables using FieldA and FieldB, but forgot FieldC. The only way to know if that is the problem is to analyze the data and the query.
0
 

Author Comment

by:pdvsa
ID: 40614803
Hi, I am importing an excel file and the duplicates are recurring items that are for example due each month.  

I will send the query sql when I get to my computer
0
 

Author Comment

by:pdvsa
ID: 40614913
SELECT Import_Excel.*
FROM Import_Excel;

I do have an ID field which would violate the true duplicate nature of the row.  I am not certain how to somehow not look at this [ID] field.
0
 
LVL 18

Accepted Solution

by:
Simon earned 500 total points
ID: 40615060
>the duplicates are recurring items that are for example due each month

It sounds as though you could use DISTINCT in this context IF you just want a list of unique items, some of which may be recurring.

If your item field name is [ItemName], rather than Select * you use

SELECT DISTINCT [ItemName] from Import_Excel 

Open in new window

This method only works if you don't include the ID in the list of fields.

The group by method will also avoid duplicates
SELECT [ItemName] from Import_Excel 
GROUP BY [ItemName]

Open in new window

0
 

Author Comment

by:pdvsa
ID: 40615220
Simon,

I need to have more columns than only [ItemName].
I dont believe either of those have any other columns than [ItemName]?
0
 
LVL 30

Expert Comment

by:hnasr
ID: 40615253
Assume imported Excel file:
2  5
2  5
1  1
3  3
3  3
3  3

You need to keep:
2  5
1  1
3  3

Have access table t(id autonumber, a, b)
Append excel file to table t
id      a      b
1      2      5
2      2      5
3      1      1
4      3      3
5      3      3
6      3      3

Query to delete duplicate records:
delete *  
from t
where id > (select min(id) from t x where t.a=x.a and t.b=x.b)

Open in new window


Result:
id      a      b
1      2      5
3      1      1
4      3      3
0
Enterprise Mobility and BYOD For Dummies

Like “For Dummies” books, you can read this in whatever order you choose and learn about mobility and BYOD; and how to put a competitive mobile infrastructure in place. Developed for SMBs and large enterprises alike, you will find helpful use cases, planning, and implementation.

 
LVL 18

Expert Comment

by:Simon
ID: 40615283
Re:
>I dont believe either of those have any other columns than [ItemName]?

Those were just simplest examples. You can have any number of columns in distinct or grouped output, but I didn't know your column names.

The DISTINCT keyword causes just the columns that are included in the select list to be compared for uniqueness.

If you were using the group by method, you'd group by each column you included unless you're using an aggregate function on that column.

e.g.
select ItemName, ItemDescription, Max(ItemDate)
from Import_Excel
Group by ItemName, ItemDescription
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40615394
> SELECT Import_Excel.*
> FROM Import_Excel;

> I do have an ID field which would violate the true duplicate nature of the row.

If you for dupes want one set of values only - excluding the ID - you need to pick one ID only.
Could be something like:

SELECT
    First(ID) As OneID, FieldFirst, FieldSecond, FieldLast
FROM
    Import_Excel
GROUP BY
    FieldFirst, FieldSecond, FieldLast

/gustav
0
 

Author Comment

by:pdvsa
ID: 40615418
hi gustav,

can you explain this a little further:
"If you for dupes want one set of values only - excluding the ID - you need to pick one ID only."
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40616160
> I do have an ID field which would violate the true duplicate nature of the row.

Then - to ignore the dupes - you will have to exclude the ID when using Distinct or Group By.
But if you in the result set wish to have an ID, you can select one with Min, Max, First, or Last([ID]) and use Group By on all other fields.

/gustav
0
 

Author Comment

by:pdvsa
ID: 40616398
Thank you GUSTAV, I will try when have a second.  Removing duplicates seems to be more tricky than I thought. I thought there would be an easier way for people like me.  I have not asked a question like this in some time now and thought that Access would have added a remove duplicates function just like excel has (it's really simple).  Anyways, thank you for the help.  

Also, I see that a few experts have responded with the GROUP BY function and I am not certain if that will work for me because I have dates and i fear some records being grouped that are not duplicates. I guess I will have to add all the fields and not only a few.
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40616406
> thought that Access would have added a remove duplicates function just like excel has ..

There is a wizard doing so. But from your question it seemed that you wish the dupes not to enter the database during the import. That's way smarter than dealing with it afterwards.

> .. fear some records being grouped that are not duplicates

That will not happen if you set Group By on all fields.

/gustav
0
 

Author Comment

by:pdvsa
ID: 40616450
Thank you GUSTAV.

But from your question it seemed that you wish the dupes not to enter the database during the import.
==>the dupes are in the DB.  Maybe I can try the wizard as you mentioned.
0
 

Author Closing Comment

by:pdvsa
ID: 40682663
Thank you.  It works.
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Today's users almost expect this to happen in all search boxes. After all, if their favourite search engine juggles with tens of thousand keywords while they type, and suggests matching phrases on the fly, why shouldn't they expect the same from you…
In Debugging – Part 1, you learned the basics of the debugging process. You learned how to avoid bugs, as well as how to utilize the Immediate window in the debugging process. This article takes things to the next level by showing you how you can us…
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…

863 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

27 Experts available now in Live!

Get 1:1 Help Now