Solved

Remove Duplicates

Posted on 2015-02-17
15
65 Views
Last Modified: 2015-03-23
Experts, I have a query and there are duplicates.  Is there a simple way to remove the duplicates from within the query itself?
0
Comment
Question by:pdvsa
  • 7
  • 3
  • 3
  • +2
15 Comments
 
LVL 18

Expert Comment

by:SimonAdept
ID: 40614682
use the DISTINCT keyword as in
SELECT DISTINCT col1,col2 FROM table

Open in new window


but usually preferable to work out why there are duplicates. Often the query can be re-written to avoid them without using DISTINCT.

Can you post the query itself?
0
 
LVL 14

Expert Comment

by:ThomasMcA2
ID: 40614768
Simon is right, the reason why there are duplicates is often important. If you are joining multiple tables, duplicates can happen if the join criteria is incomplete. For example, you joined the tables using FieldA and FieldB, but forgot FieldC. The only way to know if that is the problem is to analyze the data and the query.
0
 

Author Comment

by:pdvsa
ID: 40614803
Hi, I am importing an excel file and the duplicates are recurring items that are for example due each month.  

I will send the query sql when I get to my computer
0
 

Author Comment

by:pdvsa
ID: 40614913
SELECT Import_Excel.*
FROM Import_Excel;

I do have an ID field which would violate the true duplicate nature of the row.  I am not certain how to somehow not look at this [ID] field.
0
 
LVL 18

Accepted Solution

by:
SimonAdept earned 500 total points
ID: 40615060
>the duplicates are recurring items that are for example due each month

It sounds as though you could use DISTINCT in this context IF you just want a list of unique items, some of which may be recurring.

If your item field name is [ItemName], rather than Select * you use

SELECT DISTINCT [ItemName] from Import_Excel 

Open in new window

This method only works if you don't include the ID in the list of fields.

The group by method will also avoid duplicates
SELECT [ItemName] from Import_Excel 
GROUP BY [ItemName]

Open in new window

0
 

Author Comment

by:pdvsa
ID: 40615220
Simon,

I need to have more columns than only [ItemName].
I dont believe either of those have any other columns than [ItemName]?
0
 
LVL 30

Expert Comment

by:hnasr
ID: 40615253
Assume imported Excel file:
2  5
2  5
1  1
3  3
3  3
3  3

You need to keep:
2  5
1  1
3  3

Have access table t(id autonumber, a, b)
Append excel file to table t
id      a      b
1      2      5
2      2      5
3      1      1
4      3      3
5      3      3
6      3      3

Query to delete duplicate records:
delete *  
from t
where id > (select min(id) from t x where t.a=x.a and t.b=x.b)

Open in new window


Result:
id      a      b
1      2      5
3      1      1
4      3      3
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 18

Expert Comment

by:SimonAdept
ID: 40615283
Re:
>I dont believe either of those have any other columns than [ItemName]?

Those were just simplest examples. You can have any number of columns in distinct or grouped output, but I didn't know your column names.

The DISTINCT keyword causes just the columns that are included in the select list to be compared for uniqueness.

If you were using the group by method, you'd group by each column you included unless you're using an aggregate function on that column.

e.g.
select ItemName, ItemDescription, Max(ItemDate)
from Import_Excel
Group by ItemName, ItemDescription
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40615394
> SELECT Import_Excel.*
> FROM Import_Excel;

> I do have an ID field which would violate the true duplicate nature of the row.

If you for dupes want one set of values only - excluding the ID - you need to pick one ID only.
Could be something like:

SELECT
    First(ID) As OneID, FieldFirst, FieldSecond, FieldLast
FROM
    Import_Excel
GROUP BY
    FieldFirst, FieldSecond, FieldLast

/gustav
0
 

Author Comment

by:pdvsa
ID: 40615418
hi gustav,

can you explain this a little further:
"If you for dupes want one set of values only - excluding the ID - you need to pick one ID only."
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40616160
> I do have an ID field which would violate the true duplicate nature of the row.

Then - to ignore the dupes - you will have to exclude the ID when using Distinct or Group By.
But if you in the result set wish to have an ID, you can select one with Min, Max, First, or Last([ID]) and use Group By on all other fields.

/gustav
0
 

Author Comment

by:pdvsa
ID: 40616398
Thank you GUSTAV, I will try when have a second.  Removing duplicates seems to be more tricky than I thought. I thought there would be an easier way for people like me.  I have not asked a question like this in some time now and thought that Access would have added a remove duplicates function just like excel has (it's really simple).  Anyways, thank you for the help.  

Also, I see that a few experts have responded with the GROUP BY function and I am not certain if that will work for me because I have dates and i fear some records being grouped that are not duplicates. I guess I will have to add all the fields and not only a few.
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 40616406
> thought that Access would have added a remove duplicates function just like excel has ..

There is a wizard doing so. But from your question it seemed that you wish the dupes not to enter the database during the import. That's way smarter than dealing with it afterwards.

> .. fear some records being grouped that are not duplicates

That will not happen if you set Group By on all fields.

/gustav
0
 

Author Comment

by:pdvsa
ID: 40616450
Thank you GUSTAV.

But from your question it seemed that you wish the dupes not to enter the database during the import.
==>the dupes are in the DB.  Maybe I can try the wizard as you mentioned.
0
 

Author Closing Comment

by:pdvsa
ID: 40682663
Thank you.  It works.
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

It took me quite some time to sort out all the different properties of combo and list boxes available from Visual Basic at run-time. Not that the documentation is lacking: the help pages are quite thorough and well written. The problem was rather wh…
Experts-Exchange is a great place to come for help with solutions for your database issues, and many problems are resolved within minutes of being posted.  Others take a little more time and effort and often providing a sample database is very helpf…
What’s inside an Access Desktop Database. Will look at the basic interface, Navigation Pane (Database Container), Tables, Queries, Forms, Report, Macro’s, and VBA code.
In Microsoft Access, when working with VBA, learn some techniques for writing readable and easily maintained code.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now