• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 74
  • Last Modified:

Remove Duplicates

Experts, I have a query and there are duplicates.  Is there a simple way to remove the duplicates from within the query itself?
0
pdvsa
Asked:
pdvsa
  • 7
  • 3
  • 3
  • +2
1 Solution
 
SimonCommented:
use the DISTINCT keyword as in
SELECT DISTINCT col1,col2 FROM table

Open in new window


but usually preferable to work out why there are duplicates. Often the query can be re-written to avoid them without using DISTINCT.

Can you post the query itself?
0
 
ThomasMcA2Commented:
Simon is right, the reason why there are duplicates is often important. If you are joining multiple tables, duplicates can happen if the join criteria is incomplete. For example, you joined the tables using FieldA and FieldB, but forgot FieldC. The only way to know if that is the problem is to analyze the data and the query.
0
 
pdvsaAuthor Commented:
Hi, I am importing an excel file and the duplicates are recurring items that are for example due each month.  

I will send the query sql when I get to my computer
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
pdvsaAuthor Commented:
SELECT Import_Excel.*
FROM Import_Excel;

I do have an ID field which would violate the true duplicate nature of the row.  I am not certain how to somehow not look at this [ID] field.
0
 
SimonCommented:
>the duplicates are recurring items that are for example due each month

It sounds as though you could use DISTINCT in this context IF you just want a list of unique items, some of which may be recurring.

If your item field name is [ItemName], rather than Select * you use

SELECT DISTINCT [ItemName] from Import_Excel 

Open in new window

This method only works if you don't include the ID in the list of fields.

The group by method will also avoid duplicates
SELECT [ItemName] from Import_Excel 
GROUP BY [ItemName]

Open in new window

0
 
pdvsaAuthor Commented:
Simon,

I need to have more columns than only [ItemName].
I dont believe either of those have any other columns than [ItemName]?
0
 
hnasrCommented:
Assume imported Excel file:
2  5
2  5
1  1
3  3
3  3
3  3

You need to keep:
2  5
1  1
3  3

Have access table t(id autonumber, a, b)
Append excel file to table t
id      a      b
1      2      5
2      2      5
3      1      1
4      3      3
5      3      3
6      3      3

Query to delete duplicate records:
delete *  
from t
where id > (select min(id) from t x where t.a=x.a and t.b=x.b)

Open in new window


Result:
id      a      b
1      2      5
3      1      1
4      3      3
0
 
SimonCommented:
Re:
>I dont believe either of those have any other columns than [ItemName]?

Those were just simplest examples. You can have any number of columns in distinct or grouped output, but I didn't know your column names.

The DISTINCT keyword causes just the columns that are included in the select list to be compared for uniqueness.

If you were using the group by method, you'd group by each column you included unless you're using an aggregate function on that column.

e.g.
select ItemName, ItemDescription, Max(ItemDate)
from Import_Excel
Group by ItemName, ItemDescription
0
 
Gustav BrockCIOCommented:
> SELECT Import_Excel.*
> FROM Import_Excel;

> I do have an ID field which would violate the true duplicate nature of the row.

If you for dupes want one set of values only - excluding the ID - you need to pick one ID only.
Could be something like:

SELECT
    First(ID) As OneID, FieldFirst, FieldSecond, FieldLast
FROM
    Import_Excel
GROUP BY
    FieldFirst, FieldSecond, FieldLast

/gustav
0
 
pdvsaAuthor Commented:
hi gustav,

can you explain this a little further:
"If you for dupes want one set of values only - excluding the ID - you need to pick one ID only."
0
 
Gustav BrockCIOCommented:
> I do have an ID field which would violate the true duplicate nature of the row.

Then - to ignore the dupes - you will have to exclude the ID when using Distinct or Group By.
But if you in the result set wish to have an ID, you can select one with Min, Max, First, or Last([ID]) and use Group By on all other fields.

/gustav
0
 
pdvsaAuthor Commented:
Thank you GUSTAV, I will try when have a second.  Removing duplicates seems to be more tricky than I thought. I thought there would be an easier way for people like me.  I have not asked a question like this in some time now and thought that Access would have added a remove duplicates function just like excel has (it's really simple).  Anyways, thank you for the help.  

Also, I see that a few experts have responded with the GROUP BY function and I am not certain if that will work for me because I have dates and i fear some records being grouped that are not duplicates. I guess I will have to add all the fields and not only a few.
0
 
Gustav BrockCIOCommented:
> thought that Access would have added a remove duplicates function just like excel has ..

There is a wizard doing so. But from your question it seemed that you wish the dupes not to enter the database during the import. That's way smarter than dealing with it afterwards.

> .. fear some records being grouped that are not duplicates

That will not happen if you set Group By on all fields.

/gustav
0
 
pdvsaAuthor Commented:
Thank you GUSTAV.

But from your question it seemed that you wish the dupes not to enter the database during the import.
==>the dupes are in the DB.  Maybe I can try the wizard as you mentioned.
0
 
pdvsaAuthor Commented:
Thank you.  It works.
0

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 7
  • 3
  • 3
  • +2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now