?
Solved

Remove Duplicates

Posted on 2015-02-17
15
Medium Priority
?
72 Views
Last Modified: 2015-03-23
Experts, I have a query and there are duplicates.  Is there a simple way to remove the duplicates from within the query itself?
0
Comment
Question by:pdvsa
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 3
  • 3
  • +2
15 Comments
 
LVL 18

Expert Comment

by:Simon
ID: 40614682
use the DISTINCT keyword as in
SELECT DISTINCT col1,col2 FROM table

Open in new window


but usually preferable to work out why there are duplicates. Often the query can be re-written to avoid them without using DISTINCT.

Can you post the query itself?
0
 
LVL 14

Expert Comment

by:ThomasMcA2
ID: 40614768
Simon is right, the reason why there are duplicates is often important. If you are joining multiple tables, duplicates can happen if the join criteria is incomplete. For example, you joined the tables using FieldA and FieldB, but forgot FieldC. The only way to know if that is the problem is to analyze the data and the query.
0
 

Author Comment

by:pdvsa
ID: 40614803
Hi, I am importing an excel file and the duplicates are recurring items that are for example due each month.  

I will send the query sql when I get to my computer
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 

Author Comment

by:pdvsa
ID: 40614913
SELECT Import_Excel.*
FROM Import_Excel;

I do have an ID field which would violate the true duplicate nature of the row.  I am not certain how to somehow not look at this [ID] field.
0
 
LVL 18

Accepted Solution

by:
Simon earned 2000 total points
ID: 40615060
>the duplicates are recurring items that are for example due each month

It sounds as though you could use DISTINCT in this context IF you just want a list of unique items, some of which may be recurring.

If your item field name is [ItemName], rather than Select * you use

SELECT DISTINCT [ItemName] from Import_Excel 

Open in new window

This method only works if you don't include the ID in the list of fields.

The group by method will also avoid duplicates
SELECT [ItemName] from Import_Excel 
GROUP BY [ItemName]

Open in new window

0
 

Author Comment

by:pdvsa
ID: 40615220
Simon,

I need to have more columns than only [ItemName].
I dont believe either of those have any other columns than [ItemName]?
0
 
LVL 30

Expert Comment

by:hnasr
ID: 40615253
Assume imported Excel file:
2  5
2  5
1  1
3  3
3  3
3  3

You need to keep:
2  5
1  1
3  3

Have access table t(id autonumber, a, b)
Append excel file to table t
id      a      b
1      2      5
2      2      5
3      1      1
4      3      3
5      3      3
6      3      3

Query to delete duplicate records:
delete *  
from t
where id > (select min(id) from t x where t.a=x.a and t.b=x.b)

Open in new window


Result:
id      a      b
1      2      5
3      1      1
4      3      3
0
 
LVL 18

Expert Comment

by:Simon
ID: 40615283
Re:
>I dont believe either of those have any other columns than [ItemName]?

Those were just simplest examples. You can have any number of columns in distinct or grouped output, but I didn't know your column names.

The DISTINCT keyword causes just the columns that are included in the select list to be compared for uniqueness.

If you were using the group by method, you'd group by each column you included unless you're using an aggregate function on that column.

e.g.
select ItemName, ItemDescription, Max(ItemDate)
from Import_Excel
Group by ItemName, ItemDescription
0
 
LVL 51

Expert Comment

by:Gustav Brock
ID: 40615394
> SELECT Import_Excel.*
> FROM Import_Excel;

> I do have an ID field which would violate the true duplicate nature of the row.

If you for dupes want one set of values only - excluding the ID - you need to pick one ID only.
Could be something like:

SELECT
    First(ID) As OneID, FieldFirst, FieldSecond, FieldLast
FROM
    Import_Excel
GROUP BY
    FieldFirst, FieldSecond, FieldLast

/gustav
0
 

Author Comment

by:pdvsa
ID: 40615418
hi gustav,

can you explain this a little further:
"If you for dupes want one set of values only - excluding the ID - you need to pick one ID only."
0
 
LVL 51

Expert Comment

by:Gustav Brock
ID: 40616160
> I do have an ID field which would violate the true duplicate nature of the row.

Then - to ignore the dupes - you will have to exclude the ID when using Distinct or Group By.
But if you in the result set wish to have an ID, you can select one with Min, Max, First, or Last([ID]) and use Group By on all other fields.

/gustav
0
 

Author Comment

by:pdvsa
ID: 40616398
Thank you GUSTAV, I will try when have a second.  Removing duplicates seems to be more tricky than I thought. I thought there would be an easier way for people like me.  I have not asked a question like this in some time now and thought that Access would have added a remove duplicates function just like excel has (it's really simple).  Anyways, thank you for the help.  

Also, I see that a few experts have responded with the GROUP BY function and I am not certain if that will work for me because I have dates and i fear some records being grouped that are not duplicates. I guess I will have to add all the fields and not only a few.
0
 
LVL 51

Expert Comment

by:Gustav Brock
ID: 40616406
> thought that Access would have added a remove duplicates function just like excel has ..

There is a wizard doing so. But from your question it seemed that you wish the dupes not to enter the database during the import. That's way smarter than dealing with it afterwards.

> .. fear some records being grouped that are not duplicates

That will not happen if you set Group By on all fields.

/gustav
0
 

Author Comment

by:pdvsa
ID: 40616450
Thank you GUSTAV.

But from your question it seemed that you wish the dupes not to enter the database during the import.
==>the dupes are in the DB.  Maybe I can try the wizard as you mentioned.
0
 

Author Closing Comment

by:pdvsa
ID: 40682663
Thank you.  It works.
0

Featured Post

On Demand Webinar: Networking for the Cloud Era

Did you know SD-WANs can improve network connectivity? Check out this webinar to learn how an SD-WAN simplified, one-click tool can help you migrate and manage data in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Phishing attempts can come in all forms, shapes and sizes. No matter how familiar you think you are with them, always remember to take extra precaution when opening an email with attachments or links.
The Windows Phone Theme Colours is a tight, powerful, and well balanced palette. This tiny Access application makes it a snap to select and pick a value. And it doubles as an intro to implementing WithEvents, one of Access' hidden gems.
Using Microsoft Access, learn some simple rules for how to construct tables in a relational database. Split up all multi-value fields into single values: Split up fields that belong to other things into separate tables: Make sure that all record…
Visualize your data even better in Access queries. Given a date and a value, this lesson shows how to compare that value with the previous value, calculate the difference, and display a circle if the value is the same, an up triangle if it increased…
Suggested Courses

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question