Solved

How do I compare two excel spreadsheets and remove the entries found on both?

Posted on 2014-01-13
13
194 Views
Last Modified: 2014-02-23
I am using Office 2013 with two Excel spreadsheets that have email addresses on them.  I need to compare sheet 1 to sheet 2 then remove the email addresses that are found on the second sheet.
0
Comment
Question by:jer2eydevil88
  • 7
  • 3
  • 3
13 Comments
 
LVL 18

Expert Comment

by:Richard Daneke
ID: 39777927
If I have the question correct, you are trying to add only the new email addresses from the second worksheet onto the first worksheet?

The steps I am recommending will preserve your original worksheets.

1. Copy the first worksheet into a new workbook.
2. Copy the email addresses from the second worksheet and paste them below the addresses on the list copy created in step 1.   This will add all the email addresses from the second worksheet after the first worksheet.
3. Click on any cell in the list of email addresses.
4. Click on Data to use the Data Ribbon.
5. Click on Remove Duplicates in the Data Tools ribbon group.
6. Click OK - this will remove all duplicated lines in the data.  IF you only want to delete lines where the email address is duplicated, uncheck the other column heading boxes leaving only your email address column selected.  Then, click OK to remove the duplicates.

Feel free to ask additional questions.
0
 
LVL 3

Author Comment

by:jer2eydevil88
ID: 39777963
Quick question about #6: Will this remove both examples of the same email or just the first one it comes accross?  My goal is to eradicate the data entirely not leave a copy of it in place.
0
 
LVL 18

Expert Comment

by:Richard Daneke
ID: 39778011
This will remove the  first line it comes across.  

If you want to remove the duplicated email addresses , both lines, you need to do the following.  

1. Highlight the column of email addresses.
2. From the Home ribbon, choose Conditional Formatting in the Styles ribbon group.
3. From this menu, select Highlight Cell Rules, then, Duplicate Values.
   This will highlight all duplicated email addresses.
4. Remove the lines with highlighted email addresses
   (If you sort the table by email address, but sort by color, they will all be grouped together)
0
 
LVL 3

Author Comment

by:jer2eydevil88
ID: 39778025
There are a lot of these, how can I automate the removal?
0
 
LVL 18

Expert Comment

by:Richard Daneke
ID: 39778934
(If you sort the table by email address, but sort by color, they will all be grouped together)


Click on a cell in your list.
From the Data ribbon, choose Sort.
In Sort By, choose your email address column heading.
In Sort On, Choose Cell Color.
Click OK to Sort.

Highlight the cells with duplicates, right-click in the highlight.
Choose Delete... from the menu and choose Entire Row from the Window.

Your duplicates are now removed.

You could record the steps as a macro if you have to do it often.
Or, one could write a macro to handle this if you have to do it often.
That was not the part of your question.
0
 
LVL 31

Expert Comment

by:Rob Henson
ID: 39778993
Alternatively, rather than using Conditional Formatting to highlight use a COUNT in a column alongside the e-mail addresses.

Assuming addreses in column A from row 1 to 10,000, in column B:

=COUNTIF($A$1:$A$10000,$A1)

Adjust the column and row references accordingly and copy down as far as required. the result will be 1 for single entries or >1 for entries with duplicates.

Apply a filter to this column and hide those with value 1. Then select the remaining rows and Delete. Or sort on this column and delete those not required.

Thanks
Rob H
0
Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 3

Author Comment

by:jer2eydevil88
ID: 39786392
In Workbook 1 (my master list) I have a header row that shows the following data:
Last Name      
First Name      
Company      
Office Phone      
Office Fax      
Mobile      
Email      
FWG      
IWG      
QWG      
PWG

While in Workbook 2 (list of emails I need to remove) I only have:
Email

Can you provide instructions that will sort all the columns in Workbook 1 so the data matches up and I can remove the entire line including the duplicate email.
0
 
LVL 31

Expert Comment

by:Rob Henson
ID: 39786478
Still not really clear what you are trying to achieve.

You have a sheet with 11 columns, 1 of which is email. Are you trying to identify which of those are also in your second list?
0
 
LVL 3

Author Comment

by:jer2eydevil88
ID: 39786481
Yes and remove the lines from the first that have those emails.
0
 
LVL 31

Expert Comment

by:Rob Henson
ID: 39787957
In your more extensive list copy the e-mail list onto a separate sheet (assuming Sheet2 and columnA) and add a column of formulas checking the existence of the email:

=MATCH(G2,Sheet2!$A:$A,0)

Assumed email in extensive list in column G from list of headers above.

For those that exist in both, this will give the row number from the secondary list or an error for those that don't. You can then filter on this column and delete rows.

If you also want to remove from the secondary list, in column B of the secondary sheet:

=MATCH(A2,Sheet1!$G:$G,0)

Again will return row number or error.

Thanks
Rob H
0
 
LVL 3

Author Comment

by:jer2eydevil88
ID: 39800933
I'll give this a try and report back.
0
 
LVL 3

Accepted Solution

by:
jer2eydevil88 earned 0 total points
ID: 39869585
I tried all this and kept ending up with different problems.  I ended up having a software developer look at the issue and he wrote me an application in Python that can remove the duplicates for me.  

Thanks for trying to help!
0
 
LVL 3

Author Closing Comment

by:jer2eydevil88
ID: 39880367
I guess Excel can't do this.
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Recently Microsoft released a brand new function called CONCAT. It's supposed to replace its predecessor CONCATENATE. But how does it work? And what's new? In this article, we take a closer look at all of this - we even included an exercise file for…
Using Word 2013, I was experiencing some incredible lag when typing.  Here's what worked for me....
The viewer will learn how to create a normally distributed random variable in Excel, use a normal distribution to simulate the return on an investment over a period of years, Create a Monte Carlo simulation using a normal random variable, and calcul…
This Micro Tutorial will demonstrate the scrolling table in Microsoft Excel using the INDEX function.

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now