Solved

Remove items from one list that exist in another list

Posted on 2014-03-27
Last Modified: 2014-03-28
I have two lists of email addresses.

List A contains a comprehensive list of all the email addresses.
List B contains a smaller subset of List A's email addresses.

I need to remove all of the items from List A that appear in List B.

So basically, I need to obtain List C which is all of the items in List A that do NOT appear in List B.

What's a simple way to do this? I only need to do it once, and the lists are small (2000 items each), so I'm open to pretty much anything.

I can do it in pretty much whatever tools you think would be easiest to use - Notepad, Excel, Notepad++, Bash script, Linux commands, PHP script, regular expressions, VB... whatever you like.
Question by:Frosty555
6 Comments
 
LVL 39

Accepted Solution

by:
nutsch earned 200 total points
ID: 39960689
Put List A in column A of a worksheet and List B in column D of the same worksheet.

In cell B1, enter the following formula and copy it down:
=COUNTIF(D:D,A1)>0

This gives TRUE or FALSE for each row, where TRUE means the address also appears in List B.

You can then sort and copy, or use Data \ AutoFilter, to either delete the TRUE rows or copy the FALSE rows to a new List C.

Thomas
 
LVL 13

Assisted Solution

by:Carl Bohman
Carl Bohman earned 100 total points
ID: 39960688
Assuming your big list is called "a" and your small list is called "b", this set of commands should do it:

sort a > a.sorted
sort b > b.sorted
diff a.sorted b.sorted | grep "^<" | sed 's/^..//' > outputfile
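For readers unfamiliar with the pipeline above: diff prefixes lines found only in the first file with "< ", grep keeps just those lines, and sed strips the two-character prefix. A possibly simpler sketch of the same idea uses comm, which compares two sorted files directly. The demo_comm directory and the sample addresses below are hypothetical stand-ins for the real lists.

```shell
#!/bin/bash
# Hypothetical sample data standing in for the real lists "a" and "b".
mkdir -p demo_comm
printf '%s\n' carol@example.com alice@example.com bob@example.com dave@example.com > demo_comm/a
printf '%s\n' dave@example.com bob@example.com > demo_comm/b

# comm expects both inputs sorted the same way; LC_ALL=C pins the collation order.
LC_ALL=C sort demo_comm/a > demo_comm/a.sorted
LC_ALL=C sort demo_comm/b > demo_comm/b.sorted

# -2 hides lines found only in b, -3 hides lines found in both,
# so only the lines unique to a remain.
LC_ALL=C comm -23 demo_comm/a.sorted demo_comm/b.sorted > demo_comm/outputfile

cat demo_comm/outputfile
```

As with the diff version, the output comes back in sorted order rather than in the original order of list a.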

 
LVL 48

Assisted Solution

by:Tintin
Tintin earned 100 total points
ID: 39960731
With a bash script, it's trivial.

#!/bin/bash
grep -vf listb.txt lista.txt >listc.txt

 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 100 total points
ID: 39960790
Easy in Powershell too:
Compare-Object (Get-Content "X:\your\path\listA.txt") (Get-Content "X:\your\path\listB.txt") | Where-Object { $_.SideIndicator -eq '<=' } | Select-Object -ExpandProperty InputObject | Out-File "X:\your\path\listC.txt"


HTH,
Dan
 
LVL 8

Expert Comment

by:itjockey
ID: 39960801
0
 
LVL 31

Author Comment

by:Frosty555
ID: 39962590
Tried out each of your answers and they all worked.

nutsch's Excel answer gives the most visual confirmation that you really did do it right, which was nice for a one-time operation, and it is ultimately the way I ended up doing it. It is roughly N^2 complexity, though, because each COUNTIF scans all of column D, so beyond a few thousand rows you'll quickly run into performance issues. It worked nicely in this case.
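For lists large enough that the quadratic cost matters, one alternative (not one of the answers in this thread) is a hash lookup, which reads each list only once. This is a minimal sketch with awk, assuming one address per line and a non-empty List B; the demo_awk directory and the sample addresses are hypothetical.

```shell
#!/bin/bash
# Hypothetical sample data standing in for the real lists.
mkdir -p demo_awk
printf '%s\n' alice@example.com bob@example.com carol@example.com dave@example.com > demo_awk/lista.txt
printf '%s\n' bob@example.com dave@example.com > demo_awk/listb.txt

# While reading the first file (NR==FNR), store each List B line as an array key.
# While reading the second file, print only lines that are not a stored key.
# Caveat: if listb.txt is empty, NR==FNR also holds for lista.txt and nothing is printed.
awk 'NR==FNR { seen[$0]; next } !($0 in seen)' demo_awk/listb.txt demo_awk/lista.txt > demo_awk/listc.txt

cat demo_awk/listc.txt
```

Unlike the sort-based approaches, this keeps List A's original order, and the lines are compared as literal strings rather than as patterns.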

Tintin's answer was definitely the simplest. However, you have to be careful because each line of listb.txt is now treated as a grep pattern, not a literal string. I would have to escape all the "." characters in listb.txt for it to be completely correct. In this case, though, it appears to work.
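Rather than escaping the dots, grep can be told to treat the patterns literally: -F reads each pattern as a fixed string and -x requires it to match the whole line, which also stops a short address from matching inside a longer one. The sketch below contrasts the two invocations; the demo_grep directory and the sample addresses are hypothetical.

```shell
#!/bin/bash
# Hypothetical sample data: the second address differs from the first only
# at the position where List B's entry has a ".".
mkdir -p demo_grep
printf '%s\n' a.b@example.com aXb@example.com carol@example.com > demo_grep/lista.txt
printf '%s\n' a.b@example.com > demo_grep/listb.txt

# Pattern matching: "." matches any character, so aXb@example.com is removed too.
grep -vf demo_grep/listb.txt demo_grep/lista.txt > demo_grep/listc_regex.txt

# Literal, whole-line matching: only the exact address a.b@example.com is removed.
grep -vxFf demo_grep/listb.txt demo_grep/lista.txt > demo_grep/listc.txt

echo "regex result:"; cat demo_grep/listc_regex.txt
echo "literal result:"; cat demo_grep/listc.txt
```

Note that grep exits with status 1 when it selects no lines, which matters if the script runs under set -e and List C ends up empty.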

The "sort" and the PowerShell solutions appear to work too, but admittedly I don't fully understand how they work, because I don't do much work in PowerShell, and diff and sed are among the few Linux commands I still haven't wrapped my head around.