Solved

Remove items from one list that exist in another list

Posted on 2014-03-27
Last Modified: 2014-03-28
I have two lists of email addresses.

List A contains a comprehensive list of all the email addresses
List B contains a smaller subset of ListA's email addresses

I need to remove all of the items from List A that appear in List B.

So basically, I need to obtain List C which is all of the items in List A that do NOT appear in List B.

What's a simple way to do this? I only need to do it once, and the lists are small (2000 items each), so I'm open to pretty much anything.

I can do it in pretty much whatever tools you think would be easiest to use - Notepad, Excel, Notepad++, Bash script, Linux commands, PHP script, regular expressions, VB... whatever you like.
Question by:Frosty555
6 Comments
 
LVL 39

Accepted Solution

by:
nutsch earned 200 total points
ID: 39960689
put list A in column A of a worksheet, list B in column D of the same worksheet

in cell B1, put the following formula and copy it down
=countif(D:D,A1)>0

this will give you a true / false for matches in list B

you can sort and copy, or use Data > AutoFilter to either delete the TRUEs or copy the FALSEs to a new list C.

Thomas
 
LVL 13

Assisted Solution

by:Carl Bohman
Carl Bohman earned 100 total points
ID: 39960688
Assuming your big list is called "a" and your small list is called "b", this set of commands should do it:

sort a > a.sorted
sort b > b.sorted
diff a.sorted b.sorted | grep "^<" | sed 's/^..//' > outputfile

 
LVL 48

Assisted Solution

by:Tintin
Tintin earned 100 total points
ID: 39960731
With a bash script, it's trivial.

#!/bin/bash
grep -vf listb.txt lista.txt >listc.txt

 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 100 total points
ID: 39960790
Easy in PowerShell too:
Compare-Object (Get-Content "X:\your\path\listA.txt") (Get-Content "X:\your\path\listB.txt") | Where-Object { $_.SideIndicator -eq '<=' } | Select-Object -ExpandProperty InputObject | Out-File "X:\your\path\listC.txt"

HTH,
Dan
 
LVL 8

Expert Comment

by:itjockey
ID: 39960801
 
LVL 31

Author Comment

by:Frosty555
ID: 39962590
Tried out each of your answers and they all worked.

nutsch's Excel answer gives you the most visual cues that you really did do it right, which was nice for a one-time operation, and it is ultimately the way I ended up doing it. It is N^2 complexity, though, so beyond a few thousand rows you'll quickly run into performance issues. Worked nicely in this case, though.
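
For anyone with much bigger lists, a hash lookup avoids comparing every row against every other row. This is only a rough sketch, assuming both lists are plain text files with one address per line and Unix line endings; the function name list_minus is made up for illustration.

```shell
#!/bin/bash
# list_minus A B: print the lines of file A that do not appear in file B.
# (Hypothetical helper name; matching is exact and case-sensitive.)
list_minus() {
    # BEGIN block: read every line of B into the "seen" array as a key.
    # Main rule: for each line of A, print it only if it is not a key in "seen".
    # The order of A is preserved, and an empty B simply leaves A unchanged.
    awk -v bfile="$2" '
        BEGIN { while ((getline line < bfile) > 0) seen[line] = 1 }
        !($0 in seen)
    ' "$1"
}
```

Usage would be something like: list_minus lista.txt listb.txt > listc.txt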

Tintin's answer was definitely the simplest. However, you have to be careful because listb.txt is now a collection of grep patterns, not literal strings. I would have to escape all the "." characters in listb.txt for it to be completely correct. In this case, though, it appears to work.
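
A variant that should sidestep the escaping problem is to tell grep to treat each line of the second file as a fixed string and to match whole lines only. A minimal sketch, assuming one address per line; the function name remove_literal is made up for illustration.

```shell
#!/bin/bash
# remove_literal A B: print lines of A that are not exactly equal to any line of B.
# (Hypothetical helper name.)
remove_literal() {
    # -F  treat each line of B as a fixed string, so "." is a literal dot
    # -x  require the whole line to match, so a short address in B does not
    #     also remove longer addresses that merely contain it
    # -v  invert the match: keep the lines that did NOT match
    # -f  read the patterns from file B
    grep -vxFf "$2" "$1"
}
```

Usage would be something like: remove_literal lista.txt listb.txt > listc.txt. Note that grep exits with status 1 when it prints nothing, which matters if the script runs under set -e.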

The "sort" and the PowerShell solutions appear to work too, but admittedly I don't fully understand how they work, because I don't do much work in PowerShell, and the diff and sed commands are some of the few Linux commands I still haven't wrapped my head around.
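
As far as I can tell, the sort/diff pipeline works like this: diff marks lines that exist only in the first file with a leading "< ", grep "^<" keeps just those lines, and sed 's/^..//' strips the two-character marker. The comm command gets the same result more directly from sorted input. A rough sketch, assuming one address per line; the function name only_in_first is made up for illustration, and the output comes back sorted and de-duplicated rather than in the original order.

```shell
#!/bin/bash
# only_in_first A B: print the lines that occur in A but not in B.
# (Hypothetical helper name; output is sorted with duplicates removed.)
only_in_first() {
    local tmpdir
    tmpdir=$(mktemp -d) || return 1
    # comm requires both inputs to be sorted in the same collation order.
    sort -u "$1" > "$tmpdir/a.sorted"
    sort -u "$2" > "$tmpdir/b.sorted"
    # comm prints three columns: only in A, only in B, in both.
    # -23 suppresses columns 2 and 3, leaving just the lines unique to A.
    comm -23 "$tmpdir/a.sorted" "$tmpdir/b.sorted"
    rm -rf "$tmpdir"
}
```

Usage would be something like: only_in_first a b > outputfile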