Solved

narrowing down a file

Posted on 2011-03-07
Last Modified: 2012-05-11
I have two files.

File1.txt is tab-separated and contains several fields.
File2.txt contains just one field.

I want to get the subset of File1.txt whose field 3 exactly matches one of the lines in File2.txt.

The match has to be complete, not partial (so if one says foxnews.com/blah.html and the other says foxnews.com, or vice versa, that would not be a match).  Otherwise, I could have just done grep -F -f File2.txt File1.txt.

How would I do that in a shell script?
Question by:aturetsky
 
LVL 16

Accepted Solution

by:
sjklein42 earned 300 total points
ID: 35064864
Should work.  Save as "joinIt.pl".

Without any test data, hard to test.  If it doesn't work, please post some test data.

# usage:   perl joinIt.pl infile.txt keyfile.txt

$infile = shift(@ARGV);
$keyfile = shift(@ARGV);

if ( ! open(INFILE, "<$infile") ) { die "*** can't open $infile: $!\n"; }
if ( ! open(KEYFILE, "<$keyfile") ) { die "*** can't open $keyfile: $!\n"; }

while ( <KEYFILE> )
{
	s/[\r\n]//g;
	$key{$_} = 1;
}

while ( <INFILE> )
{
	s/[\r\n]//g;
	@x = split(/\t/);
	if ( $key{$x[2]} ) { print "$_\n"; }
}
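For comparison, the same hash-lookup idea can be sketched as a shell function around awk. This is only a sketch under assumptions: the name filter_by_key is made up for illustration, the argument order mirrors joinIt.pl (infile first, keyfile second), and a trailing carriage return is stripped the way the Perl script does.

```shell
# Hypothetical helper (not from the thread): print the rows of a tab-separated
# file whose third field is exactly equal to some line of the key file.
# usage: filter_by_key infile.txt keyfile.txt
filter_by_key() {
    awk -F'\t' '
        { sub(/\r$/, "") }                      # tolerate CRLF line endings
        FILENAME == ARGV[1] { keys[$0]; next }  # key file: remember every line
        $3 in keys                              # input file: print rows whose field 3 is a key
    ' "$2" "$1"
}
```

Because the lookup uses whole keys rather than substrings, foxnews.com/blah.html in field 3 does not match a key of foxnews.com.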

 
LVL 8

Expert Comment

by:point_pleasant
ID: 35069520
Here is a shell script that should work too. cut splits on tabs by default, so no -d option is needed.


cut -f3 file1 | while IFS= read -r i
do
        # -x: whole-line match only, -F: treat the field as a literal string
        grep -x -F -- "$i" file2
done
 
LVL 1

Author Comment

by:aturetsky
ID: 35072123
thanks, sjklein42 - it worked!

can I ask you - if I wanted to do the exact opposite and get only what's not in the keyfile - what would that look like?

 
LVL 1

Author Comment

by:aturetsky
ID: 35072688
actually, I think this might do the job (for that reverse task):


# usage:   perl joinIt.pl infile.txt keyfile.txt

$infile = shift(@ARGV);
$keyfile = shift(@ARGV);

if ( ! open(INFILE, "<$infile") ) { die "*** can't open $infile: $!\n"; }
if ( ! open(KEYFILE, "<$keyfile") ) { die "*** can't open $keyfile: $!\n"; }

while ( <KEYFILE> )
{
       s/[\r\n]//g;
       $key{$_} = 1;
}

while ( <INFILE> )
{
       s/[\r\n]//g;
       @x = split(/\t/);
       if ( not exists($key{$x[2]}) ) { print "$_\n"; }
}
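A shell-level sketch of the same reverse filter, again using awk. The function name filter_not_in_key is hypothetical, and the argument order (infile first, keyfile second) is an assumption chosen to mirror the Perl usage line.

```shell
# Hypothetical helper (not from the thread): print the rows of a tab-separated
# file whose third field does NOT appear as a line of the key file.
# usage: filter_not_in_key infile.txt keyfile.txt
filter_not_in_key() {
    awk -F'\t' '
        { sub(/\r$/, "") }                      # tolerate CRLF line endings
        FILENAME == ARGV[1] { keys[$0]; next }  # key file: remember every line
        !($3 in keys)                           # input file: print rows whose field 3 is not a key
    ' "$2" "$1"
}
```

The key file is recognised by name rather than with the common NR == FNR test, so an empty key file still results in every input row being printed.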
 
LVL 8

Expert Comment

by:point_pleasant
ID: 35073184
The shell script to do the reverse would be as follows. The echo statement is there to separate the output for each column 3 element from file1.

cut -f3 file1 | while IFS= read -r i
do
        echo "================== $i from file1 ======================"
        # -v: print the lines of file2 that are not an exact match
        grep -x -F -v -- "$i" file2
done
 
LVL 8

Assisted Solution

by:point_pleasant
point_pleasant earned 200 total points
ID: 35073548
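Sorry, didn't realize you wanted the whole line from file1. Here is a shell script to do it. If you want the reverse, negate the test (if ! grep ...) so the line is printed only when grep finds no match.


while IFS= read -r i
do
        # take field 3 of the tab-separated line
        col3=`printf '%s\n' "$i" | awk -F'\t' '{ print $3 }'`
        # -q: no output, -x: whole-line match, -F: literal string
        if grep -q -x -F -- "$col3" file2
        then
                printf '%s\n' "$i"
        fi
done < file1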