Solved

narrowing down a file

Posted on 2011-03-07
Last Modified: 2012-05-11
I have two files.

File1.txt is tab-separated and contains several fields.
File2.txt contains just one field.

I want to get a subset of File1.txt such that field 3 of each line exactly matches one of the lines in File2.txt.

The match has to be complete, not partial (so if one says foxnews.com/blah.html and the other says foxnews.com, or vice versa, that would not be a match).  Otherwise, I could have just done grep -F -f File2.txt File1.txt.

How would I do that in a shell script?
Question by:aturetsky
6 Comments
 
LVL 16

Accepted Solution

by:
sjklein42 earned 300 total points
ID: 35064864
Should work.  Save as "joinIt.pl".

Without any test data, hard to test.  If it doesn't work, please post some test data.

# usage:   perl joinIt.pl infile.txt keyfile.txt

$infile = shift(@ARGV);
$keyfile = shift(@ARGV);

if ( ! open(INFILE, "<$infile") ) { die "*** can't open $infile: $!\n"; }
if ( ! open(KEYFILE, "<$keyfile") ) { die "*** can't open $keyfile: $!\n"; }

while ( <KEYFILE> )
{
	s/[\r\n]//g;
	$key{$_} = 1;
}

while ( <INFILE> )
{
	s/[\r\n]//g;
	@x = split(/\t/);
	if ( $key{$x[2]} ) { print "$_\n"; }
}
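
For readers who would rather stay in the shell, the same hash-lookup idea can be sketched with awk. This is only a sketch under stated assumptions: the function name keep_matching is made up for illustration, and the key file is assumed to hold one key per line with no trailing carriage returns (the Perl script above strips them; this sketch does not).

```shell
# keep_matching INFILE KEYFILE  (hypothetical helper name)
# Prints the lines of INFILE whose tab-separated field 3 is exactly
# equal to some line of KEYFILE.
keep_matching() {
    # While reading the first file (KEYFILE), NR == FNR holds, so each
    # line is remembered as a key. For the second file (INFILE), a line
    # is printed only when its field 3 is one of the stored keys.
    awk -F'\t' 'NR == FNR { keys[$0]; next } $3 in keys' "$2" "$1"
}
```

It would be called as keep_matching File1.txt File2.txt. Because the keys are held in an awk array, each line of the input file is checked with a single lookup rather than a scan of the key file.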

 
LVL 8

Expert Comment

by:point_pleasant
ID: 35069520
Here is a shell script that should work too. Tab is the default delimiter for cut, so no -d option is needed.


for i in `cut -f3 file1`
do
        # -F: fixed string, not a regex; -x: the whole line must match
        grep -Fx -- "$i" file2
done
 
LVL 1

Author Comment

by:aturetsky
ID: 35072123
thanks, sjklein42 - it worked!

can I ask you - if I wanted to do the exact opposite and get only what's not in the keyfile - what would that look like?
 
LVL 1

Author Comment

by:aturetsky
ID: 35072688
actually, I think this might do the job (for that reverse task):


# usage:   perl joinIt.pl infile.txt keyfile.txt

$infile = shift(@ARGV);
$keyfile = shift(@ARGV);

if ( ! open(INFILE, "<$infile") ) { die "*** can't open $infile: $!\n"; }
if ( ! open(KEYFILE, "<$keyfile") ) { die "*** can't open $keyfile: $!\n"; }

while ( <KEYFILE> )
{
       s/[\r\n]//g;
       $key{$_} = 1;
}

while ( <INFILE> )
{
       s/[\r\n]//g;
       @x = split(/\t/);
       if ( not exists($key{$x[2]}) ) { print "$_\n"; }
}
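
The inverted filter can also be sketched in awk. The function name drop_matching is hypothetical, and the sketch assumes the key file path contains no backslashes (awk interprets escapes in -v values). The keys are loaded in a BEGIN block with getline so that an empty key file still lets every input line through.

```shell
# drop_matching INFILE KEYFILE  (hypothetical helper name)
# Prints the lines of INFILE whose tab-separated field 3 is NOT a line
# of KEYFILE.
drop_matching() {
    awk -F'\t' -v keyfile="$2" '
        # Load every key before any input line is read.
        BEGIN { while ((getline line < keyfile) > 0) keys[line] }
        # Keep the line only when field 3 is not among the keys.
        !($3 in keys)
    ' "$1"
}
```

It would be called as drop_matching File1.txt File2.txt.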
 
LVL 8

Expert Comment

by:point_pleasant
ID: 35073184
The shell script to do the reverse would be as follows.  The echo statement is there to separate each column 3 element from file1.

for i in `cut -f3 file1`
do
        echo "================== $i from file1 ======================"
        # -v prints the lines of file2 that are not exactly equal to $i
        grep -Fx -v -- "$i" file2
done
 
LVL 8

Assisted Solution

by:point_pleasant
point_pleasant earned 200 total points
ID: 35073548
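Sorry, didn't realize you wanted the whole line from file1.  Here is a shell script to do it.  If you want the reverse, negate the test (if ! grep ...); adding -v to grep would not work, since grep -v succeeds whenever file2 contains any line other than the one searched for.


while IFS= read -r line
do
        # Field 3 of the tab-separated line
        col3=`printf '%s\n' "$line" | awk -F'\t' '{ print $3 }'`
        # -F: fixed string; -x: whole-line match; -q: exit status only
        if grep -Fxq -- "$col3" file2
        then
                printf '%s\n' "$line"
        fi
done < file1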