cantthinkofone asked:

Perl: read a file and output only distinct lines (eliminate all duplicates)

I have a file which has content something like the following:

01 A line of text
02 Another line of text
03 Yet another one
01 A line of text
04 More text
02 Another line of text
01 A line of text


How do I read this file and output, to another file, only the "unique" rows, i.e.:

01 A line of text
02 Another line of text
03 Yet another one
04 More text

Is this easy to do in a Perl script?

As always, any help much appreciated!
ASKER CERTIFIED SOLUTION
ozo
This solution is only available to members.
funtaff

I'm sorry, but that was just beautiful.
Based on what ozo said, you want something like this.  Make sure to give ozo the credit.


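#!/usr/bin/perl
use strict;
use warnings;

my $in_file  = "input.txt";
my $out_file = "output.txt";

# Three-argument open with lexical filehandles; $! says why an open failed.
open(my $in,  '<', $in_file)  or die "Cannot open input file '$in_file': $!";
open(my $out, '>', $out_file) or die "Cannot open output file '$out_file': $!";

my %seen;    # counts how many times each line has been read so far
while (my $line = <$in>)
{
    # Post-increment returns the old count, so only the first occurrence prints.
    print {$out} $line unless $seen{$line}++;
}

close($in);
close($out) or die "Cannot close output file '$out_file': $!";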
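
In case it helps to see why the test inside the loop works, here is a minimal sketch of the same "seen hash" idea applied to a list in memory instead of a file. The name unique_lines is just a made-up helper for illustration, not something from ozo's answer, and the sample data is the file from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# unique_lines is a hypothetical helper name used only for this sketch.
# It returns the first occurrence of each line, in the original order.
sub unique_lines {
    my @lines = @_;
    my %seen;
    # $seen{$_}++ returns the count *before* incrementing: 0 (false) the
    # first time a line is met, 1 or more (true) on every repeat.
    return grep { !$seen{$_}++ } @lines;
}

# Sample data taken from the question.
my @input = (
    "01 A line of text\n",
    "02 Another line of text\n",
    "03 Yet another one\n",
    "01 A line of text\n",
    "04 More text\n",
    "02 Another line of text\n",
    "01 A line of text\n",
);

print unique_lines(@input);
```

One thing to watch for: lines are compared exactly, newline included, so if the last line of the file has no trailing newline it will not match an otherwise identical earlier line. If that matters, chomp each line before the lookup and add the newline back when printing.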


cantthinkofone (ASKER)

Brilliant - thanks!  I'm always surprised when something I have thought to be tricky (for me) turns out to be so few lines of code :)  

Thanks for the extra bit too, funtaff!