remove duplicate lines

I have a file with information in it like this:
word comment
where word is a single word and comment is a single-word comment about that word.

I add some information to the end of the file (always in the same format: word comment), but sometimes I add the same information twice.

Is there a way, after I add the information, to remove all duplicate lines in the file?
Thanks,
paul
paulwhelan Asked:
Mindo Commented:
I don't quite understand your task. Is it this?

Given a text file, remove all the duplicate lines.

If we have a file:

==================
First line.
Second line.
Second line.
Third line.
==================

The resulting file would be:

==================
First line.
Second line.
Third line.
==================

If so, I can write it and get points, although I'm no guru at Perl :-)
paulwhelan Author Commented:
Yes, that's what I want.
Thanks,
paul
Mindo Commented:
#!/usr/local/bin/perl -w

######################################
# Mindaugas Genutis,                 #
# mindg@nomagiclt.com                #
######################################
# usage: perl filter.pl <file>       #
# The program removes duplicate      #
# lines from a given file and leaves #
# old file by adding a suffix .orig  #
# to its end.                        #
######################################

use strict;

my $old = shift or die "usage: $0 <filename>\n";
my $new = "new.txt";  # temporary file, must not already exist

# Three-argument open keeps odd characters in the name from being
# interpreted as part of the open mode.
open(my $in,  '<', $old) or die "can't open $old: $!";
open(my $out, '>', $new) or die "can't open $new: $!";

my %lines;  # lines already written to the new file

while (my $line = <$in>)
{
  # Keep only the first occurrence of each line.
  if (!exists($lines{$line}))
  {
    $lines{$line} = 1;
    print {$out} $line or die "can't write $new: $!";
  }
}

close($in)  or die "can't close $old: $!";
close($out) or die "can't close $new: $!";
rename($old, "$old.orig") or die "can't rename $old to $old.orig: $!";
rename($new, $old) or die "can't rename $new to $old: $!";
Mindo Commented:
Keep in mind that it removes only exactly identical lines from the file. It also assumes that you do not have a file named new.txt in your current directory, because it uses new.txt as a temporary file.

Cheers :-)
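If the fixed name new.txt is a worry, one option is to let the core File::Temp module pick a unique temporary name. The following is only a sketch of that variation, not the script posted above: the sub name dedupe_via_tempfile and the dedupeXXXXXX template are made up for illustration, and unlike the script above it does not keep a .orig backup.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: remove exact duplicate lines from $file,
# writing through a uniquely named temporary file.
sub dedupe_via_tempfile {
    my ($file) = @_;

    open(my $in, '<', $file) or die "can't open $file: $!";

    # Create the temp file in the same directory as the target so the
    # final rename stays on one filesystem.
    my ($out, $tmp) = tempfile('dedupeXXXXXX', DIR => dirname($file));

    my %seen;
    while (my $line = <$in>) {
        next if $seen{$line}++;          # skip lines already written
        print {$out} $line or die "can't write $tmp: $!";
    }

    close($in)  or die "can't close $file: $!";
    close($out) or die "can't close $tmp: $!";
    rename($tmp, $file) or die "can't rename $tmp to $file: $!";
    return;
}
```

Call it as dedupe_via_tempfile('file.txt'). Two caveats: File::Temp creates the file with restrictive permissions, so the replaced file may end up readable only by its owner, and a unique name alone does not protect against someone appending to the file while it is being rewritten.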
paulwhelan Author Commented:
Is there a way to do this without the temporary file?
A lot of people would be accessing this at the same time,
so it might lead to errors.
Mindo Commented:
Here is the version without a temporary file :-)

#!/usr/local/bin/perl -w

######################################
# Mindaugas Genutis,                 #
# mindg@nomagiclt.com                #
######################################
# usage: perl filter.pl <file>       #
# The program removes duplicate      #
# lines from a given file.           #
######################################

use strict;

my $file = shift or die "usage: $0 <filename>\n";

# Open for both reading and writing without truncating the file.
open(my $fh, '+<', $file) or die "can't open $file: $!";

my %lines;     # lines already seen
my $out = '';  # deduplicated contents, built in memory

while (my $line = <$fh>)
{
  # Keep only the first occurrence of each line.
  if (!exists($lines{$line}))
  {
    $lines{$line} = 1;
    $out .= $line;
  }
}

# Rewind, overwrite with the deduplicated contents, and cut off the
# leftover tail of the old contents.
seek($fh, 0, 0) or die "can't seek to start of $file: $!";
print {$fh} $out or die "can't print to $file: $!";
truncate($fh, tell($fh)) or die "can't truncate $file: $!";
close($fh) or die "can't close $file: $!";
ozo Commented:
Use flock to prevent simultaneous access.
paulwhelan Author Commented:
ozo, do you know a script to do this with flock?
Thanks,
paul
paulwhelan Author Commented:
Does this work for you? I can't get it to work.
It doesn't delete the duplicate lines.
Mindo Commented:
Yes, it does delete duplicate lines for me. I think your lines aren't exactly identical. No one can help if you need to remove duplicate lines which aren't duplicates :-)

Given the file:

==================
Second Line.
Third Line.
Second Line.
Fourth Line.
Ninth Line.
Third Line.
First Line.
Second Line.
==================

I run the command line:

$ perl filter.pl file.txt

The resulting file is:

==================
Second Line.
Third Line.
Fourth Line.
Ninth Line.
First Line.
==================

So it works for me. Give me your files. I'll check this out.
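If the lines only look the same, the usual culprits are trailing spaces, tabs, or Windows line endings (\r\n). Here is a small sketch, assuming it is acceptable to treat lines as equal once surrounding whitespace is trimmed and inner runs of whitespace are collapsed. The sub name dedupe_normalized is made up for illustration, and it returns the normalized lines rather than the originals.

```perl
use strict;
use warnings;

# Hypothetical helper: return the given lines with duplicates removed,
# comparing them after whitespace normalization.
sub dedupe_normalized {
    my @lines = @_;
    my (%seen, @kept);
    for my $line (@lines) {
        (my $key = $line) =~ s/^\s+|\s+$//g;  # trim both ends, incl. \r\n
        $key =~ s/\s+/ /g;                    # collapse inner whitespace
        push @kept, "$key\n" unless $seen{$key}++;
    }
    return @kept;
}

# The second line differs from the first only in spacing and line ending.
print dedupe_normalized("word comment\n", "word  comment \r\n", "other note\n");
```

This prints "word comment" and "other note", one per line. The same normalization could be applied to $_ inside the while loop of the scripts in this thread if near-duplicates should count as duplicates.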
Mindo Commented:
#!/usr/local/bin/perl -w

######################################
# Mindaugas Genutis,                 #
# mindg@nomagiclt.com                #
######################################
# usage: perl filter.pl <file>       #
# The program removes duplicate      #
# lines from a given file.           #
######################################

use strict;
use Fcntl ':flock'; # import LOCK_* constants

my $file = shift or die "usage: $0 <filename>\n";

open(my $fh, '+<', $file) or die "can't open $file: $!";
flock($fh, LOCK_EX) or die "can't lock $file: $!"; # Lock the file.

my %lines;     # lines already seen
my $out = '';  # deduplicated contents, built in memory

while (my $line = <$fh>)
{
  # Keep only the first occurrence of each line.
  if (!exists($lines{$line}))
  {
    $lines{$line} = 1;
    $out .= $line;
  }
}

seek($fh, 0, 0) or die "can't seek to start of $file: $!";
print {$fh} $out or die "can't print to $file: $!";
truncate($fh, tell($fh)) or die "can't truncate $file: $!";

# Closing the handle flushes the output and releases the lock, so no
# separate LOCK_UN call is needed (flock on a closed handle would fail).
close($fh) or die "can't close $file: $!";
Mindo Commented:
I've uploaded a version with flock() (above). It works for me, but I don't know what your data looks like, so you should adjust this example to your case. I think that's enough.
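One thing to keep in mind: flock is advisory, so the lock only helps if the code that appends to the file takes the same lock. Below is a sketch of what the appending side might look like, assuming entries are always "word comment" lines as in the question. The sub name append_unique is made up for illustration. It also skips the append when the exact line is already present, which would make a separate cleanup pass unnecessary.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET SEEK_END);

# Hypothetical helper: append "$word $comment" to $file unless that
# exact line is already there. Returns 1 if appended, 0 if duplicate.
sub append_unique {
    my ($file, $word, $comment) = @_;
    my $entry = "$word $comment";

    # '+>>' opens for reading and appending, creating the file if needed.
    open(my $fh, '+>>', $file) or die "can't open $file: $!";
    flock($fh, LOCK_EX)        or die "can't lock $file: $!";

    # Scan the existing contents for the entry while holding the lock.
    seek($fh, 0, SEEK_SET) or die "can't seek in $file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        if ($line eq $entry) {
            close($fh) or die "can't close $file: $!";
            return 0;
        }
    }

    seek($fh, 0, SEEK_END) or die "can't seek in $file: $!";
    print {$fh} "$entry\n" or die "can't write $file: $!";
    close($fh) or die "can't close $file: $!";  # also releases the lock
    return 1;
}
```

For example, append_unique('words.txt', 'apple', 'fruit') adds the line the first time and returns 0 on a repeat call with the same arguments.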
paulwhelan Author Commented:
The last code you posted was, as you said, perfect.
Sorry it took so long to grade.
paul