avoiding repeats

I have a file that looks like this:

Name:Bill;Location:Miami;Age:27;
Name:Claudette; Location:Detroit;Age:50;
Name:Dave;Location:Florence;Age:25;
Name:Thomas;Location:Miami;Age:27;
Name:Bill;Location:Chicago;Age:47;

And I would like to skip lines that are repeated. Repeated lines are, for example:
Name:Bill;Location:Miami;Age:27;
Name:Bill;Location:Chicago;Age:47;

that is, lines that have the same Name; the rest doesn't matter.
I have a routine that reads the file line by line and splits each line twice, first on semicolons and then on colons. It then builds an array of hashes. How can I avoid repeating the same name with the following code? Thanks!

sub read {
    my $input = shift;
    open(FILE, $input);
    my @names;
    while (<FILE>) {
        chomp;
        # split on ';' and trim whitespace at both ends of each field
        my @lines = map { s/^\s+//; s/\s+$//; $_ } split( ';', $_ );
        next if /^\s*(?:#|$)/;    # skip blank lines and comments
        for my $element (@lines) {
            my ($entry, $value) = split( ':', $element );
            $hash{$entry} = $value;
        }
        push(@names, {%hash});
    }
    close(FILE);
    return @names;
}

cucugirl

ASKER

How can I avoid pushing the same name into the array of hashes more than once? Thanks!

ASKER CERTIFIED SOLUTION

berseken

berseken

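Also, you should probably declare %hash with my inside the read subroutine (ideally inside the while loop). As it stands it is a global that is never cleared, so values from one line can carry over into the records built from later lines.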
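
To illustrate the point about declaring the hash inside the subroutine, here is a minimal sketch, assuming the same key:value; line format as in the question. The names parse_lines and %record are made up for this example, and it takes its lines from a list rather than a file so that it runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: parses "key:value;" lines into an array of hash refs.
sub parse_lines {
    my @lines = @_;
    my @names;
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#|$)/;    # skip blank lines and comments
        my %record;                        # fresh, empty hash for every line
        for my $element (split /;/, $line) {
            $element =~ s/^\s+|\s+$//g;    # trim both ends
            next unless length $element;
            my ($entry, $value) = split /:/, $element, 2;
            $record{$entry} = $value;
        }
        push @names, {%record};
    }
    return @names;
}

my @parsed = parse_lines(
    'Name:Bill;Location:Miami;Age:27;',
    'Name:Dave;Location:Florence;',        # no Age field on this line
);
print exists $parsed[1]{Age} ? "stale Age leaked\n" : "no stale Age\n";
```

Because %record is declared inside the loop, the second record has no Age key. With a single hash shared across lines, Bill's Age would still be there when Dave's record is copied.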

cucugirl

ASKER

I tried implementing the changes, but it only prints the first line of the file, and I'm sure my list probably has just 2 repeated names right now. Do you think there's a bug somewhere?

berseken

Don't know.. this works fine for me. I dumped all the lines into /tmp/blah.

I did run into an issue with calling the function 'read', since read is also a Perl built-in, so I renamed it to read1.

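use Data::Dumper;

# Renamed from 'read' to avoid clashing with Perl's built-in read().
sub read1 {
    my $input = shift;
    open(FILE, $input) or die "Cannot open $input: $!";
    my @names;
    my %seen;    # names that have already been pushed
    while (<FILE>) {
        chomp;
        next if /^\s*(?:#|$)/;    # skip blank lines and comments first
        my @lines = map { s/^\s+//; s/\s+$//; $_ } split( ';', $_ );
        my ($name) = ($_ =~ /^Name:(\w*);/);
        next if (exists $seen{$name});    # keep only the first line per name
        $seen{$name} = 1;
        # note: %hash is not declared with 'my', so it is a package global here
        for my $element (@lines) {
            my ($entry, $value) = split( ':', $element );
            $hash{$entry} = $value;
        }
        push(@names, {%hash});
    }
    close(FILE);
    return @names;
}

my @thing = read1("/tmp/blah");

print Dumper(\@thing);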

cucugirl

ASKER

where did you declare %hash?

cucugirl

ASKER

Hi, for another part of my code I need to push only the last occurrence of a repeated name, not the first one. For example, given:
Name:Bill;Location:Miami;Age:27;
Name:Claudette; Location:Detroit;Age:50;
Name:Dave;Location:Florence;Age:25;
Name:Thomas;Location:Miami;Age:27;
Name:Bill;Location:Chicago;Age:47;

I would push
Name:Bill;Location:Chicago;Age:47;
rather than
Name:Bill;Location:Miami;Age:27;
Does anybody know how to do this with the same routine I had in the beginning? Thanks!
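
One possible way to keep the last occurrence instead of the first, sketched under the assumption that every record has a Name field: remember the position of each name in @names and overwrite that slot when the name shows up again. The names keep_last and %index_of are made up for this example, and the lines are passed in as a list rather than read from a file so the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: like the read routine, but a later line with the
# same Name replaces the earlier record instead of being skipped.
sub keep_last {
    my @lines = @_;
    my @names;
    my %index_of;    # Name => position of that name's record in @names
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#|$)/;    # skip blank lines and comments
        my %record;
        for my $element (split /;/, $line) {
            $element =~ s/^\s+|\s+$//g;    # trim both ends
            next unless length $element;
            my ($entry, $value) = split /:/, $element, 2;
            $record{$entry} = $value;
        }
        next unless defined $record{Name};
        if (exists $index_of{ $record{Name} }) {
            # name seen before: overwrite the earlier record in place
            $names[ $index_of{ $record{Name} } ] = {%record};
        } else {
            push @names, {%record};
            $index_of{ $record{Name} } = $#names;
        }
    }
    return @names;
}

my @result = keep_last(
    'Name:Bill;Location:Miami;Age:27;',
    'Name:Claudette; Location:Detroit;Age:50;',
    'Name:Dave;Location:Florence;Age:25;',
    'Name:Thomas;Location:Miami;Age:27;',
    'Name:Bill;Location:Chicago;Age:47;',
);
print join(';', map { "$_->{Name}:$_->{Location}" } @result), "\n";
# prints Bill:Chicago;Claudette:Detroit;Dave:Florence;Thomas:Miami
```

Note that in this sketch Bill stays in the position where the name first appeared and only the data is replaced. If the record should instead move to the position of the last line, one alternative would be to delete the old slot and push the new record at the end.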