Solved

avoiding repeats

Posted on 2009-05-14
Last Modified: 2012-05-07
I have a file that looks like this:

Name:Bill;Location:Miami;Age:27;
Name:Claudette; Location:Detroit;Age:50;
Name:Dave;Location:Florence;Age:25;
Name:Thomas;Location:Miami;Age:27;
Name:Bill;Location:Chicago;Age:47;

I would like to skip lines that are repeated. For example, these two lines count as repeats:
Name:Bill;Location:Miami;Age:27;
Name:Bill;Location:Chicago;Age:47;

because they have the same Name; the rest doesn't matter.
I have a routine that reads the file line by line and splits each line twice, first on semicolons and then on colons. It then builds an array of hashes. How can I avoid repeating the same name with the following code? Thanks!
sub read {
    my $input = shift;
    open(FILE, $input);
    my @names;
    while (<FILE>) {
        chomp;
        my @lines = map { s/^\s+//; s/\s+//; $_ } split( ';', $_ );
        next if /^\s*(?:#|$)/;
        for my $element (@lines) {
            my ($entry, $value) = split( ':', $element );
            $hash{$entry} = $value;
        }
        push(@names, {%hash});
    }
    close(FILE);
    return @names;
}

Question by:cucugirl

Author Comment

by:cucugirl
ID: 24387001
How can I avoid pushing the same name into the array of hashes more than once? Thanks!


Accepted Solution

by:
berseken earned 500 total points
ID: 24388627
From an efficiency point of view, the best approach is probably to sort the file outside Perl so that it comes in sorted by name. Then you only need to keep track of the name on the previous line, and if the current line has the same name you just ignore it.
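
A minimal sketch of the sorted-input idea, assuming the records have already been sorted by name (for example with the system sort command) and are passed in as a list of lines. The helper name skip_adjacent_repeats and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: assumes the lines arrive already sorted by name,
# so repeats of a name are always next to each other.
sub skip_adjacent_repeats {
    my @lines = @_;
    my @kept;
    my $previous;    # name found on the last record we kept
    for my $line (@lines) {
        my ($name) = $line =~ /^\s*Name:\s*([^;]*)/;
        next unless defined $name;    # no Name field: ignore the line
        next if defined $previous && $name eq $previous;
        push @kept, $line;
        $previous = $name;
    }
    return @kept;
}

my @sorted = (
    'Name:Bill;Location:Miami;Age:27;',
    'Name:Bill;Location:Chicago;Age:47;',
    'Name:Claudette; Location:Detroit;Age:50;',
);
print "$_\n" for skip_adjacent_repeats(@sorted);
```

Only one scalar has to be remembered here, which is why this can be cheaper than a hash of every name when the file is large.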

If you can't do that, you will probably have to keep a hash of previously seen names, as implemented here:

sub read {
    my $input = shift;
    open(FILE, $input);
    my @names;
    my %seen;    # names that have already been pushed
    while (<FILE>) {
        chomp;
        my @lines = map { s/^\s+//; s/\s+//; $_ } split( ';', $_ );
        # Pull the name out of the line and skip it if we have seen it before
        my ($name) = ($_ =~ /^Name:(\w*);/);
        next if (exists $seen{$name});
        $seen{$name} = 1;
        next if /^\s*(?:#|$)/;
        for my $element (@lines) {
            my ($entry, $value) = split( ':', $element );
            $hash{$entry} = $value;
        }
        push(@names, {%hash});
    }
    close(FILE);
    return @names;
}

Expert Comment

by:berseken
ID: 24388727
Also, you should probably declare %hash inside the read subroutine, or it is going to keep growing and consume all your memory.
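
A hedged sketch of what that could look like, assuming the same Name:...;Location:...; line format as in the question. The name read_unique is hypothetical (chosen so it does not clash with Perl's built-in read), and the three-argument open with a lexical filehandle is a substitution, not part of the original routine. Declaring %hash inside the loop also means a field missing from one line cannot silently keep the value from an earlier line. The name is taken from the parsed fields rather than from a separate regex, so a line whose name fails to match a pattern is not lumped together with other such lines:

```perl
use strict;
use warnings;

# Hypothetical name; avoids clashing with Perl's built-in read().
sub read_unique {
    my ($input) = @_;
    open(my $fh, '<', $input) or die "Cannot open $input: $!";
    my (@names, %seen);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;    # skip blank lines and comments first
        my %hash;                          # fresh hash for every record
        for my $element (split /;/, $line) {
            $element =~ s/^\s+|\s+$//g;    # trim both ends
            next unless length $element;
            my ($entry, $value) = split /:/, $element, 2;
            $hash{$entry} = $value;
        }
        my $name = $hash{Name};
        next unless defined $name;         # no usable Name field: ignore the line
        next if $seen{$name}++;            # keep only the first occurrence
        push @names, \%hash;
    }
    close $fh;
    return @names;
}
```

It would be called the same way as the original, for example my @thing = read_unique("/tmp/blah");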

Author Comment

by:cucugirl
ID: 24388862
I tried implementing the changes, but it only prints the first line in the file, and I'm sure my list probably has just 2 repeats right now. Do you think there's a bug somewhere?

Expert Comment

by:berseken
ID: 24389307
Don't know. This works fine for me, and I dump all the lines in /tmp/blah.

I did run into an issue with calling the function 'read' (Perl already has a built-in function with that name), so I renamed it to read1:
use Data::Dumper;

sub read1 {
    my $input = shift;
    open(FILE, $input);
    my @names;
    my %seen;
    while (<FILE>) {
        chomp;
        my @lines = map { s/^\s+//; s/\s+//; $_ } split( ';', $_ );
        my ($name) = ($_ =~ /^Name:(\w*);/);
        next if (exists $seen{$name});
        $seen{$name} = 1;
        next if /^\s*(?:#|$)/;
        for my $element (@lines) {
            my ($entry, $value) = split( ':', $element );
            $hash{$entry} = $value;
        }
        push(@names, {%hash});
    }
    close(FILE);
    return @names;
}

my @thing = read1("/tmp/blah");

print Dumper(\@thing);


Author Comment

by:cucugirl
ID: 24389433
where did you declare %hash?

Author Comment

by:cucugirl
ID: 24406539
Hi, for another part of my code I need to push only the last occurrence, not the first. Given:
Name:Bill;Location:Miami;Age:27;
Name:Claudette; Location:Detroit;Age:50;
Name:Dave;Location:Florence;Age:25;
Name:Thomas;Location:Miami;Age:27;
Name:Bill;Location:Chicago;Age:47;

I would push
Name:Bill;Location:Chicago;Age:47;
rather than
Name:Bill;Location:Miami;Age:27;
Does anybody know how to do this with the same routine I had in the beginning? Thanks!
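
One possible approach, sketched under the assumption that the lines have already been parsed into hash references with a Name key (as the routine above produces): remember the array position of each name and overwrite that slot when the name shows up again. The helper name keep_last is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: takes parsed records (hash refs with a Name key)
# and keeps, for each name, the record that appeared last.
sub keep_last {
    my @records = @_;
    my (@names, %index_of);    # %index_of maps a name to its slot in @names
    for my $record (@records) {
        my $name = $record->{Name};
        next unless defined $name;
        if (exists $index_of{$name}) {
            # A later record replaces the earlier one in the same position
            $names[ $index_of{$name} ] = $record;
        }
        else {
            $index_of{$name} = scalar @names;
            push @names, $record;
        }
    }
    return @names;
}

my @result = keep_last(
    { Name => 'Bill', Location => 'Miami',    Age => 27 },
    { Name => 'Dave', Location => 'Florence', Age => 25 },
    { Name => 'Bill', Location => 'Chicago',  Age => 47 },
);
print "$_->{Name} $_->{Location}\n" for @result;
```

This keeps each name in the position where it was first seen while holding the data from its last line. The same if/else could go in place of the push inside the reading loop if a separate pass is not wanted.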