Explain this line of code: push @lines, $_ unless $h{$_}++;

In a previous question I needed to remove duplicate lines from a file. Ren_b provided a solution that has been working fine. I would like a detailed explanation of the line that got everything working. Here is the code:

open GEN, "gen.txt";
my @lines;
my %h;
while(<GEN>){
  push @lines, $_ unless $h{$_}++;
}
close GEN;

open GEN, ">gen.txt";
print GEN @lines;
close GEN;

My best guess: read each line of GEN and add it to the array @lines if it does not match $h. I then print the array to GEN. I don't understand what is happening after the unless. Is a hash being created at the same time as the array, and if the hash sees the same key again, does the autoincrement cause it to be skipped?
omcrAsked:

ozoCommented:
Add it to the array @lines if its value in the hash %h is not true.
After checking whether it is true, increment its value in %h, so that it will be true the next time you see it.
ozoCommented:
you could also do
perl -i -ne 'print if !$h{$_}++' gen.txt
godspropyCommented:
push @lines, $_ unless $h{$_}++;

Perl actually increments the variable before the condition is processed. Therefore the hash entry $h{$_} is created and incremented to 0 on the first occurrence of $_. On the next occurrence of the same $_ the value in the hash %h is incremented to 1. The unless keyword is basically identical to 'if (! $h{$_})'; it runs the statement for false values. So, this line only pushes the value to the array @lines if it is the first occurrence of $_.
TintinCommented:
Might also be useful to mention the FAQ here:

$ perldoc -q duplicate

Found in /usr/perl5/5.6.1/lib/pod/perlfaq4.pod
     How can I remove duplicate elements from a list or array?

     There are several possible ways, depending on whether the
     array is ordered and whether you wish to preserve the
     ordering.

             a)  If @in is sorted, and you want @out to be
                 sorted:  (this assumes all true values in the
                 array)

                     $prev = "not equal to $in[0]";
                     @out = grep($_ ne $prev && ($prev = $_, 1), @in);

                 This is nice in that it doesn't use much extra
                 memory, simulating uniq(1)'s behavior of
                 removing only adjacent duplicates.  The ", 1"
                 guarantees that the expression is true (so that
                 grep picks it up) even if the $_ is 0, "", or
                 undef.

             b)  If you don't know whether @in is sorted:

                     undef %saw;
                     @out = grep(!$saw{$_}++, @in);

             c)  Like (b), but @in contains only small integers:

                     @out = grep(!$saw[$_]++, @in);

             d)  A way to do (b) without any loops or greps:

                     undef %saw;
                     @saw{@in} = ();
                     @out = sort keys %saw;  # remove sort if undesired

             e)  Like (d), but @in contains only small positive
                 integers:

                     undef @ary;
                     @ary[@in] = @in;
                     @out = grep {defined} @ary;

             But perhaps you should have been using a hash all
             along, eh?
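
To make the FAQ entry concrete, here is a minimal sketch of methods (b) and (d) run with strict and warnings enabled. The sample list and the variable names @out_b, @out_d and %seen are made up for illustration and are not part of the FAQ text.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the thread or the FAQ.
my @in = qw(pear apple pear fig apple);

# Method (b): keep the first occurrence of each element, preserving order.
# !$saw{$_}++ is true only the first time a given value is seen.
my %saw;
my @out_b = grep { !$saw{$_}++ } @in;

# Method (d): a hash slice creates one key per distinct element.
# Hash keys come back in no particular order, hence the sort.
my %seen;
@seen{@in} = ();
my @out_d = sort keys %seen;

print "b: @out_b\n";   # b: pear apple fig
print "d: @out_d\n";   # d: apple fig pear
```

Method (b) is the same idiom as the push ... unless $h{$_}++ line in the question; it just filters a list in one step instead of building the array inside a loop.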
omcrAuthor Commented:
So first run:  Add $_ to hash and increment to zero. Add $_ to the array.
Then it comes in again (duplicate): it already exists in the hash and is incremented to 1, which makes '$h{$_}' true. The unless sees that it is true, which prevents it from doing the push.

Is this right?
TintinCommented:
Not quite right.

The increment of the hash and population of the array happen at the same time.

So the first time through, $_ gets added to the array and the hash entry with the key $_ gets incremented to 1 (TRUE). The next time around with a duplicate value, the hash entry is already set to true.
omcrAuthor Commented:
From godspropy's post:
"Therefore the hash entry $h{$_} is created and incremented to 0 on the first occurrence of $_. On the next occurrence of the same $_ the value in the hash %h is incremented to 1."

From Tintin's post:
"So the first time through, $_ gets added to the array and the hash entry with the key $_ gets incremented to 1 (TRUE)."

Are you both speaking of the same thing?

TintinCommented:
No.

The hash doesn't get incremented to 0, it gets incremented to 1.
godspropyCommented:
Tintin was correct. When used in a condition, a post-incremented variable returns its current value and then increments (and undef counts as 0). So it returns 0 for the undefined variable on its first use, and only after that is it incremented to 1. On its second use it returns the existing value of 1 and then increments...
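
A small sketch of the difference between post-increment and pre-increment may help here. The hash names %h and %g and the key 'x' are just placeholders for illustration.

```perl
use strict;
use warnings;

my %h;

# Post-increment: the expression yields the value from BEFORE the increment.
# For an entry that does not exist yet, Perl treats undef as 0.
my $first  = $h{x}++;   # yields 0, entry becomes 1
my $second = $h{x}++;   # yields 1, entry becomes 2

# Pre-increment: the expression yields the value AFTER the increment.
my %g;
my $pre = ++$g{x};      # yields 1, entry becomes 1

print "first=$first second=$second stored=$h{x} pre=$pre\n";
# first=0 second=1 stored=2 pre=1
```

With ++$h{$_} in place of $h{$_}++ the test would already be true on the first occurrence, so nothing would ever be pushed.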
omcrAuthor Commented:
OK.
The first time through, $_ gets added to the array and the hash entry with the key $_ gets incremented to 1 (TRUE).
Then it comes in again (duplicate): it already exists in the hash and is incremented to N, which makes '$h{$_}' true. The 'unless' sees that it is true, which prevents it from doing the push.

How does that description look?
ozoCommented:
The first time through, the hash entry with the key $_ has no value yet (undef, which is false). After the false value is tested, it gets incremented to 1 (TRUE).
The ! (or unless) sees the false value from before the increment, and so the push happens and $_ gets added to the array.
Then it comes in again (duplicate): the key already exists in the hash, with the TRUE value that was set the first time through. After the true value has been tested, it gets incremented to 2 (also true).
The ! (or unless) sees the true value and prevents the push.
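
The walk-through above can be traced with a short script. This is only a sketch: the @input list is a hypothetical stand-in for the lines of gen.txt, and @trace and $before exist purely to show what the test sees. It uses the // operator, so it assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lines read from gen.txt.
my @input = ("a\n", "b\n", "a\n", "a\n");

my (@lines, %h, @trace);
for (@input) {
    # Value that 'unless' will test; post-increment yields 0 for a new key.
    my $before = $h{$_} // 0;

    # The line from the question, unchanged.
    push @lines, $_ unless $h{$_}++;

    chomp(my $key = $_);   # copy without the newline, for display only
    push @trace, "$key: tested $before, now $h{$_}, "
               . ($before ? "skipped" : "pushed");
}

print "$_\n" for @trace;
print "kept: ", scalar(@lines), " lines\n";
# a: tested 0, now 1, pushed
# b: tested 0, now 1, pushed
# a: tested 1, now 2, skipped
# a: tested 2, now 3, skipped
# kept: 2 lines
```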
omcrAuthor Commented:
Thanks everyone, I think I've got it now. Good discussion and thanks for the patience.