Check for all duplicates problem, Array and Hash HELP !

OK, day 3 now and I'm stuck again.
I have looked everywhere and tried quite a few examples, and I still cannot count how many
duplicate values are in an array. Everywhere I looked said you should use a hash,
but I still can't get it.

I use one regexp to populate this list, like
@dupecheck1,"$2,$3,$6,$8,$1"; ###   $2= date $3=time $6=ip  $8=badword $1=sesionnum Regexp looks for incorrect!

The other regexp,
@dupecheck2,"$2,$3,$6,$8,$1"; #$2= date $3=time $6=ip  $8=badword $1=sesionnum Regexp looks for Bad Usernames

@dupcheck =(@dupecheck1,@dupcheck2);  ## Only did it this way for troubleshooting!
# So now I can add the badword ($8 above, now in the array as $dupecheck[3]) to the hash %user:

foreach $item(@dupecheck){
if ($dupecheck[3]){
$user{$dupecheck[3]}++;
}
$counter++; #end foreach $item(@dupecheck)
}

$duplicates = 0;# declare var and set to 0

#for each name in the hash "user", if the count is greater than 1 it is a duplicate so print
foreach $a(keys %user){
if ($user{$a}>1){
print "count for duplicate user [$a] is [$user{$a}]\n";

$duplicates = ($duplicates+($user{$a}-1)); # add to total duplicate count (minus 1 because 3 entries means only 2 duplicates)
}#end of if
}#end of foreach

$percent = (100/$maincounter)*$duplicates;#calculate percentage from results above

#Lets print our results:
print "total count = [$maincounter]\n";
print "total number of duplicates = [$duplicates]\n";
print "percentage of duplicates = [$percent]\n";


Output is:

count for duplicate user [5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)] is [122]
total count = [122]
total number of duplicates = [121]
percentage of duplicates = [99.1803278688525]

Now the problem is that in this instance incorrect! is ignored. If I
change @dupcheck =(@dupecheck1,@dupcheck2); to
           @dupcheck =(@dupecheck2,@dupcheck1);

then the opposite happens: Administrator is ignored.

So I found that this example will only find matches on whatever filled $dupecheck[3] the first time.


OK, so if you have made it this far:
I need to check for all duplicates, see if the dupes are over an amount $trigger,
and if they are, populate an array with the unique IPs.

I don't care if we (you ;) use a %hash to begin with, but please show the right way to populate it using the regexp captures.

This one is worth the points to me because I need this yesterday; I have spent way too much time on it already.
If you can, please comment the code ;)

GHBoom



ps15 Commented:
Maybe you wanted to do something like:

@dupcheck =(\@dupecheck1,\@dupcheck2);  # array of array references

foreach $item (@dupcheck){ # for each array reference
      if (${$item}[3]){
            $user{${$item}[3]}++;
      }
      $maincounter++;
} #end foreach $item(@dupcheck)


Your example has $user{$dupecheck[3]}++; but $dupecheck[3] stays the same throughout the foreach loop, since @dupecheck isn't changed and the loop variable $item is never used!
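
To illustrate the point: if each element of @dupecheck is one comma-joined string (as the "$2,$3,$6,$8,$1" in the question suggests, which is an assumption here), the loop has to split $item itself rather than index the whole array. A minimal sketch with made-up sample records:

```perl
use strict;
use warnings;

# Hypothetical sample records in the order date,time,ip,badword,sessionnum
my @dupecheck = (
    '5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)',
    '5/16/2006,9:34:50,221.141.0.194,Administrator,(000014)',
    '5/16/2006,9:35:02,10.0.0.7,incorrect!,(000015)',
);

my %user;
foreach my $item (@dupecheck) {
    # Split the current record; field 3 is the badword
    my @fields = split /,/, $item;
    $user{ $fields[3] }++ if defined $fields[3] && length $fields[3];
}

foreach my $word (sort keys %user) {
    print "count for [$word] is [$user{$word}]\n";
}
```

Because the key is taken from $item on every pass, both Administrator and incorrect! get their own count, regardless of the order in which the arrays were combined.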
 
ozo Commented:
I use one regexp to populate this list, like
@dupecheck1,"$2,$3,$6,$8,$1"; ###   $2= date $3=time $6=ip  $8=badword $1=sesionnum Regexp looks for incorrect!

I don't understand. There is no regexp there, and a variable and a string in void context do not populate a list.
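
What populates a list is push, called after a successful match. A rough sketch follows; the log format and the pattern are hypothetical, since the real regexp is not shown in the question, so the capture numbers differ from the $2,$3,$6,$8,$1 used there:

```perl
use strict;
use warnings;

# Hypothetical log lines; the real log format is not shown in the question
my @lines = (
    '(000013) 5/16/2006 9:34:49 221.141.0.194 Administrator',
    '(000014) 5/16/2006 9:34:50 221.141.0.194 incorrect!',
    'some unrelated line',
);

my @dupecheck;
foreach my $line (@lines) {
    # Hypothetical pattern: sessionnum, date, time, ip, badword
    if ($line =~ /^(\(\d+\))\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)$/) {
        # push appends to the array; the captures are only
        # meaningful after a successful match
        push @dupecheck, "$2,$3,$4,$5,$1";
    }
}

print "$_\n" for @dupecheck;
```

Lines that do not match are skipped, so stale captures from an earlier match never end up in the list.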

You can check for duplicates in @dupecheck with
foreach $item(@dupecheck){
  $user{$item}++;  
}

$duplicates = 0;# declare var and set to 0

#for each name in the hash "user", if the count is greater than 1 it is a duplicate so print
foreach $a(keys %user){
if ($user{$a}>1){
print "count for duplicate user [$a] is [$user{$a}]\n";

$duplicates = ($duplicates+($user{$a}-1)); # add to total duplicate count (minus 1 because 3 entries means only 2 duplicates)
}#end of if
}#end of foreach
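
For reference, a self-contained sketch of this counting approach, including the percentage calculation from the question. The sample list is made up, and the guard against an empty list is an addition that is not in the original code:

```perl
use strict;
use warnings;

# Hypothetical sample list; any list of strings works
my @dupecheck = qw(Administrator incorrect! Administrator Administrator root);

my %user;
$user{$_}++ for @dupecheck;    # one counter per distinct value

my $duplicates = 0;
foreach my $name (sort keys %user) {
    next unless $user{$name} > 1;
    print "count for duplicate user [$name] is [$user{$name}]\n";
    $duplicates += $user{$name} - 1;    # 3 entries means 2 duplicates
}

my $total = scalar @dupecheck;
# Guard against division by zero when the list is empty
my $percent = $total ? 100 * $duplicates / $total : 0;

print "total count = [$total]\n";
print "total number of duplicates = [$duplicates]\n";
print "percentage of duplicates = [$percent]\n";
```

With this sample, Administrator appears three times, so two of the five entries are duplicates.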
 
mjcoyne Commented:
I was going to suggest the method Ozo suggests.  Looking at your code a bit further, you must be parsing something (presumably like 5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)) with a regular expression earlier on, and putting these values in an array, which you then want to check for duplicates?

Would it not be easier to put these pieces directly into a hash that could track duplicates for you, rather than putting them in an array first and then iterating over the array and into a hash?
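
Something along these lines might work as a sketch of that idea, also covering the $trigger and unique-IP part of the question. The names @records, %count and %ips_for, the sample data, and the trigger value are all hypothetical, and "over the trigger" is read here as the raw count exceeding $trigger:

```perl
use strict;
use warnings;

# Hypothetical parsed records: [ip, badword]. In real code these would
# come from the regexp captures at the point where the line is matched.
my @records = (
    ['221.141.0.194', 'Administrator'],
    ['221.141.0.194', 'Administrator'],
    ['221.141.0.194', 'incorrect!'],
    ['10.0.0.7',      'incorrect!'],
    ['10.0.0.7',      'incorrect!'],
    ['10.0.0.7',      'incorrect!'],
);

my $trigger = 2;    # hypothetical threshold
my (%count, %ips_for);

foreach my $rec (@records) {
    my ($ip, $badword) = @$rec;
    $count{$badword}++;             # occurrences per badword
    $ips_for{$badword}{$ip} = 1;    # hash keys keep the IPs unique
}

# Collect the IPs of every badword whose count exceeds the trigger
my %seen;
foreach my $badword (keys %count) {
    next unless $count{$badword} > $trigger;
    $seen{$_} = 1 for keys %{ $ips_for{$badword} };
}
my @unique_ips = sort keys %seen;

print "IPs over the trigger: @unique_ips\n";
```

Using a hash of hashes means no intermediate array is needed, and an IP that shows up for several badwords is still listed only once.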
 
ghboom (Author) Commented:
I found a solution, using both replies that fit the bill.

To be honest, I thought I had accepted this already,
My Bad for the delay !!!

:)

Thanks for helping..
GHBoom
Question has a verified solution.
