Solved

# Check for all duplicates problem, Array and Hash HELP !

Posted on 2006-05-22
OK, day 3 now and I'm stuck again.
I have looked everywhere and tried quite a few examples, and I still cannot count how many
duplicate values are in an array. Everywhere I looked said you should use a hash,
but I still can't get it.

I use one regexp to populate this list, like
@dupecheck1,"$2,$3,$6,$8,$1"; ###   $2=date $3=time $6=ip $8=badword $1=session number. This regexp looks for incorrect!

The other regexp,

@dupcheck =(@dupecheck1,@dupcheck2);  ## Only did it this way for troubleshooting !!!
#  So now I can add the badword ($8 above, now in the array as dupecheck[3]) to the hash %user:

foreach $item(@dupecheck){
    if ($dupecheck[3]){
        $user{$dupecheck[3]}++;
    }
    $counter++;
}#end foreach $item(@dupecheck)

$duplicates = 0;# declare var and set to 0

#for each name in the hash "user", if the count is greater than 1 it is a duplicate so print
foreach $a(keys %user){
    if ($user{$a}>1){
        print "count for duplicate user [$a] is [$user{$a}]\n";

        $duplicates = ($duplicates+($user{$a}-1)); # add to total duplicate count (minus 1 because 3 entries means only 2 duplicates)
    }#end of if
}#end of foreach

$percent = (100/$maincounter)*$duplicates;#calculate percentage from results above

#Lets print our results:
print "total count = [$maincounter]\n";
print "total number of duplicates = [$duplicates]\n";
print "percentage of duplicates = [$percent]\n";

Output is:

count for duplicate user [5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)] is [122]
total count = [122]
total number of duplicates = [121]
percentage of duplicates = [99.1803278688525]

Now the problem is that in this instance incorrect! is ignored. If I
change @dupcheck =(@dupecheck1,@dupcheck2); to
@dupcheck =(@dupecheck2,@dupcheck1);

then the opposite happens: Administrator is ignored.

So I found that this example will only find matches on whatever filled $dupecheck[3] the first time.

OK, so if you have made it this far:
I need to check for all duplicates, see whether the number of dupes is over an amount $trigger,
and if it is, populate an array with the unique IPs.

I don't care if we (you ;) use a %hash to begin with, but please show the right way to populate it using the regexp captures.

This one is worth the points to me because I need this yesterday; I have spent way too much time on it already.
If you can, please comment the code ;)

GHBoom

Question by:ghboom

LVL 84

Assisted Solution

I use one regexp to populate this list, like
@dupecheck1,"$2,$3,$6,$8,$1"; ###   $2=date $3=time $6=ip $8=badword $1=session number. This regexp looks for incorrect!

I don't understand. There is no regexp there, and a variable and a string in void context do not populate a list.
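
A minimal sketch of what populating the list from regexp captures could look like. The log line layout and the pattern are assumptions (the thread never shows the real input or the real regexp, which evidently has more capture groups); the point is only that `push` is what adds the joined captures to the array.

```perl
use strict;
use warnings;

# Hypothetical log lines; the real format is not shown in the thread.
my @lines = (
    '(000013) 5/16/2006 9:34:49 221.141.0.194 Administrator',
    '(000014) 5/16/2006 9:35:02 10.0.0.7 incorrect!',
);

my @dupecheck1;
for my $line (@lines) {
    # Assumed pattern: $1=session $2=date $3=time $4=ip $5=badword
    if ($line =~ /^(\(\d+\))\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)$/) {
        # push appends the joined captures as one array element
        push @dupecheck1, "$2,$3,$4,$5,$1";
    }
}

print "$_\n" for @dupecheck1;
```

Only use the capture variables after checking that the match succeeded, otherwise they still hold values from an earlier successful match.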

You can check for duplicates in @dupecheck with
foreach $item(@dupecheck){
    $user{$item}++;
}

$duplicates = 0;# declare var and set to 0

#for each name in the hash "user", if the count is greater than 1 it is a duplicate so print
foreach $a(keys %user){
    if ($user{$a}>1){
        print "count for duplicate user [$a] is [$user{$a}]\n";

        $duplicates = ($duplicates+($user{$a}-1)); # add to total duplicate count (minus 1 because 3 entries means only 2 duplicates)
    }#end of if
}#end of foreach
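
For reference, a self-contained sketch of the same counting idea under `use strict`, including the total and the percentage from the question. The sample entries are made up for illustration, and the total is taken from the array itself rather than from a separate counter.

```perl
use strict;
use warnings;

# Made-up entries in the "date,time,ip,badword,session" layout from the question.
my @dupecheck = (
    '5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)',
    '5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)',
    '5/16/2006,9:35:02,10.0.0.7,incorrect!,(000014)',
    '5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)',
);

my %user;
$user{$_}++ for @dupecheck;   # key on the loop variable: one count per distinct entry

my $duplicates = 0;
for my $entry (sort keys %user) {
    next unless $user{$entry} > 1;
    print "count for duplicate user [$entry] is [$user{$entry}]\n";
    $duplicates += $user{$entry} - 1;   # 3 copies means 2 duplicates
}

my $total   = @dupecheck;                                 # array in scalar context = element count
my $percent = $total ? 100 * $duplicates / $total : 0;    # guard against an empty list

print "total count = [$total]\n";
print "total number of duplicates = [$duplicates]\n";
print "percentage of duplicates = [$percent]\n";
```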

LVL 4

Accepted Solution

Maybe you wanted to do something like:

@dupcheck =(\@dupecheck1,\@dupcheck2);  # array of array references

foreach $item (@dupecheck){ #for each array element
    if (${$item}[3]){
        $user{${$item}[3]}++;
    }
    $maincounter++;
}#end foreach $item(@dupecheck)

Your example has $user{$dupecheck[3]}++; but $dupecheck[3] stays the same throughout the foreach loop, since the index is fixed and @dupecheck isn't changed inside the loop!
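
A runnable sketch of the array-of-arrays approach, assuming each record is its own array of five fields (date, time, ip, badword, session). The sample records and the third array `@dupecheck3` are made up for illustration; the key point is that the index is applied to the loop variable `$item`, so a different record is examined on every pass.

```perl
use strict;
use warnings;

# Made-up records: date, time, ip, badword, session.
my @dupecheck1 = ('5/16/2006', '9:34:49', '221.141.0.194', 'Administrator', '(000013)');
my @dupecheck2 = ('5/16/2006', '9:35:02', '10.0.0.7',      'incorrect!',    '(000014)');
my @dupecheck3 = ('5/16/2006', '9:36:10', '221.141.0.194', 'Administrator', '(000015)');

# Each element is a reference to one record.
my @dupcheck = (\@dupecheck1, \@dupecheck2, \@dupecheck3);

my %user;
my $maincounter = 0;
for my $item (@dupcheck) {
    my $badword = $item->[3];   # same as ${$item}[3]
    $user{$badword}++ if defined $badword && length $badword;
    $maincounter++;
}

print "$_ => $user{$_}\n" for sort keys %user;
print "records seen: $maincounter\n";
```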

LVL 17

Expert Comment

I was going to suggest the method Ozo suggests.  Looking at your code a bit further, you must be parsing something (presumably like 5/16/2006,9:34:49,221.141.0.194,Administrator,(000013)) with a regular expression earlier on, and putting these values in an array, which you then want to check for duplicates?

Would it not be easier to put these pieces directly into a hash that could track duplicates for you, rather than putting them in an array first and then iterating over the array and into a hash?
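
A rough sketch of that direct-to-hash idea, which also covers the `$trigger` and unique-IP part of the question. The log layout, the pattern, the threshold value, and the reading of "over the trigger" as "a bad word seen more than `$trigger` times" are all assumptions, since the thread does not spell them out.

```perl
use strict;
use warnings;

my $trigger = 2;   # assumed threshold; the thread never states a value

# Hypothetical log lines: "(session) date time ip badword".
my @lines = (
    '(000013) 5/16/2006 9:34:49 221.141.0.194 Administrator',
    '(000014) 5/16/2006 9:34:55 221.141.0.194 Administrator',
    '(000015) 5/16/2006 9:35:02 10.0.0.7 Administrator',
    '(000016) 5/16/2006 9:35:10 10.0.0.9 incorrect!',
);

my (%count, %ips);
for my $line (@lines) {
    # Assumed pattern; adjust the captures to the real log format.
    next unless $line =~ /^\((\d+)\)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)$/;
    my ($session, $date, $time, $ip, $badword) = ($1, $2, $3, $4, $5);

    $count{$badword}++;          # how often each bad word was seen
    $ips{$badword}{$ip} = 1;     # hash keys keep the IPs unique per bad word
}

my (%seen, @unique_ips);
for my $badword (sort keys %count) {
    next unless $count{$badword} > $trigger;
    # Collect each offending IP once, even if several bad words share it.
    push @unique_ips, grep { !$seen{$_}++ } sort keys %{ $ips{$badword} };
}

print "unique ips over trigger: @unique_ips\n";
```

Because the counts and IPs go straight into hashes while the lines are parsed, no intermediate array or second pass over the data is needed.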

Author Comment

I found a solution that fits the bill, using both replies.

My bad for the delay!

:)

Thanks for helping.
GHBoom
