I wrote this code to loop through an array of terms and check whether each term is in a hash that contains some text information.

My hash has only 5 texts, and the terms in the hash do not repeat. But when I print the number of times a term appears in the documents hash, I get some crazy number (like 4000 or 380) and not 5, 1, or 2.

Here is the loop where I look at the terms and search through the hash.

The doc array contains text information, so each element of the array holds one text. docTemp is used to take a text from the doc array and store each of its terms as a separate element.

for example
$doc[1] = "The mouse is black"

for ($counter = 0; $counter <= $#terms; $counter++){
    $nDocs = 0;
    for ($count = 0; $count <= $#doc; $count++){
        @docTemp = split(/\s+/, $doc[$count]);

        ###################################
        # STORE THE DOCUMENTS INTO A HASH #
        ###################################
        for my $word (@docTemp){
            $docHash{$word}++;
        }

        for my $key ( keys %docHash ) {
            ################################
            # CHECK IF TERM IS IN THE HASH #
            ################################
            if ($terms[$counter] == $key){
                $nDocs++;
            }
        }
    }
    print $terms[$counter], " ", $nDocs, "\n";
}

If exists $docHash{$terms[$counter]} is true the first time through the for ($count = 0; $count <= $#doc; $count++) loop,
it will also be true the next time through the loop.
Is that what you want?

exists $docHash{$terms[$counter]} is true whenever $terms[$counter] is a key in the %docHash hash.
Since you only ever add entries to the hash and never clear it, once a term is in the hash it stays in the hash for every later document.
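
One way to avoid the carry-over is to build a fresh hash for each document and test membership with exists instead of looping over the keys. The sketch below assumes that what you want is the number of documents containing each term. The helper name count_docs_per_term and the sample data are made up for illustration. Note also that == compares numerically in Perl, so two non-numeric strings both evaluate to 0 and always compare equal; string comparison needs eq, or a direct hash lookup as used here.

```perl
use strict;
use warnings;

# Hypothetical helper: for each term, count how many documents contain it.
sub count_docs_per_term {
    my ($terms, $docs) = @_;
    my %n_docs = map { $_ => 0 } @$terms;

    for my $text (@$docs) {
        # Fresh hash for every document, so words from earlier
        # documents cannot leak into this one.
        my %doc_hash;
        $doc_hash{$_}++ for split ' ', $text;

        for my $term (@$terms) {
            # Direct hash lookup; no need to loop over the keys.
            $n_docs{$term}++ if exists $doc_hash{$term};
        }
    }
    return \%n_docs;
}

# Made-up sample data for illustration.
my @terms = ('mouse', 'black', 'cat');
my @doc   = ('The mouse is black', 'The cat chased the mouse');

my $n_docs = count_docs_per_term(\@terms, \@doc);
print "$_ $n_docs->{$_}\n" for @terms;
```

With this layout each term can be counted at most once per document, so the result can never exceed the number of documents. The lookup is case-sensitive, so "The" and "the" count as different words unless you lowercase the text first.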

