  • Status: Solved
  • Priority: Medium
  • Security: Public

Compare fields from two different hashes of hashes

I have two Apache logs, one with all POSTs and one with all GETs. I am trying to read each of them into its own array of hashes (or a better structure, if you have a suggestion). The end result should be a comparison of the two structures, looking for IPs that match. If I find two records with the same IP, I then want to compare the User Agent; if those match, I want to compare the two times to see whether both requests happened within an hour of each other.

I am not loading my hashes properly, so I can't do these comparison checks. Please let me know where I am going wrong. The $hRequests isn't being created, and only one record is being returned.

This error pops up for every row.
Use of uninitialized value in hash element at ./script.pl line 71, <LOG> line 10069.



Each row is being placed inside %data correctly, and the keys are the fields from the log format variable (e.g. '%h' is a key).
#!/usr/bin/perl -w
 
use Apache::LogRegex;
use Data::Dumper;
 
  my $lr;
my $log_format  = '"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""';
  eval { $lr = Apache::LogRegex->new($log_format) };
  die "Unable to parse log line: $@" if ($@);
  
my $get_logs = ("march-logs/march-get.txt",
				"march-logs/march-logs-web2/march-get.txt",
				"march-logs/march-logs-web3/march-get.txt");
				
my $post_logs = ("march-logs/march-post.txt",
				 "march-logs/march-logs-web2/march-post.txt",
				 "march-logs/march-logs-web3/march-post.txt");
 
my %data;
my %getRecords;
my $postRecords;
my @get_array;
my @post_array;
 
foreach ($get_logs)
{
	@get_array = &logToHash($_);
	foreach(@get_array)
	{
		print Dumper($_);	
	}
	
}
 
sub logToHash
{
	my $file = $_;
	open LOG, $file or die $!;
	our ($aRequests,$ip,$userAgent,$date,$hRequests,$host);
	
	while ( my $line_from_logfile = <LOG> ) 
  	{
      eval { %data = $lr->parse($line_from_logfile); };
      if (%data) 
      {
          # We have data to process
          while( my ($key, $value) = each(%data) ) 
          {
		  	if($key =~ '%h')
		  	{
		  		($host,$ip) = split(/:/, $value);
		  	}
		  	if($key =~ '%{User-Agent}i\""')
		  	{
		  		$userAgent = $value;
		  	}
		  	if($key =~ '%t')
		  	{
		  		$date = $value;
		  	}
          }
          $aRequests = $hRequests{$ip}{$userAgent}{$date};  ////LINE 71
   		  push @$aRequests, \%data;
   		}
  	} 
  	return @$aRequests;
}

Asked by: hallikpapa
 
ozo Commented:
Did you mean
my @get_logs = ("march-logs/march-get.txt",
                                "march-logs/march-logs-web2/march-get.txt",
                                "march-logs/march-logs-web3/march-get.txt");


foreach (@get_logs)
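
To see why the sigil matters here, a minimal sketch (the file names are shortened, hypothetical stand-ins): assigning a parenthesised list to a scalar keeps only the last item, so a loop over `$get_logs` would visit a single file, while a loop over `@get_logs` visits all of them.

```perl
use strict;
use warnings;

# Hypothetical file names, for illustration only.
my @logs = ('web1-get.txt', 'web2-get.txt', 'web3-get.txt');

my $only_last;
{
    no warnings 'void';    # the discarded constants would otherwise warn
    # Scalar assignment: the comma operator throws away all but the last item.
    $only_last = ('web1-get.txt', 'web2-get.txt', 'web3-get.txt');
}

# Looping over the array visits every file.
my @visited;
push @visited, $_ foreach @logs;

# Looping over the scalar visits exactly one value.
my @visited_scalar;
push @visited_scalar, $_ foreach ($only_last);

print "array loop saw ", scalar(@visited), " files\n";
print "scalar loop saw ", scalar(@visited_scalar), " file: $only_last\n";
```

Running the original script with warnings enabled would likely have hinted at this through "Useless use of a constant in void context" messages.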
 
hallikpapa (Author) Commented:
Oops, yes, that should be that way.
 
hallikpapa (Author) Commented:
The problem still remains, though.
 
ozo Commented:
where did %hRequests come from?  I don't see it defined
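
For context, a small sketch of the distinction behind this question. The script runs without `use strict`, so Perl quietly treats `$hRequests{...}` as an element of a global hash `%hRequests`, which is a different variable from the scalar `$hRequests` declared with `our`. The name `%undeclared` below is made up for the demonstration.

```perl
use strict;
use warnings;

# A scalar and a hash may share a name; they are unrelated variables.
our $hRequests = 'scalar value';
our %hRequests = (key => 'hash element');

print "scalar: $hRequests\n";         # the scalar
print "hash:   $hRequests{key}\n";    # an element of %hRequests, not of $hRequests

# Under strict, using a hash that was never declared is a compile-time error.
# The string eval exists only to capture that error instead of aborting.
my $compiled = eval q{ my $v = $undeclared{foo}; 1 };
my $error    = $@;
print "strict says: $error" unless $compiled;
```

Adding `use strict;` at the top of the original script should produce a similar "Global symbol ... requires explicit package name" message pointing at the line that uses `%hRequests`.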
 
hallikpapa (Author) Commented:
Yeah, it hasn't been defined. What I posted is the whole app. I am really rusty on my Perl and can't seem to figure out how to accomplish my goal.
 
ozo Commented:
what goal are you trying to accomplish with that line?
 
hallikpapa (Author) Commented:
I am trying to make it my key, basically a HofHofHofH. Then I can use the IP, date, and UserAgent as keys and do comparisons.

And then push the whole row into that $aRequests array like:

push @$aRequests, \%data;

Is that right?

Then I want to SOMEHOW compare on each variable to see how many matches I have at each level of the hash, something like the code below. The loop shows each comparison being done one key at a time, and I will track the number of matches at each level. I hope that part makes sense.

BUT I haven't gotten far enough to test the code below, because of the error in the original post, and because the data isn't loaded correctly, so I can't yet do comparisons between the two arrays.

If I am going down the wrong path, please let me know. The end goal is to look for matching IPs between the GET and POST arrays and, for each IP match, check whether those IPs have matching User Agents; if they DO, then finally check whether both requests happened within an hour or two of each other.





 while (my ($IP, $hUserAgents) = each(%hRequests)) {
      # next if IP is boring
      while (my ($userAgent, $hDates) = each(%$hUserAgents)) {
          # next if user agent is boring
          while (my ($date, $aRequests) = each(%$hDates)) {
             # do something if date is in range
             # wanted for $IP, $userAgent
          }
      }
   }
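
The goal described above could be sketched roughly as follows. This is not the thread's confirmed solution: the rows are hypothetical stand-ins for parsed log lines, the field names `ip`, `ua` and `time` are invented, and the helpers `apache_time_to_epoch` and `index_rows` are illustrative. It indexes rows by IP and user agent only, then compares timestamps, assuming `%t` values look like `[10/Mar/2009:13:00:00 -0700]`.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH;
@MONTH{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Convert an Apache-style timestamp to epoch seconds.
# Returns undef if the string does not look like that format.
sub apache_time_to_epoch {
    my ($stamp) = @_;
    my ($day, $mon, $year, $h, $m, $s, $sign, $oh, $om) = $stamp =~
        m{^\[?(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\]?$}
        or return;
    return unless exists $MONTH{$mon};
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    # timegm reads the fields as UTC, so remove the zone offset afterwards.
    return timegm($s, $m, $h, $day, $MONTH{$mon}, $year) - $offset;
}

# Build $index{ip}{user agent} = [ rows ... ], pushing straight into the hash.
sub index_rows {
    my %index;
    for my $row (@_) {
        push @{ $index{ $row->{ip} }{ $row->{ua} } }, $row;
    }
    return \%index;
}

# Hypothetical rows standing in for parsed GET and POST log lines.
my $gets = index_rows(
    { ip => '10.0.0.1', ua => 'Mozilla/5.0', time => '[10/Mar/2009:13:00:00 -0700]' },
);
my $posts = index_rows(
    { ip => '10.0.0.1', ua => 'Mozilla/5.0', time => '[10/Mar/2009:13:30:00 -0700]' },
    { ip => '10.0.0.1', ua => 'Mozilla/5.0', time => '[10/Mar/2009:15:30:00 -0700]' },
    { ip => '10.0.0.1', ua => 'curl/7.0',    time => '[10/Mar/2009:13:05:00 -0700]' },
    { ip => '10.0.0.2', ua => 'Mozilla/5.0', time => '[10/Mar/2009:13:05:00 -0700]' },
);

my @matches;
for my $ip (sort keys %$gets) {
    next unless exists $posts->{$ip};             # same IP?
    for my $ua (sort keys %{ $gets->{$ip} }) {
        next unless exists $posts->{$ip}{$ua};    # same user agent?
        for my $get (@{ $gets->{$ip}{$ua} }) {
            for my $post (@{ $posts->{$ip}{$ua} }) {
                my $g = apache_time_to_epoch($get->{time});
                my $p = apache_time_to_epoch($post->{time});
                next unless defined $g && defined $p;
                # Keep the pair if the two requests are at most one hour apart.
                push @matches, [ $get, $post ] if abs($g - $p) <= 3600;
            }
        }
    }
}

print "$_->[0]{ip} $_->[0]{ua}: GET $_->[0]{time}, POST $_->[1]{time}\n" for @matches;
```

Keeping the timestamp out of the hash keys is a deliberate choice in this sketch: an exact-match key suits IP and user agent, but "within an hour" is a range test, which is easier to apply to values inside the innermost array.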

 
hallikpapa (Author) Commented:
When I run the print Dumper line in the code below, it only prints one log entry. So something is not being pushed back correctly? I have changed the approach a bit: I am now only searching GET requests, and I am going to try to match on time stamp, then user agent, against a table in the DB.

Am I going about this all wrong? Again, the end result should be to search the GET requests for a time stamp that falls within one hour of any request in the DB table. When I find a match, I want to check the user agent for that GET request and see whether it also matches that same record in the DB table. If it does, yay: I am going to extract something from that GET request. If it doesn't, move on.

Please help, this is extremely frustrating.



foreach (@get_logs)
{
	@get_array = &logToHash($_);
}
 
foreach(@get_array)
	{
		print Dumper($_);	
	}
 
 
sub logToHash
{
	my $file = $_;
	my @AoH;
	open LOG, $file or die $!;
	our ($aRequests,$ip,$userAgent,$date,$hRequests,$host);
	
	while ( my $line_from_logfile = <LOG> ) 
  	{
      eval { %data = $lr->parse($line_from_logfile); };
      if (%data) 
      {
          # We have data to process
          while( my ($key, $value) = each(%data) ) 
          {
		  	if($key =~ '%{User-Agent}i\""')
		  	{
		  		$userAgent = $value;
		  	}
		  	if($key =~ '%t')
		  	{
		  		$date = $value;
		  	}
          }
          $aRequests = $hRequests{$date}{$userAgent};
   		  push @$aRequests, \%data;
   		}
  	} 
  	return @$aRequests;
}
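
One possible explanation for the single entry, sketched with made-up data rather than real log lines: `$aRequests = $hRequests{...}` copies an undefined value, `push @$aRequests` then creates a brand-new array that lives only in `$aRequests`, and the next log line starts over. Separately, pushing `\%data` stores a reference to the same shared hash every time.

```perl
use strict;
use warnings;

my @lines = ('first', 'second', 'third');    # stand-ins for parsed log lines

# 1. Pattern from the post: copy the slot, then push onto the copy.
my (%store, $copy);
for my $line (@lines) {
    $copy = $store{key};               # undef every time; nothing is stored back
    push @$copy, { line => $line };    # autovivifies an array in $copy only
}
print "copy-then-push kept ", scalar(@$copy), " row\n";

# 2. Pushing through the hash slot itself keeps every row.
for my $line (@lines) {
    push @{ $store{key} }, { line => $line };
}
print "push-into-hash kept ", scalar(@{ $store{key} }), " rows\n";

# 3. Reusing one hash: every stored reference points at the same data.
my (%data, @shared);
for my $line (@lines) {
    %data = (line => $line);
    push @shared, \%data;
}
print "shared refs all say: $shared[0]{line}\n";

# 4. A fresh lexical hash per iteration gives each row its own storage.
my @fresh;
for my $line (@lines) {
    my %row = (line => $line);
    push @fresh, \%row;
}
print "fresh refs start with: $fresh[0]{line}\n";
```

Two related details in the posted code: subroutine arguments arrive in `@_`, so `my ($file) = @_;` is the usual way to read the file name (`my $file = $_;` works only because the calling loop happens to leave the name in `$_`), and `@get_array = &logToHash($_)` replaces the previous file's results on every pass, where `push @get_array, logToHash($_)` would accumulate them.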

 
hallikpapa (Author) Commented:
Thanks for the tips. I believe I am close, but not there yet. This section of code doesn't seem to be operating as I expect.

I am using Eclipse, and even though I hit a breakpoint and can see the hash keys %{User-Agent}i\"" and %t, those if statements are never satisfied. It loops through the entire hash, but the $key never changes from %{Referer}i.

What am I doing wrong?



      if (%data) 
      {
          # We have data to process
          while( my ($key, $value) = each(%data) ) 
          {
		  	if($key =~ '%{User-Agent}i\""')
		  	{
		  		$userAgent = $value;
		  	}
		  	if($key =~ '%t')
		  	{
		  		$date = $value;
		  	}
          }
          $aRequests = $hRequests{$date}{$userAgent};
   		  push @$aRequests, \%data;
   		}
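
A sketch of an alternative to the `=~` tests, assuming the parser's keys are the bare format fields such as `%t` and `%{User-Agent}i` (the row below is made up; printing the real keys once would confirm the actual names). A single-quoted string on the right of `=~` is compiled as a pattern, so `'%{User-Agent}i\""'` can only match a key that literally ends in two double quotes, and an unanchored `'%t'` matches any key that merely contains `%t`. Looking the key up directly, or comparing with `eq`, sidesteps the pattern entirely.

```perl
use strict;
use warnings;

# Hypothetical parsed row; real key names depend on the log format string.
my %data = (
    '%h'             => '10.0.0.1',
    '%t'             => '[10/Mar/2009:13:00:00 -0700]',
    '%{Referer}i'    => '-',
    '%{User-Agent}i' => 'Mozilla/5.0',
);

# Show the keys exactly as stored, to check what the parser produced.
print "key: <$_>\n" for sort keys %data;

# Direct lookup: no loop and no pattern needed.
my $date       = $data{'%t'};
my $user_agent = $data{'%{User-Agent}i'};

# If a loop is wanted anyway, eq compares the whole string literally.
my $via_loop;
for my $key (keys %data) {
    $via_loop = $data{$key} if $key eq '%{User-Agent}i';
}

# The pattern from the post demands two trailing double quotes, so no key
# matches. (Braces are escaped here only to keep newer Perls quiet.)
my $pattern    = '%\{User-Agent\}i\""';
my @regex_hits = grep { $_ =~ $pattern } keys %data;

# If a pattern is really needed, \Q...\E escapes the metacharacters.
my $wanted      = '%{User-Agent}i';
my @quoted_hits = grep { /^\Q$wanted\E$/ } keys %data;

print "date=$date agent=$user_agent loop=$via_loop\n";
print "original-style pattern hits: ", scalar(@regex_hits), "\n";
print "quoted pattern hits: ", scalar(@quoted_hits), "\n";
```

If neither `$date` nor `$userAgent` is ever assigned, the "Use of uninitialized value in hash element" warning from the original post is the expected symptom.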

 
hallikpapa (Author) Commented:
I found my own solution.