  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 196

Pattern Match

Hi, I have a log file that I need to parse. I found this script and tweaked it a little, but I am a Perl newbie and am really stuck.

Here is my log format:

user1.domain.local - - [17/Mar/2003:08:21:16 -0500] "GET http://wisapidata.weatherbug.com/WxDataISAPI/WxDataISAPI.cgi?GetCData&Magic=10991&RegNum=3098527&ZipCode=07054&StationID=PRSPP&Units=0&Version=3.5&Fore=1&t=1047907707&lv=0 HTTP/1.1" - - "-" "Mozilla/3.0 (compatible; MSIE 4.0; Win32)"

user2.domain.local - - [17/Mar/2003:08:21:18 -0500] "GET http://news.yahoo.com/news?tmpl=index2&cid=757 HTTP/1.1" - - "http://news.yahoo.com/" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

The Script:

my $data = { };
while (<LOG>) {
        /^([\w\.]+) .*(GET|POST) (.*?) HTTP\//;
        my ($user, $url) = ($1, $3);
        $data->{$user}->{$url}++;
}

foreach my $user (keys %$data){
        print "User: $user\n\n";
        my $uref = $data->{$user};
        foreach my $url (keys %$uref) {
                print " $url (".$uref->{$url}." hits)\n";
                print "\n";
        }
}

Sample output:

User: user1.domain.local

 http://us.greet1.yimg.com/img.greetings.yahoo.com/g/img/rubber/trs_pat_ya02.gif (1 hits)
 
 http://us.i1.yimg.com/us.yimg.com/i/i16/mov_popc.gif (1 hits)
 
 http://www.ibc-uk.com/img/ILM/website.gif (1 hits)


How do I fix this to drop everything after the domain name in the report?

Thank you for your help.
Asked by: beerbar
1 Solution
 
bebonham Commented:
Hi, can you try this?
It keeps all the data but prints only the part you need.

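my $data = { };
while (<LOG>) {
       # skip lines that do not match, so $1 and $3 are never stale
       next unless /^([\w\.]+) .*(GET|POST) (.*?) HTTP\//;
       my ($user, $url) = ($1, $3);
       $data->{$user}->{$url}++;
}

foreach my $user (keys %$data){
       print "User: $user\n\n";
       my $uref = $data->{$user};
       foreach my $url (keys %$uref) {
               # first "/" after the scheme; -1 means the URL has no path
               my $end = index($url,"/",9);
               my $site = $end < 0 ? $url : substr($url,0,$end);
               print " $site (".$uref->{$url}." hits)\n";
               print "\n";
       }
}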
 
ozo Commented:
while (<LOG>) {
       next unless /^([\w\.]+) .*(GET|POST) ([^\/]*\/\/[^\/]*).* HTTP\//;
       my ($user, $url) = ($1, $3);
       $data->{$user}->{$url}++;
}
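
For anyone who wants to run the loop above end to end, here is a minimal self-contained sketch. The sub name count_sites and the two sample lines (shortened from the log format in the question) are assumptions made for illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: takes log lines, returns { user => { site => hits } }.
sub count_sites {
    my @lines = @_;
    my $data = {};
    for (@lines) {
        # $1 = client host, $3 = scheme plus host with the path dropped
        next unless /^([\w\.]+) .*(GET|POST) ([^\/]*\/\/[^\/]*).* HTTP\//;
        my ($user, $site) = ($1, $3);
        $data->{$user}->{$site}++;
    }
    return $data;
}

# Sample lines shortened from the log format shown in the question.
my @sample = (
    'user2.domain.local - - [17/Mar/2003:08:21:18 -0500] "GET http://news.yahoo.com/news?tmpl=index2&cid=757 HTTP/1.1" - - "http://news.yahoo.com/" "Mozilla/4.0"',
    'user2.domain.local - - [17/Mar/2003:08:21:19 -0500] "GET http://news.yahoo.com/other HTTP/1.1" - - "-" "Mozilla/4.0"',
);

my $data = count_sites(@sample);
foreach my $user (sort keys %$data) {
    print "User: $user\n\n";
    my $uref = $data->{$user};
    foreach my $site (sort keys %$uref) {
        print " $site ($uref->{$site} hits)\n";
    }
}
```

Because the path is dropped inside the regex, both requests are counted under the same site key, so the report shows one line per site rather than one per URL.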
 
beerbar Author Commented:
Worked like a charm, thanks! Is there a way to drop all domains that are not ours? I see that our log file contains inbound as well as outbound HTTP requests; we just need the outbound or domain.local stuff in the report.

Thank You again...
 
ozo Commented:
next unless /^([\w\.]+) .*(GET|POST) ([^\/]*\/\/[^\/]*yimg\.com).* HTTP\//;
 
beerbar Author Commented:
Sorry about the confusion. I posted more info before, but it must have gone to the bit bucket.

What I meant to say was users from our local domain.

Sample output:

User: user1.domain.local
    web site
    web site

In the output I see outside users as well as inside users. The actual URLs listed are perfect. Below is user2 making an outbound HTTP request to yahoo.com, but the server logs inbound requests as well, so the data also contains user names such as googlebot.google.com because their bot came in to our web server. I would like to show only *.domain.local or IP addresses in 192.168.1.* as users, if possible.

user2.domain.local - - [17/Mar/2003:08:21:18 -0500] "GET http://news.yahoo.com/news?tmpl=index2&cid=757 HTTP/1.1" - - "http://news.yahoo.com/" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT

Thanks again and again. Sorry about not being clear!
 
ozo Commented:
next unless /^([\w.]+domain\.local) .*(GET|POST) ([^\/]*\/\/[^\/]*).* HTTP\//;
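
The pattern above keeps only hosts ending in domain.local. The question also mentioned clients logged by address as 192.168.1.*, so here is a sketch of one way to accept both. The sub names is_local_user and count_local_sites are hypothetical, the sample lines are made up in the log format from the question, and the sketch assumes the first field of each line is either a hostname or a dotted IPv4 address.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the client field belongs to one of our own users.
sub is_local_user {
    my ($user) = @_;
    return 1 if $user =~ /\.domain\.local$/;        # any host under domain.local
    return 1 if $user =~ /^192\.168\.1\.\d{1,3}$/;  # any address in 192.168.1.*
    return 0;
}

# Hypothetical helper: same counting loop as before, with the user filter added.
sub count_local_sites {
    my @lines = @_;
    my $data = {};
    for (@lines) {
        next unless /^([\w\.]+) .*(GET|POST) ([^\/]*\/\/[^\/]*).* HTTP\//;
        my ($user, $site) = ($1, $3);
        next unless is_local_user($user);
        $data->{$user}->{$site}++;
    }
    return $data;
}

# Made-up sample lines: one local host, one outside host, one local address.
my @sample_lines = (
    'user2.domain.local - - [17/Mar/2003:08:21:18 -0500] "GET http://news.yahoo.com/news HTTP/1.1" - - "-" "UA"',
    'googlebot.google.com - - [17/Mar/2003:08:21:19 -0500] "GET http://www.domain.local/index.html HTTP/1.1" - - "-" "UA"',
    '192.168.1.25 - - [17/Mar/2003:08:21:20 -0500] "GET http://news.yahoo.com/ HTTP/1.1" - - "-" "UA"',
);

my $local = count_local_sites(@sample_lines);
foreach my $user (sort keys %$local) {
    print "User: $user\n";
    print " $_ ($local->{$user}{$_} hits)\n" for sort keys %{ $local->{$user} };
}
```

Keeping the user test in its own sub leaves the main regex unchanged and makes it easy to add another subnet or domain later. Note that the leading dot in the first pattern means a bare domain.local, or a host such as otherdomain.local, is not treated as local.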