Solved

GREP count occurrences

Posted on 2014-02-11
Last Modified: 2014-02-20
Hi,
My OS is Linux and I have a log file (aud.log) with more than 10,000 lines. Each line in the log looks like this:

02\/11\/2014 12:01:36 +0200 - SUCCESS - gmail.com  - 12:01:36 - http - Lifer -  - id=aa
02\/11\/2014 12:01:37 +0200 - SUCCESS - gmail.com  - 12:01:37 - http - Lifer -  - id=bb
02\/11\/2014 12:11:36 +0200 - FAIL - gmail.com  - 12:11:36 - http - Lifer -  - id=bb
02\/11\/2014 12:21:39 +0200 - SUCCESS - gmail.com  - 12:21:39 - http - Lifer -  - id=cc
02\/11\/2014 12:51:45 +0200 - SUCCESS - gmail.com  - 12:51:45 - http - Lifer -  - id=dd
.........................................................................................
.........................................................................................
02\/11\/2014 14:01:37 +0200 - SUCCESS - gmail.com  - 14:01:37 - http - Lifer -  - id=bb
02\/11\/2014 14:11:37 +0200 - SUCCESS - gmail.com  - 14:11:37 - http - Lifer -  - id=cc
02\/11\/2014 14:31:37 +0200 - FAIL - gmail.com  - 14:31:37 - http - Lifer -  - id=bb


I want to count the lines that contain the string "SUCCESS" and were logged between 02\/11\/2014 12:00 and 02\/11\/2014 12:30 only.

Does anyone have an idea how I can grep this file and get this number?

Thanks in advance!
Question by:ralph_rea
17 Comments
 
LVL 31

Expert Comment

by:farzanj
Try
grep -P '02\/11\/2014 12:(?:[0-2]\d|30).*SUCCESS' filename
 
LVL 21

Assisted Solution

by:Mazdajai
Mazdajai earned 100 total points
Try

grep -iPc '02\/11\/2014\s12:(?:[0-2]\d|30).+SUCCESS' aud.log
 
LVL 31

Expert Comment

by:farzanj
Yes, to count, use the -c option.

So my answer is:

grep -Pc '02\/11\/2014 12:(?:[0-2]\d|30).*SUCCESS' filename
 

Author Comment

by:ralph_rea
farzanj,
I get 0.
But let's take a practical example: suppose I want to count the lines that contain both the string "SUCCESS" and the string "gmail.com", logged from 02\/11\/2014 11:00 to 02\/11\/2014 11:10, in the aud.log file.

My query is:

grep -Pc '02\/11\/2014 11:(?:[0-2]\d|10).*SUCCESS'|grep -Pc '02\/11\/2014 11:(?:[0-2]\d|10).*gmail.com aud.log

Is it correct?
 
LVL 31

Expert Comment

by:farzanj
No. You don't need the second grep, and the second grep is grepping on a number: the count printed by the first one.

You can even leave out the c option for now and put it back once you are happy with what it matches.

Just use the first grep statement and see the results.

FYI, I tested it before pasting and got 3 results on the data you provided, so it works.
 
LVL 31

Expert Comment

by:farzanj
You have changed my grep.  It is 30 not 10.  Just copy and paste and change the filename only.

If you ever need two greps, the filename goes with the first statement.  You don't need two greps right now.

What's the filename?
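
For reference, a minimal sketch of what chaining two greps looks like when it is actually needed. The sample lines are made up in the same shape as the log quoted in the question (without backslashes), and the temporary file stands in for aud.log. The file name goes with the first grep only, and only the last grep counts.

```shell
# Hypothetical sample data shaped like the log lines in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
02/11/2014 11:01:36 +0200 - SUCCESS - gmail.com  - 11:01:36 - http - Lifer -  - id=aa
02/11/2014 11:05:10 +0200 - FAIL - gmail.com  - 11:05:10 - http - Lifer -  - id=bb
02/11/2014 11:07:02 +0200 - SUCCESS - example.org  - 11:07:02 - http - Lifer -  - id=cc
02/11/2014 11:09:59 +0200 - SUCCESS - gmail.com  - 11:09:59 - http - Lifer -  - id=dd
EOF

# The first grep reads the file and passes whole lines down the pipe.
# The second grep reads the pipe, so it takes no file name; only it uses -c.
two_greps=$(grep 'SUCCESS' "$log" | grep -c 'gmail\.com')

# The same count with a single pattern, since SUCCESS comes before the domain.
one_grep=$(grep -c 'SUCCESS.*gmail\.com' "$log")

echo "two greps: $two_greps, one grep: $one_grep"
rm -f "$log"
```

If the first grep used -c as well, the second grep would only see a single number, which is the problem with the query above.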
 

Author Comment

by:ralph_rea
The filename is aud.log.
BUT I'd like to count the lines that contain both the string "SUCCESS" and the string "gmail.com" from 11:00 AM to 11:10 AM (02\/11\/2014).

How does your grep change in this case?
 
LVL 31

Expert Comment

by:farzanj
Ok, fair enough.

Try this:
EDITED:
grep -Pc '02\/11\/2014 11:(?:0\d|10).*SUCCESS.+gmail\.com' aud.log


Try it without c as well to see the actual results being counted.
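
A small sketch of that check, assuming a log without backslashes and made-up sample lines. It uses grep -E with [0-9] instead of -P with \d so it also runs where PCRE support is not compiled in; the pattern is otherwise the same idea as the command above.

```shell
# Hypothetical sample data; the temporary file stands in for aud.log.
log=$(mktemp)
cat > "$log" <<'EOF'
02/11/2014 11:01:36 +0200 - SUCCESS - gmail.com  - 11:01:36 - http - Lifer -  - id=aa
02/11/2014 11:05:10 +0200 - FAIL - gmail.com  - 11:05:10 - http - Lifer -  - id=bb
02/11/2014 11:10:45 +0200 - SUCCESS - gmail.com  - 11:10:45 - http - Lifer -  - id=cc
02/11/2014 11:11:20 +0200 - SUCCESS - gmail.com  - 11:11:20 - http - Lifer -  - id=dd
EOF

# ERE version of the pattern: minutes 00-09 or 10, then SUCCESS, then gmail.com.
pattern='02/11/2014 11:(0[0-9]|10).*SUCCESS.+gmail\.com'

# Without -c: print the matching lines so they can be checked by eye.
matched=$(grep -E "$pattern" "$log")
printf '%s\n' "$matched"

# With -c: print only the number of matching lines.
count=$(grep -cE "$pattern" "$log")
echo "count: $count"
rm -f "$log"
```

Here the 11:01 and 11:10 lines match; the FAIL line and the 11:11 line do not.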

 
LVL 31

Expert Comment

by:farzanj
Also, while testing, I had removed the backslashes from the log file itself.

So the log was like

02/11/2014 12:01:36 +0200 - SUCCESS - gmail.com  - 12:01:36 - http - Lifer -  - id=aa
02/11/2014 12:01:37 +0200 - SUCCESS - gmail.com  - 12:01:37 - http - Lifer -  - id=bb
02/11/2014 12:11:36 +0200 - FAIL - gmail.com  - 12:11:36 - http - Lifer -  - id=bb
02/11/2014 12:21:39 +0200 - SUCCESS - gmail.com  - 12:21:39 - http - Lifer -  - id=cc
02/11/2014 12:51:45 +0200 - SUCCESS - gmail.com  - 12:51:45 - http - Lifer -  - id=dd
.........................................................................................
.........................................................................................
02/11/2014 14:01:37 +0200 - SUCCESS - gmail.com  - 14:01:37 - http - Lifer -  - id=bb
02/11/2014 14:11:37 +0200 - SUCCESS - gmail.com  - 14:11:37 - http - Lifer -  - id=cc
02/11/2014 14:31:37 +0200 - FAIL - gmail.com  - 14:31:37 - http - Lifer -  - id=bb

 

Author Comment

by:ralph_rea
farzanj,
Something is wrong: I always get 0.

My aud.log file is attached.

grep -Pc '02\/11\/2014 12:(?:[0-2]\d|30).*SUCCESS' aud.log
0

Even if I grep only for the date, I get zero:
grep -Pc '02\/11\/2014'  aud.log
0

It seems the date format is incorrect.

What am I doing wrong?
 
LVL 31

Accepted Solution

by:farzanj
farzanj earned 400 total points
Try now:

grep -Pc '02\\\/11\\\/2014 11:(?:0\d|10).*SUCCESS.+gmail\.com' aud.log



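Reason: the dates in your log are written as 02\/11\/2014, not 02/11/2014. Each slash is preceded by a literal backslash, so the pattern needs \\ to match that backslash, followed by \/ for the slash.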
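
To see the effect in isolation, here is a minimal sketch with two made-up lines that keep the backslashes, as in the attached log. It uses plain grep and grep -F rather than -P; the temporary file stands in for aud.log.

```shell
log=$(mktemp)
# The quoted heredoc delimiter keeps the backslashes exactly as typed.
cat > "$log" <<'EOF'
02\/11\/2014 11:01:36 +0200 - SUCCESS - gmail.com  - 11:01:36 - http - Lifer -  - id=aa
02\/11\/2014 11:10:02 +0200 - SUCCESS - gmail.com  - 11:10:02 - http - Lifer -  - id=bb
EOF

# No match: the pattern has "/" where the file has "\/".
# grep exits with status 1 when nothing matches, hence the "|| true".
plain=$(grep -c '02/11/2014' "$log" || true)

# "\\" in the pattern matches one literal backslash in the file.
escaped=$(grep -c '02\\/11\\/2014' "$log")

# -F treats the pattern as a fixed string, so no escaping is needed at all.
fixed=$(grep -cF '02\/11\/2014' "$log")

echo "plain: $plain, escaped: $escaped, fixed: $fixed"
rm -f "$log"
```

The -F form is handy for checking what is really in the file, but it cannot express the minute ranges, so the escaped regex is still needed for the actual count.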
 

Author Comment

by:ralph_rea
Ok,
the correct format is below:

grep -Pc '02\\/11\\/2014 12:(?:[0-2]\d|30).*SUCCESS' aud.log
 
LVL 31

Expert Comment

by:farzanj
Just copy and paste my command above.
 

Author Comment

by:ralph_rea
Why is the format for 11:00 to 11:10

11:(?:0\d|10)

while the format for 12:00 to 12:30 is

12:(?:[0-2]\d|30)

What is the difference between ?:0 and ?:[0-2]?

Thanks!
 
LVL 31

Expert Comment

by:farzanj
These are regular expressions.

(?: and ) form a non-capturing group: they group the alternatives without capturing the match.

0\d means 0 followed by any digit, which covers 00 through 09.
So I am saying 00 through 09 OR 10.

[0-2] means 0 or 1 or 2.
So [0-2]\d says the first digit is 0, 1 or 2 and the second digit is anything, which covers 00 through 29.

And then I am saying OR 30, because you don't want 31 or 32 ...
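
One way to convince yourself is to run both alternations against every minute of the hour and count what matches. This is only a sketch: it uses grep -E with [0-9] in place of -P with \d and plain ( ) in place of (?: ), which behave the same for counting, and it relies on seq being available.

```shell
# Generate one line per minute for the 11 o'clock and 12 o'clock hours.
minutes=$(for m in $(seq 0 59); do printf '11:%02d\n12:%02d\n' "$m" "$m"; done)

# 0[0-9]|10 covers 11:00 through 11:10, which is 11 distinct minutes.
first=$(printf '%s\n' "$minutes" | grep -cE '^11:(0[0-9]|10)$')

# [0-2][0-9]|30 covers 12:00 through 12:30, which is 31 distinct minutes.
second=$(printf '%s\n' "$minutes" | grep -cE '^12:([0-2][0-9]|30)$')

echo "11:00-11:10 -> $first minutes, 12:00-12:30 -> $second minutes"
```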
 

Author Comment

by:ralph_rea
I do not know if I should ask another question, but I also need to complete the following greps on my log file:

I want to count the lines containing the string "SUCCESS" logged from:
 02\/11\/2014 12:00 to 02\/11\/2014 19:00

 02\/11\/2014 16:02 to the end of the file

I am having difficulty writing these greps. Can you help?

I can also open a new question.

Thanks!
 
LVL 21

Expert Comment

by:Mazdajai
You should open a new question, this is a closed one.
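
The follow-up was not answered in this thread. As a rough sketch only, under several assumptions: the sample lines are made up, the log keeps its backslashes, the status is always the fifth whitespace-separated field, "to 19:00" includes the whole 19:00 minute, and the log is in chronological order. The first range can still be done with a grep alternation; awk is swapped in for the open-ended range because it can compare the HH:MM:SS field as a string.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
02\/11\/2014 11:59:59 +0200 - SUCCESS - gmail.com  - 11:59:59 - http - Lifer -  - id=aa
02\/11\/2014 12:00:00 +0200 - SUCCESS - gmail.com  - 12:00:00 - http - Lifer -  - id=bb
02\/11\/2014 16:01:10 +0200 - FAIL - gmail.com  - 16:01:10 - http - Lifer -  - id=cc
02\/11\/2014 16:02:00 +0200 - SUCCESS - gmail.com  - 16:02:00 - http - Lifer -  - id=dd
02\/11\/2014 19:00:30 +0200 - SUCCESS - gmail.com  - 19:00:30 - http - Lifer -  - id=ee
02\/11\/2014 19:01:00 +0200 - SUCCESS - gmail.com  - 19:01:00 - http - Lifer -  - id=ff
02\/12\/2014 00:05:00 +0200 - SUCCESS - gmail.com  - 00:05:00 - http - Lifer -  - id=gg
EOF

# 12:00 to 19:00 with grep: hours 12-18 with any minute, or exactly 19:00.
grep_range=$(grep -cE '^02\\/11\\/2014 (1[2-8]:[0-5][0-9]|19:00).*SUCCESS' "$log")

# Same range with awk: $1 is the date, $2 the time, $5 the status.
# DAY is read through ENVIRON so awk does not reinterpret the backslashes.
awk_range=$(DAY='02\/11\/2014' awk '
  $1 == ENVIRON["DAY"] && $5 == "SUCCESS" && $2 >= "12:00:00" && $2 <= "19:00:59" { n++ }
  END { print n + 0 }' "$log")

# 16:02 to end of file: start at the first line of that day at or after
# 16:02:00, then count every SUCCESS line until the file ends.
tail_count=$(DAY='02\/11\/2014' awk '
  !started && $1 == ENVIRON["DAY"] && $2 >= "16:02:00" { started = 1 }
  started && $5 == "SUCCESS" { n++ }
  END { print n + 0 }' "$log")

echo "12:00-19:00: $grep_range (grep), $awk_range (awk); 16:02-EOF: $tail_count"
rm -f "$log"
```

The string comparison on the time field only works because the timestamps are zero-padded 24-hour values, so they sort the same way as text and as times.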