Link to home
Start Free TrialLog in
Avatar of ralph_rea
ralph_rea

asked on

GREP count occurrences

Hi,
my O.S. is Linux and I have a log file (aud.log), each line in the log is like this (I've more 10000 lines):

02\/11\/2014 12:01:36 +0200 - SUCCESS - gmail.com  - 12:01:36 - http - Lifer -  - id=aa
02\/11\/2014 12:01:37 +0200 - SUCCESS - gmail.com  - 12:01:37 - http - Lifer -  - id=bb
02\/11\/2014 12:11:36 +0200 - FAIL - gmail.com  - 12:11:36 - http - Lifer -  - id=bb
02\/11\/2014 12:21:39 +0200 - SUCCESS - gmail.com  - 12:21:39 - http - Lifer -  - id=cc
02\/11\/2014 12:51:45 +0200 - SUCCESS - gmail.com  - 12:51:45 - http - Lifer -  - id=dd
.........................................................................................
.........................................................................................
02\/11\/2014 14:01:37 +0200 - SUCCESS - gmail.com  - 14:01:37 - http - Lifer -  - id=bb
02\/11\/2014 14:11:37 +0200 - SUCCESS - gmail.com  - 14:11:37 - http - Lifer -  - id=cc
02\/11\/2014 14:31:37 +0200 - FAIL - gmail.com  - 14:31:37 - http - Lifer -  - id=bb

Open in new window


I want to count the number of occurrences with string "SUCCESS" happening from 02\/11\/2014 12:00 to 02\/11\/2014 12:30 only

Have someone any idea How do I grep this file and get this number?

Thanks in advance!
Avatar of farzanj
farzanj
Flag of Canada image

Try
grep -P '02\/11\/2014 (?:12:[0-2]\d|30).*SUCCESS' filename

Open in new window

SOLUTION
Avatar of Mazdajai
Mazdajai
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Yes, to count, use option c

SO my answer is:

grep -Pc '02\/11\/2014 12:(?:[0-2]\d|30).*SUCCESS' filename

Open in new window

Avatar of ralph_rea
ralph_rea

ASKER

farzanj,
I get 0
but let's take a practical example, suppose that I want to count the number of occurrences with string "SUCCESS" and string "gmail.com" happening from 02\/11\/2014 11:00 to 02\/11\/2014 11:10 in the aud.log file:

my query is:

grep -Pc '02\/11\/2014 11:(?:[0-2]\d|10).*SUCCESS'|grep -Pc '02\/11\/2014 11:(?:[0-2]\d|10).*gmail.com aud.log

It's correct?
No.  Because you don't need the second grep and the second grep is grepping on a number

Just even remove c, you can put is when you are happy with what it greps.


Just use the first grep statement and see the results.

FYI, I had tested it before pasting, and I got 3 results in the data that you have provided, so it works.
You have changed my grep.  It is 30 not 10.  Just copy and paste and change the filename only.

If you ever need two greps, the filename goes with the first statement.  You don't need two greps right now.

What's the filename?
filename is aud.log
BUT I'd like to try to count the number of occurrences with string "SUCCESS" and string "gmail.com" from 11:00 AM to 11:10 AM  (02\/11\/2014).

In this case how it changes your grep?
Ok, fair enough.

Try this:
EDITED:
grep -Pc '02\/11\/2014 11:(?:0\d|10).*SUCCESS.+gmail\.com' aud.log

Open in new window


Try it without c as well to see the actual results being counted.
Also, while testing, I had removed the backslashes from the log file itself.

So the log was like

02/11/2014 12:01:36 +0200 - SUCCESS - gmail.com  - 12:01:36 - http - Lifer -  - id=aa
02/11/2014 12:01:37 +0200 - SUCCESS - gmail.com  - 12:01:37 - http - Lifer -  - id=bb
02/11/2014 12:11:36 +0200 - FAIL - gmail.com  - 12:11:36 - http - Lifer -  - id=bb
02/11/2014 12:21:39 +0200 - SUCCESS - gmail.com  - 12:21:39 - http - Lifer -  - id=cc
02/11/2014 12:51:45 +0200 - SUCCESS - gmail.com  - 12:51:45 - http - Lifer -  - id=dd
.........................................................................................
.........................................................................................
02/11/2014 14:01:37 +0200 - SUCCESS - gmail.com  - 14:01:37 - http - Lifer -  - id=bb
02/11/2014 14:11:37 +0200 - SUCCESS - gmail.com  - 14:11:37 - http - Lifer -  - id=cc
02/11/2014 14:31:37 +0200 - FAIL - gmail.com  - 14:31:37 - http - Lifer -  - id=bb

Open in new window

farzanj,
Something wrong I get always 0

In attach my aud.log file:

grep -Pc '02\/11\/2014 12:(?:[0-2]\d|30).*SUCCESS' aud.log
0

Open in new window

even if I grep only data I get zero:
grep -Pc '02\/11\/2014'  aud.log
0

Open in new window


seems like the date format is incorrect

What I wrong?
aud.log
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Ok,
below correct format:

grep -Pc '02\\/11\\/2014 12:(?:[0-2]\d|30).*SUCCESS'
Just copy and paste my command above
why from 11:00 to 11:10 the format is:

11:(?:0\d|10)

while from 12:00 to 12:30 the format is:
12:(?:[0-2]\d|30)

what's the difference between ?:0  and ?:[0-2]

Thanks!
It is regular expressions

(?:     and   )  pairs without matching

0\d  means 0 followed by any digit, so that takes care of 01-09
SO I am saying 01 through 09 OR 10

[0-2] means 0 or 1 or 2
So, I am saying first digit is 0 or 1 or 2 with second digit as anything, that should take care of 01 through 29

And then I am saying OR 30, because you don't want 31 or 32 ...
I do not know if I should ask another question, but I also need to complete the following grep on my log file:

I want to count the number of occurrences with string "SUCCESS" happening from:
 02\/11\/2014 12:00 to 02\/11\/2014 19:00

 02\/11\/2014 16:02 to end of the file

I am having difficulty writing these grep, can you help?

I can also open a new question.

Thanks!
You should open a new question, this is a closed one.