Solved

Pattern matching question

Posted on 2013-05-20
Last Modified: 2013-05-24
I have the following log files. There are other dates as well, but I am interested in the files from May 17-19, i.e. error.log.33.gz through error.log.104.gz.

-rw-r--r--  1 nobody wheel   4261062 May 19 23:00 error.log.33.gz
-rw-r--r--  1 nobody wheel   3893695 May 19 22:00 error.log.34.gz
-rw-r--r--  1 nobody wheel   3626056 May 19 21:00 error.log.35.gz
-rw-r--r--  1 nobody wheel   3524895 May 19 20:00 error.log.36.gz
-rw-r--r--  1 nobody wheel   3524926 May 19 19:00 error.log.37.gz
-rw-r--r--  1 nobody wheel   3655532 May 19 18:00 error.log.38.gz
-rw-r--r--  1 nobody wheel   3894267 May 19 17:00 error.log.39.gz
-rw-r--r--  1 nobody wheel   4283616 May 19 16:00 error.log.40.gz
-rw-r--r--  1 nobody wheel   4841395 May 19 15:00 error.log.41.gz
-rw-r--r--  1 nobody wheel   5451411 May 19 14:00 error.log.42.gz
-rw-r--r--  1 nobody wheel   5874924 May 19 13:00 error.log.43.gz
-rw-r--r--  1 nobody wheel   6093528 May 19 12:00 error.log.44.gz
-rw-r--r--  1 nobody wheel   5985140 May 19 11:00 error.log.45.gz
-rw-r--r--  1 nobody wheel   5925186 May 19 10:00 error.log.46.gz
-rw-r--r--  1 nobody wheel   5764293 May 19 09:00 error.log.47.gz
-rw-r--r--  1 nobody wheel   5564082 May 19 08:00 error.log.48.gz
-rw-r--r--  1 nobody wheel   5408408 May 19 07:00 error.log.49.gz
-rw-r--r--  1 nobody wheel   5299642 May 19 06:00 error.log.50.gz
-rw-r--r--  1 nobody wheel   5238462 May 19 05:00 error.log.51.gz
-rw-r--r--  1 nobody wheel   5142348 May 19 04:00 error.log.52.gz
-rw-r--r--  1 nobody wheel   5035720 May 19 03:00 error.log.53.gz
-rw-r--r--  1 nobody wheel   4925860 May 19 02:00 error.log.54.gz
-rw-r--r--  1 nobody wheel   4693739 May 19 01:00 error.log.55.gz
-rw-r--r--  1 nobody wheel   4409631 May 19 00:00 error.log.56.gz
-rw-r--r--  1 nobody wheel   4041494 May 18 23:00 error.log.57.gz
-rw-r--r--  1 nobody wheel   3661199 May 18 22:00 error.log.58.gz
-rw-r--r--  1 nobody wheel   3458227 May 18 21:00 error.log.59.gz
-rw-r--r--  1 nobody wheel   3422274 May 18 20:00 error.log.60.gz
-rw-r--r--  1 nobody wheel   3482197 May 18 19:00 error.log.61.gz
-rw-r--r--  1 nobody wheel   3642665 May 18 18:00 error.log.62.gz
-rw-r--r--  1 nobody wheel   3888597 May 18 17:00 error.log.63.gz
-rw-r--r--  1 nobody wheel   4291573 May 18 16:00 error.log.64.gz
-rw-r--r--  1 nobody wheel   4715398 May 18 15:00 error.log.65.gz
-rw-r--r--  1 nobody wheel   5092642 May 18 14:00 error.log.66.gz
-rw-r--r--  1 nobody wheel   5297790 May 18 13:00 error.log.67.gz
-rw-r--r--  1 nobody wheel   5417420 May 18 12:00 error.log.68.gz
-rw-r--r--  1 nobody wheel   5414351 May 18 11:00 error.log.69.gz
-rw-r--r--  1 nobody wheel   5466902 May 18 10:00 error.log.70.gz
-rw-r--r--  1 nobody wheel   5478482 May 18 09:00 error.log.71.gz
-rw-r--r--  1 nobody wheel   5339512 May 18 08:00 error.log.72.gz
-rw-r--r--  1 nobody wheel   5234666 May 18 07:00 error.log.73.gz
-rw-r--r--  1 nobody wheel   5125385 May 18 06:00 error.log.74.gz
-rw-r--r--  1 nobody wheel   5022500 May 18 05:00 error.log.75.gz
-rw-r--r--  1 nobody wheel   5038367 May 18 04:00 error.log.76.gz
-rw-r--r--  1 nobody wheel   4931596 May 18 03:00 error.log.77.gz
-rw-r--r--  1 nobody wheel   5380413 May 18 02:00 error.log.78.gz
-rw-r--r--  1 nobody wheel   6081307 May 18 01:00 error.log.79.gz
-rw-r--r--  1 nobody wheel   5851468 May 18 00:00 error.log.80.gz
-rw-r--r--  1 nobody wheel   5253538 May 17 23:00 error.log.81.gz
-rw-r--r--  1 nobody wheel   4708359 May 17 22:00 error.log.82.gz
-rw-r--r--  1 nobody wheel   4389486 May 17 21:00 error.log.83.gz
-rw-r--r--  1 nobody wheel   4318955 May 17 20:00 error.log.84.gz
-rw-r--r--  1 nobody wheel   4369393 May 17 19:00 error.log.85.gz
-rw-r--r--  1 nobody wheel   4620540 May 17 18:00 error.log.86.gz
-rw-r--r--  1 nobody wheel   4999584 May 17 17:00 error.log.87.gz
-rw-r--r--  1 nobody wheel   5643021 May 17 16:00 error.log.88.gz
-rw-r--r--  1 nobody wheel   6418608 May 17 15:00 error.log.89.gz
-rw-r--r--  1 nobody wheel   7190572 May 17 14:00 error.log.90.gz
-rw-r--r--  1 nobody wheel   7741545 May 17 13:00 error.log.91.gz
-rw-r--r--  1 nobody wheel   7919648 May 17 12:00 error.log.92.gz
-rw-r--r--  1 nobody wheel   8108684 May 17 11:00 error.log.93.gz
-rw-r--r--  1 nobody wheel   8030492 May 17 10:00 error.log.94.gz
-rw-r--r--  1 nobody wheel   7982610 May 17 09:00 error.log.95.gz
-rw-r--r--  1 nobody wheel   7854057 May 17 08:00 error.log.96.gz
-rw-r--r--  1 nobody wheel   7594561 May 17 07:00 error.log.97.gz
-rw-r--r--  1 nobody wheel   7493339 May 17 06:00 error.log.98.gz
-rw-r--r--  1 nobody wheel   7362408 May 17 05:00 error.log.99.gz
-rw-r--r--  1 nobody wheel   7403433 May 17 04:00 error.log.100.gz
-rw-r--r--  1 nobody wheel   7232114 May 17 03:00 error.log.101.gz
-rw-r--r--  1 nobody wheel   7060528 May 17 02:00 error.log.102.gz
-rw-r--r--  1 nobody wheel   6657271 May 17 01:00 error.log.103.gz
-rw-r--r--  1 nobody wheel   6163487 May 17 00:00 error.log.104.gz

In these files, there are lines like the ones below.

I am running "sudo zgrep '#C' *", which prints:

2013/05/21 06:04:09 [info] 5336#0: *141932476 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C28)" while in http auth state, client: 49.999.86.75, server: 27.193.196.35:993, login: "edfg@gmail.com"^M
2013/05/21 06:04:12 [info] 5338#0: *141932859 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C6)" while in http auth state, client: 202.90.105.118, server: 124.10.96.239:993, login: "abcd@man.com"^M

What I want to do is extract only the lines that contain #C6 or #C7 and pull the login out of each. So, for the example above, I want the unique logins and their yids, each with a count:


abcd@man.com (number of times it was repeated in the logs)

Is there an easy way to do this? This is really urgent.
Question by:Vlearns
14 Comments
 

Author Comment

by:Vlearns
How do I run zgrep on that set of files, selected by date or by index, and extract the unique logins of the users that have #C6 or #C7?

Thanks
 
LVL 84

Assisted Solution

by:ozo
ozo earned 150 total points
sudo zgrep  '#C[67]' * | perl -ne '++$n{(/login: "(.*?)"/)[0]}; END{ print "$n{$_}\t$_\n" for sort keys %n}'
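
One caveat with a pattern like '#C[67]': it is not anchored on the right, so it would also match longer codes such as #C60 or #C71 if they ever occur (the sample output does contain a two-digit code, #C28). Below is a minimal sketch of a stricter pattern, assuming the code is always followed by a closing parenthesis as in the sample lines. The (#C61) line is made up purely for illustration.

```shell
# Sample messages; the (#C61) entry is hypothetical, added only to show the over-match.
sample='please try again.(#C6)" while in http auth state
please try again.(#C7)" while in http auth state
please try again.(#C28)" while in http auth state
please try again.(#C61)" while in http auth state'

# Unanchored pattern: counts the #C6 and #C7 lines, and also the #C61 line.
loose=$(printf '%s\n' "$sample" | grep -c '#C[67]')

# Anchored on the closing parenthesis: counts only the #C6 and #C7 lines.
# In a basic regular expression a lone ")" is an ordinary character.
strict=$(printf '%s\n' "$sample" | grep -c '#C[67])')

echo "loose=$loose strict=$strict"
```

If the logs can contain such longer codes, the anchored pattern could be passed to zgrep in place of '#C[67]'.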
 

Author Comment

by:Vlearns
sudo find error* -type f -newer error.log.104.gz  -not -newer error.log.33.gz

This gives me the list of files in the range.
 

Author Comment

by:Vlearns
Hi Ozo

Thanks for your help! Much appreciated

sudo zgrep  '#C[67]' * | perl -ne '++$n{(/login: "(.*?)"/)[0]}; END{ print "$n{$_}\t$_\n" for sort keys %n}'

This command looks for #C6 or #C7 in the lines, right? It does not select the files by date, so it would have to be run on a subset of the files, right? I want to run it on the files from May 17 to May 19.
 
LVL 68

Accepted Solution

by:woolmilkporc
woolmilkporc earned 350 total points
sudo zgrep -E '#C6|#C7' $(sudo find error* -type f -newer error.log.104.gz  -not -newer error.log.33.gz) |awk -F'"' '{print $4}' | sort | uniq -c |awk '{print $2, "(" $1 ")"}'
 

Author Comment

by:Vlearns
Thanks woolmilkporc

Can you explain this statement a bit?

 sudo zgrep -E '#C6|#C7' $(sudo find error* -type f -newer error.log.104.gz  -not -newer error.log.33.gz) |awk -F'"' '{print $4}' | sort | uniq -c
 

Author Comment

by:Vlearns
Can I do this based on the dates of the files instead of the file names?
 
LVL 68

Expert Comment

by:woolmilkporc
We use zgrep with the "-E" flag so that it understands the extended regular expression '#C6|#C7', which means "#C6 or #C7".

We run this zgrep against the list of files resulting from the command you posted, using the command substitution notation "$( ... )".

The search results are then piped to awk. We use the field delimiter " (assuming that the log file records all have the same format) to extract the 4th field, which is the login.

The resulting strings get sorted, made unique and counted ("sort", "uniq -c").

Since the output format of "uniq -c" is "number string", we pipe the results to awk again to interchange fields 1 and 2 and to put the original field 1 (the count, now printed second) in parentheses.
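
To see the stages described above on a small input, here is a sketch that runs without the real logs. Plain grep on a temporary text file stands in for zgrep, and the four sample records only imitate the format shown in the question: the file-name prefixes, IP addresses and the wxyz@example.com login are made up.

```shell
# Temporary file standing in for the zgrep output over several files.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
error.log.33.gz:2013/05/19 06:04:09 [info] 5336#0: *1 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C28)" while in http auth state, client: 192.0.2.1, server: 192.0.2.9:993, login: "edfg@gmail.com"
error.log.33.gz:2013/05/19 06:04:12 [info] 5338#0: *2 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C6)" while in http auth state, client: 192.0.2.2, server: 192.0.2.9:993, login: "abcd@man.com"
error.log.34.gz:2013/05/19 05:10:40 [info] 5338#0: *3 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C7)" while in http auth state, client: 192.0.2.2, server: 192.0.2.9:993, login: "abcd@man.com"
error.log.34.gz:2013/05/19 05:11:02 [info] 5336#0: *4 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C6)" while in http auth state, client: 192.0.2.3, server: 192.0.2.9:993, login: "wxyz@example.com"
EOF

# Stage 1: keep only the C6 and C7 records.
# Stage 2: split on double quotes and print the 4th field, the login.
# Stage 3: sort and count each distinct login.
# Stage 4: swap the columns and wrap the count in parentheses.
result=$(grep -E '#C6|#C7' "$tmp" |
  awk -F'"' '{print $4}' |
  sort | uniq -c |
  awk '{print $2, "(" $1 ")"}')

printf '%s\n' "$result"
rm -f "$tmp"
```

The #C28 record is dropped in the first stage, so only abcd@man.com and wxyz@example.com appear in the output.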
 
LVL 68

Expert Comment

by:woolmilkporc
As for the file dates - yes, instead of using existing files as a reference for "find" you can create your own reference files by means of "touch".

touch -t 201305170000 /tmp/from
touch -t 201305192359 /tmp/to

sudo zgrep -E '#C6|#C7' $(sudo find error* -type f -newer /tmp/from  -not -newer /tmp/to) |awk -F'"' '{print $4}' | sort | uniq -c |awk '{print $2, "(" $1 ")"}'

rm /tmp/from /tmp/to
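
One detail worth checking with either kind of reference file: "-newer" is a strict comparison, so a file whose modification time equals that of the reference is not selected. With "-newer error.log.104.gz" that means error.log.104.gz itself is left out. The sketch below illustrates this in a throwaway directory; all file names and the one-second offsets are assumptions for the demo, and "!" is used as the portable spelling of "-not".

```shell
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d)

# Hypothetical files with fixed modification times (CCYYMMDDhhmm).
touch -t 201305162300 "$dir/old.gz"     # before the window
touch -t 201305170000 "$dir/first.gz"   # exactly at the start of the window
touch -t 201305192300 "$dir/last.gz"    # inside the window
touch -t 201305200000 "$dir/after.gz"   # after the window

# Reference files: one second before the window opens, and its last second.
touch -t 201305162359.59 "$dir/from"
touch -t 201305192359.59 "$dir/to"

# Using the oldest wanted file as the lower bound skips that file itself.
strict=$(cd "$dir" && find . -name '*.gz' -type f -newer first.gz ! -newer last.gz | sort)

# Using a reference stamped just before the window includes it.
picked=$(cd "$dir" && find . -name '*.gz' -type f -newer from ! -newer to | sort)

echo "lower bound is a log file: $strict"
echo "lower bound is a reference file:"
echo "$picked"

rm -rf "$dir"
```

Whether the boundary file matters depends on how the rotation timestamps line up with the hours you actually want, so it is worth listing the output of find before feeding it to zgrep.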
 

Author Comment

by:Vlearns
Hi woolmilkporc

2013/05/21 07:31:18 [info] 5324#0: *142547634 client logged in, client: 49.186.67.66, server: 124.118.66.239:993, login: "yusufakay2000@gmail.com", upstream: 18.239.211.119:5019

I ran the same script on the same set of files, but it did not print any logins for lines like this one.
The first set of users are the failures (#C6/#C7); this line tracks successes.
I need to print failures to total successes.
 

Author Comment

by:Vlearns
sudo zgrep -E 'client logged in' $(sudo find error* -type f -newer error.log.104.gz  -not -newer error.log.33.gz) |awk -F'"' '{print $2}' | sort | uniq -c
 
LVL 84

Expert Comment

by:ozo
"I need to print failures to total successes"
Can you show examples of lines with failure, lines with success, and the desired result in that case?
 
LVL 68

Expert Comment

by:woolmilkporc
@Vlearns: Your last post contains the solution to counting and displaying successes.
Do you need more assistance?

Perhaps sorting?

( sudo zgrep -E '#C6|#C7' $(sudo find error* -type f -newer /tmp/from  -not -newer /tmp/to) |awk -F'"' '{print $4}' | sort | uniq -c |awk '{print $2, "FAILURE", "(" $1 ")"}'
sudo zgrep -E 'client logged in' $(sudo find error* -type f -newer /tmp/from  -not -newer /tmp/to) |awk -F'"' '{print $2}' | sort | uniq -c |awk '{print $2,  "SUCCESS", "(" $1 ")"}'  ) | sort
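
If one line per login with both counts is preferred, the two pipelines could be folded into a single awk pass. This is only a sketch of that alternative, not what was posted in the thread: it assumes the two record formats shown earlier (login in the 4th quote-delimited field for failures, in the 2nd for successes), and the sample records, IP addresses and user2@example.com are made up. In real use the input would come from zgrep rather than a temporary file.

```shell
# Temporary file standing in for the combined zgrep output.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
error.log.33.gz:2013/05/19 06:04:12 [info] 5338#0: *1 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C6)" while in http auth state, client: 192.0.2.2, server: 192.0.2.9:993, login: "abcd@man.com"
error.log.33.gz:2013/05/19 06:05:30 [info] 5338#0: *2 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C7)" while in http auth state, client: 192.0.2.2, server: 192.0.2.9:993, login: "abcd@man.com"
error.log.33.gz:2013/05/19 06:06:01 [info] 5336#0: *3 client login failed: "[UNAVAILABLE] Service UNAVAILABLE; please try again.(#C28)" while in http auth state, client: 192.0.2.1, server: 192.0.2.9:993, login: "edfg@gmail.com"
error.log.34.gz:2013/05/19 05:31:18 [info] 5324#0: *4 client logged in, client: 192.0.2.2, server: 192.0.2.9:993, login: "abcd@man.com", upstream: 192.0.2.20:5019
error.log.34.gz:2013/05/19 05:32:44 [info] 5324#0: *5 client logged in, client: 192.0.2.4, server: 192.0.2.9:993, login: "user2@example.com", upstream: 192.0.2.20:5019
EOF

# Count C6 and C7 failures and successful logins per login in one pass.
# Logins that only fail with other codes are not reported.
report=$(awk -F'"' '
  /client login failed/ && /#C[67]\)/ { fail[$4]++; seen[$4] = 1 }
  /client logged in/                  { ok[$2]++;   seen[$2] = 1 }
  END {
    for (login in seen)
      printf "%s FAILURE (%d) SUCCESS (%d)\n", login, fail[login], ok[login]
  }' "$tmp" | sort)

printf '%s\n' "$report"
rm -f "$tmp"
```

A login that never failed with #C6 or #C7 is shown with a failure count of 0, which makes it easier to compare the two numbers per user.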
 

Author Comment

by:Vlearns
Thanks Much!