bash log filter

jculkincys asked
Medium Priority · 681 Views · Last Modified: 2012-08-14
Hello

I need help with a bash script that will concatenate 3 log files and bring back some useful information.

Here is a sample of the data that is in the files

2006-03-28 13:47:07,669: user login: test
2006-03-28 14:03:06,156: user timeout: johnsonr4
2006-03-28 14:03:06,314: user logout: johnsonr2
2006-03-28 14:10:53,206: user login: jonesg4
2006-03-28 14:10:57,817: user login: smithf3

Anything after the comma in the time is milliseconds and can be ignored.

I want to get the total number of logins for a day, and also the total number of unique logins for a day.

Any tips on how to do this or get started would be appreciated.
The first thing I need to find out is how to import this data from .log files and how to go about manipulating it.

Thanks
jculkincys


XoF

Commented:
Below is a code snippet which does what you want. Just call the script with the date as the first parameter and the filename to parse as the second.
The sed expression is more complex than needed for your requirements and can be used for further investigation. Drop the "| wc -l" at the end, for example, and you will get a complete listing.

<code>
#!/bin/sh
day=$1
logfile=$2

logins=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user login: \(.*\)/\1 \2 \3/p' $logfile | sort -k 3,3 | wc -l`
ulogins=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user login: \(.*\)/\1 \2 \3/p' $logfile | sort -k 3,3 -u | wc -l`

cat << EOF
Statistics for $day:
----------------------

Total logins:   $logins
Unique logins: $ulogins
EOF

</code>
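
For example, with the sample data above, dropping the "| wc -l" from the logins line should print the matched logins sorted by username, roughly:

<code>
2006-03-28 14:10:53 jonesg4
2006-03-28 14:10:57 smithf3
2006-03-28 13:47:07 test
</code>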

HTH,
-XoF-
XoF

Commented:
Oops, I forgot the concatenation:

<code>
#!/bin/sh
day=$1
shift
logfiles=$@

logins=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user login: \(.*\)/\1 \2 \3/p' $logfiles | sort -k 3,3 | wc -l`
ulogins=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user login: \(.*\)/\1 \2 \3/p' $logfiles | sort -k 3,3 -u | wc -l`

cat << EOF
Statistics for $day:
----------------------

Total logins:   $logins
Unique logins: $ulogins
EOF

</code>

Call the script as:
./script <day> <file1> <file2> ... <file n>
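
For instance, given the sample data above in a file session.log (the file name is just an example), the invocation should report three total and three unique logins, since all three "user login" entries name distinct users:

<code>
$ ./script 2006-03-28 session.log
Statistics for 2006-03-28:
----------------------

Total logins:   3
Unique logins: 3
</code>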

Author

Commented:
That looks like it should work.

Could you explain it a little for me?

Sorry, I am a little new to bash programming.

A line-by-line description would be most helpful.

Author

Commented:
I forgot to tell you that I also want the total number of timeouts for a day, but I should be able to implement that once I understand your code.

How are the concatenated files stored? In a bash variable?

I was thinking that if we were dealing with very large logs, it might be better to concatenate all the logs into a temporary file and read from there, so we would not have to load so much into memory.

Your thoughts?

XoF
Commented:

#!/bin/sh
## the so-called shebang, which defines the interpreter to be used

day=$1
# store the first argument into variable "day"

shift # delete the first argument from the arg-list
logfiles=$@ # store all remaining arguments into the variable "logfiles"

# run sed on each logfile specified. The logfiles are processed one after another, so you don't have to fear large memory consumption.
# concatenation of the files does not occur
logins=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user login: \(.*\)/\1 \2 \3/p' $logfiles | sort -k 3,3 | wc -l`
ulogins=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user login: \(.*\)/\1 \2 \3/p' $logfiles | sort -k 3,3 -u | wc -l`
timeouts=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user timeout: \(.*\)/\1 \2 \3/p' $logfiles | sort -k 3,3 | wc -l`
utimeouts=`sed -n 's/^\('$day'\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user timeout: \(.*\)/\1 \2 \3/p' $logfiles | sort -k 3,3 -u | wc -l`

## explanation of sed:
# sed -n [...] file1 file2 fileN  # load each line of each file into the buffer (one after another); do not output buffer contents (-n)
# 's/pattern/replacement/p'  # process the buffer: replace each occurrence of "pattern" with "replacement" (s/); print the new buffer content (/p)
# if parts of "pattern" are enclosed in \( \), each part is later addressable as \1, \2, and so on
# EXAMPLE: s/\(15\) \(men and \)a \(bottle of rum\)/one \2 \1 \3/p
# will transform "15 men and a bottle of rum" into "one men and 15 bottle of rum" (grammatically wrong, but a nice example, isn't it? ;-)
#
# | sort -k 3,3  # pass the output to "sort"; sort on columns 3 through 3 (so only on column 3, the username)
# | wc -l  # pass the output of sort to "wc", which normally does a word count; -l makes it count lines instead

# a so-called here-document:
# write everything up to a line with a leading marker ("EOF") to stdout
cat << EOF
Statistics for $day:
----------------------

Total logins:       $logins
Unique logins:     $ulogins
Total timeouts:   $timeouts
Unique timeouts: $utimeouts
EOF
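
To see what the sed expression does to a single line, you can test it in isolation. Taking the first sample line from your post, this should print the stripped-down form:

<code>
$ echo '2006-03-28 13:47:07,669: user login: test' | \
  sed -n 's/^\(2006-03-28\)[ ]*\([0-9]*:[0-9]*:[0-9]*\),[0-9]*: user login: \(.*\)/\1 \2 \3/p'
2006-03-28 13:47:07 test
</code>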



HTH,

-XoF-


Author

Commented:
I do appreciate the rum example.

Just a few more questions (thanks, you have been very helpful so far):

1.) Currently I don't know exactly how many session logs will be present when I run this program, so it might work better if I build $logfiles a different way. I know there will be a session.log, but there may also be a session.log.1, session.log.2, and so on. There will never be more than 9 total session logs, so we don't have to worry about session.log.10. After we combine the session logs, the result may be quite large (> 50 MB). Knowing this, would you continue to load them the current way, or create a sessionstemp.log? I am just trying to figure out how large the total size would have to be before you would consider a different method.

2.) How would sed's "-u" parameter affect this operation? Would it run slower and take less memory?

3.) Can you give me a quick example of a line or two being piped into sort? All the \'s have me a little confused, but I get the general idea. I can't get a feel for the \2 \1 \3.

Thanks again

Author

Commented:
Also, any ideas for making it more robust would be helpful.

You don't have to go through the trouble of doing it; I just want to learn more about bash programming.
- error handling
- checking to see if the script executes from the "logs" folder; if it doesn't, then I want to cd there
etc. etc.
XoF

Commented:
> 1.)  Currently I don't know exactly how many session logs will be present when I run this program - so it might work better if I build $logfiles a different way

Once again: the number of logfiles to be processed does _not_ matter in any way! Let's say you call the script like this:
/usr/local/bin/loganalyzer.sh 2006-03-28 /var/log/session.log.?

Now the shift operator will strip off the first argument ("2006-03-28"), and $logfiles will contain "/var/log/session.log.?".
sed itself will then process session.log.[0-9] line by line - it's really as simple as it looks.
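
One caveat with that glob: the shell expands the ? before the script ever runs, and /var/log/session.log.? matches only the rotated files session.log.0 through session.log.9, not session.log itself. To pick up the base file as well, pass a broader glob, for example:

<code>
./loganalyzer.sh 2006-03-28 /var/log/session.log /var/log/session.log.?
# or simply:
./loganalyzer.sh 2006-03-28 /var/log/session.log*
</code>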

> -u param

I don't know this param.

> 3.) quick example

Original data:
2006-03-28 13:47:07,669: user login: test
2006-03-28 14:03:06,156: user timeout: johnsonr4
2006-03-28 14:03:06,314: user logout: johnsonr2
2006-03-28 14:10:53,206: user login: jonesg4
2006-03-28 14:10:57,817: user login: smithf3

After sed-processing (with match for "user login"):
2006-03-28 13:47:07 test
2006-03-28 14:10:53 jonesg4
2006-03-28 14:10:57 smithf3

> error handling

Can be achieved by:
- return values (note that rc must be initialized, or the test will fail when <command> succeeds):
rc=0
<command> || rc=1
if [ $rc -eq 1 ]; then ...; fi

- signal handling --> man trap
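
Putting those together with your robustness wishes, here is a minimal sketch; the /var/log location, the temp-file name, and the simplified sed pattern are just assumptions, so adjust them to your setup:

<code>
#!/bin/sh
# minimal robustness sketch -- LOGDIR is an assumption, adjust to your setup
LOGDIR=/var/log
tmpfile=/tmp/logins.$$

# signal handling: clean up the temp file on interrupt/termination (see "man trap")
trap 'rm -f $tmpfile; exit 1' INT TERM

# basic argument check
if [ $# -lt 2 ]; then
    echo "usage: $0 <day> <file1> [file2 ...]" >&2
    exit 1
fi

# cd to the logs folder if we are not already there
if [ "`pwd`" != "$LOGDIR" ]; then
    cd "$LOGDIR" || { echo "cannot cd to $LOGDIR" >&2; exit 1; }
fi

day=$1
shift

# return-value based error handling: rc is set if sed fails (e.g. a missing file)
rc=0
sed -n 's/^\('$day'\).*user login: \(.*\)/\2/p' "$@" > $tmpfile || rc=1
if [ $rc -eq 1 ]; then
    echo "processing failed" >&2
    rm -f $tmpfile
    exit 1
fi

echo "Unique logins on $day: `sort -u $tmpfile | wc -l`"
rm -f $tmpfile
</code>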



HTH,

-XoF-

Author

Commented:
Alrighty - sorry for my ignorance.

I think I now understand: the files never get completely loaded into memory, because sed goes through them one line at a time.
XoF

Commented:
Alright. Perhaps it would have been helpful for you to know what the name "sed" stands for: stream editor ;-)