  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 791

A script to clear log file

Let's say that I have a Java program running, with its standard output directed to a log file:

java MyJavaProgram > stdout.log

The log file, stdout.log, can grow quite big, so I need a script to periodically archive it into a zip file, and then clear the log:

eval `date "+day=%d; month=%m; year=%Y"`
typeset -Z2 day month
typeset -Z4 year
CURRENT_DATE="${year}${month}${day}"
zip stdout.${CURRENT_DATE}.zip stdout.log
echo "Previous contents archived on `date`" > stdout.log

The second-to-last line of the script zips up the file, and the last line clears the log by writing a message into it.

Is this a good way to do it? When I write the message into the log file in the last line, how can I be sure that the Java program is not also writing to it at the same time? And after the file is zipped in the second-to-last line, how can I be sure the log file is not updated by the Java program before the last line of the script executes?

Any better suggestion is welcome. Thanks.
integrosys Asked:

2 Solutions

jhartzenCommented:
As a general rule, many Unix and Linux variants have this possible problem you are describing (it is sometimes called a race condition).  In your case I would say it is worse, because the zip could take quite a while to complete.  I would modify the script slightly, as follows:

...
Do stuff to make new file name
...
cp stdout.log stdout.${CURRENT_DATE}.log
echo "Previous contents archived at `date`" > stdout.log
gzip stdout.${CURRENT_DATE}.log

You will notice that I copy the log file to a new log file with a date-stamp in its name.  This is so that when someone unzips the file, it will not overwrite any existing log files.

Also the copy operation will be quicker than the zip operation, so the time during which you could possibly lose messages is shorter, but in theory it still exists.  The trade-off here is that the copy will use more disk space, at least temporarily.
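The fragment above, filled in as a runnable sketch (the file names, date stamp format, and marker text are just illustrative):

```shell
#!/bin/sh
# Copy-then-truncate rotation sketch: the copy is fast, so the window in
# which the Java program's output can be lost is short (but not zero).
LOG=stdout.log
[ -f "$LOG" ] || echo "demo output" > "$LOG"           # stand-in content for the demo
STAMP=`date +%Y%m%d`
cp "$LOG" "stdout.$STAMP.log"                          # snapshot the log quickly
echo "Previous contents archived at `date`" > "$LOG"   # truncate + marker in one step
gzip "stdout.$STAMP.log"                               # compress the snapshot afterwards
```

Note the compression is deliberately done last, after the log has already been truncated, so the slow step sits outside the race window.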

Also, gzip will generally get better compression.  (It is available by default with Solaris 9 and later; for earlier versions of Solaris it is available from sunfreeware.com.)

My advice: Don't worry about losing messages, unless the copy takes *very* long.  In practice, it will take just a few seconds.

Hope this helps!
  _J
 
integrosysAuthor Commented:
Thanks. What about the race condition when I am doing this:

echo "Previous contents archived at `date`" > stdout.log

The Java program could be writing something to the file while at the same time I am writing my message to the file as well.
 
jhartzenCommented:
The same condition is true: you can possibly lose data, but the window is short, so the likelihood is small.  You need to weigh the risk against the effort of developing a better solution.  Generally we say the risk is "acceptable".

There are several possible options.

1. Re-write the Java program to perform log management internally.
2. Write a log capture program, which captures the log on stdin, and writes it to log files which it manages cleanly.
3. Pause the Java program (pkill -STOP -f MyJavaProgram), perform log management, resume the program (pkill -CONT -f MyJavaProgram).  The -f is needed because the class name appears on the command line rather than as the process name.  Obviously this has the disadvantage that MyJavaProgram will hang while you switch the log file.
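Option 3 can be sketched with a throwaway background process standing in for the JVM (the process, timings, and cleanup are illustrative):

```shell
#!/bin/sh
# Pause the writer, rotate the log, then resume it.  SIGSTOP cannot be
# caught or ignored, so the process reliably stops producing output.
sleep 30 &                 # stand-in for the Java process
PID=$!
kill -STOP $PID            # pause: no more writes to stdout.log
# ... archive and truncate stdout.log here, race-free ...
kill -CONT $PID            # resume normal output
kill $PID                  # clean up the stand-in
```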
 
TintinCommented:
Why are you doing

eval `date "+day=%d; month=%m; year=%Y"`
typeset -Z2 day month
typeset -Z4 year
CURRENT_DATE="${year}${month}${day}"

instead of just

CURRENT_DATE=`date +%Y%m%d`

 
integrosysAuthor Commented:
Actually it is not just a normal Java program; it is a WebLogic managed server instance.  When we run it, all of its own debug statements go to the screen, so we redirect them to a file instead.  For our web application that runs in the server we do have proper log management, but the file of WebLogic debug statements will just keep growing, so we need some log management for it too.  The content of this file is not as important as our application's own log files.

>> CURRENT_DATE=`date +%Y%m%d`

My code was copied from somewhere. I think it's a solution from someone in EE, but I can't remember now. Anyway, thanks for pointing out!
 
NukfrorCommented:
Ummmmm, why all the extra work that's in this:

...
Do stuff to make new file name
...
cp stdout.log stdout.${CURRENT_DATE}.log
echo "Previous contents archived at `date`" > stdout.log
gzip stdout.${CURRENT_DATE}.log

If you "do stuff to make new file name", why do the copy?  You've got a new file name, right, so the application is writing to a new log file - so why "copy" the old one before you compress it?  Of course, how you can change the name of a STDOUT-redirected file without stopping the application is magic to me ... call me crazy ...

A method that works well with syslog is:

mv stdout.log stdout.${CURRENT_DATE}.log
<signal the application to start a new log file>
[compress|gzip|bzip2|whatever] stdout.${CURRENT_DATE}.log

Unfortunately the method above doesn't work well with a STDOUT-based log file, because you can't easily force a running application to close its STDOUT file descriptor and reopen it on a log file name that is 100% out of its control.  You have to write the application so that it opens its log as a real file, and install a signal handler that watches for, say, SIGUSR1; when the handler sees that signal, it closes the existing log file descriptor and opens it again.  With a STDOUT-redirected log file you almost can't force a new name, because the application writing the output has no control over the filename you, the administrator, told the shell to redirect into.  You can control the log filename only if you stop the application, roll the log, and then restart the application.
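In a shell wrapper the reopen-on-signal idea can be sketched like this (the path and the choice of SIGUSR1 are assumptions):

```shell
#!/bin/sh
# Sketch: a process that reopens its own stdout on SIGUSR1, the way
# syslog-style daemons reopen their logs after an administrator rolls them.
LOG=/tmp/reopen_demo.$$.log
exec > "$LOG"                 # open the log as our stdout
trap 'exec > "$LOG"' USR1     # on SIGUSR1: close stdout and reopen the log
echo "message 1"
kill -USR1 $$                 # an administrator (or cron job) would send this
echo "message 2"              # after the truncating reopen, only this line remains
```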

Assuming you can't stop the application, you have to hope that the following works every time:

<start application 10 hours ago or whatever in some other script>
....
<in my log rotation script>
cp stdout.log stdout.${CURRENT_DATE}.log
cp /dev/null stdout.log
echo "Previous contents archived at `date`" > stdout.log
gzip stdout.${CURRENT_DATE}.log

The problem is the STDOUT offset pointer.  It does not reset to zero when you do the "cp /dev/null ...", so on the application's next write the file magically grows back past its old size, with the gap padded out by NUL bytes.
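The stale-offset behaviour is easy to demonstrate (path and timings are illustrative): a writer whose stdout was opened by the shell's `>` redirection keeps its own file offset, so truncating the file underneath it leaves a NUL-filled hole when it writes again.

```shell
#!/bin/sh
# Writer emits a 7-byte line, sleeps, then emits two more 7-byte lines.
LOG=/tmp/offset_demo.$$.log
( echo "line 1"; sleep 2; echo "line 2"; echo "line 3" ) > "$LOG" &
WPID=$!
sleep 1                    # let "line 1" land at offset 0
cp /dev/null "$LOG"        # truncate: size drops to 0, writer's offset stays at 7
wait $WPID
wc -c < "$LOG"             # 21 bytes, not 14: offsets 0-6 are now a NUL-filled hole
```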
 
integrosysAuthor Commented:
cp /dev/null stdout.log
echo "Previous contents archived at `date`" > stdout.log

Does the extra "cp /dev/null stdout.log" make any difference?
 
jhartzenCommented:
No.
 
jhartzenCommented:
Or rather: Yes:  It is extra wasted work.
 
TintinCommented:
To clear a file, the quickest way is

>file

or

mv file /dev/null

 
NukfrorCommented:
mv file /dev/null ??!!??  NOT !!!!

If you run that as a non-root user, you'll get a permission error.  If you run it as root, the stability of your system will be at risk - DON'T DO THAT - use ">file" or "cp /dev/null file".

Yep, ">file" or "cp /dev/null file" are the quickest ways - but they don't always work how you expect, depending on how the log file is being written to.  See my previous post.
 
TintinCommented:
Sorry, my brain was on holiday.

I meant

cp /dev/null file

 
jhartzenCommented:
The point is that if you are about to write to a file, i.e. using
echo "some text" > file
this will truncate the file and write the text in one command.  Doing extra work to first truncate the file is wasted, irrespective of how efficient it is.  Worse, you have just added an extra window in which messages from the Java program can be lost, e.g. between the truncate and the echo command.

The only real solution here is to write a program which will manage the log files.  A script could do it, eg:
#!/bin/bash
# This script is a "stdin_log_handler"
# Courtesy J Hartzenberg
# Usage: Pipe logging output into this file.  Specify printf mask for log name on command-line, containing
#     five times the "%s" substitutor, for YEAR, MONTH, DAY, HOUR and MINUTE, eg
#
#   mYJavaProg | stdin_log_handler /var/log/myJavaProg.%s-%s-%s-%s:%s.log
#
# Disclaimer A: This script is in serious need of error condition handling
# Disclaimer B: This script was tested on linux (hence using bash; it should work with ksh too)

# Define max file size (Nr of 512-byte disk blocks) before switching to new file.  With slightly more advanced scripting, you can check the KB size, but this is neat.
MAXBLOCKS=50
TEMPLINK=/tmp/current_log
NewLog() {
      [ -f $TEMPLINK ] && rm -f $TEMPLINK
      LOGFILE=$(printf "$1\n" $( date "+%Y %m %d %H %M" )|head -1 )
      date "+New log file started at %C" > "$LOGFILE" && ln -s "$LOGFILE" $TEMPLINK || exit
}

NewLog "$1"

while read LINE
do
   echo "$LINE" >> $TEMPLINK
   SIZE=$( ls -sL $TEMPLINK | awk '{print $1}')
   [ $SIZE -gt $MAXBLOCKS ] && NewLog "$1"
done
 
jhartzenCommented:
I forgot about the compression of the previous log file.  This version is amended to compress existing log files.

Also, a handler is needed to allow multiple instances, e.g. for more than one program on the system.  Currently the softlink /tmp/current_log is the limitation; this could just become /tmp/current_log.$$

Also, one day when I have time I will investigate the possibility of using file-handle re-assignment to simulate file opening/closing on the log file; then the "echo $LINE" will be replaced by the command cat on its own, and the "read LINE" will be replaced by the command "true".

#!/bin/sh

# This script is a "stdin_log_handler"
# Courtesy J Hartzenberg
# Usage: Pipe logging output into this file.  Specify printf mask for log name on command-line, containing five times %s, eg
#
#   mYJavaProg | stdin_log_handler /var/log/myJavaProg.%s-%s-%s-%s-%s.log
#
# Disclaimer A: This script is in serious need of error condition handling
# Disclaimer B: This script was tested on linux (hence using bash; it should work with ksh too)

# Define max file size (Nr of 512-byte disk blocks) before switching to new file.  With slightly more advanced scripting, you can check the KB size, but this is neat.
MAXBLOCKS=50
TEMPLINK=/tmp/current_log
NewLog() {

      if [ -f $TEMPLINK ]
      then
         OLDLOG=$(ls -l $TEMPLINK | sed 's/^.*-> //')
         gzip $OLDLOG && rm $TEMPLINK || exit 1
      fi
      LOGFILE=$(printf "$1\n" $( date "+%H %M %S" )|head -1 )
      date "+New log file started at %C" > "$LOGFILE" && ln -s "$LOGFILE" $TEMPLINK || exit 1
}

NewLog "$1"

while read LINE
do
   echo "$LINE" >> $TEMPLINK
   SIZE=$( ls -sL $TEMPLINK | awk '{print $1}')
   [ $SIZE -gt $MAXBLOCKS ] && NewLog "$1"
done
 
jhartzenCommented:
Errata: My apologies; in my testing I used only three parameters (HOUR, MINUTE, SECOND), and I copy-pasted that test script into this forum.  The line assigning the new LOGFILE name should read:

LOGFILE=$(printf "$1\n" $( date "+%Y %m %d %H %M" )|head -1 )

and the "%C" in the second date line should be a lower-case "%c"

It is way past my bedtime.
 
NukfrorCommented:
jhartzen, this is pretty sweet !  Nice :)
 
NukfrorCommented:
You just reminded me that I wrote a Perl program a long time ago for collecting Weblogic SNMP statistics to be used with Orca for graphing.  It "rolled" the data files at a specific time interval just like Orca did ... now you've got me planning to update it to also be able to roll the file based on the file size.

Also, not thumping your suggestion (it is a good one), but your script will have to fork/exec ls and awk for *every* line of text - not to mention all the associated system calls each of these binaries will produce.  So depending on how heavy the log traffic is, the task of simply determining when to "roll" the file could become computationally VERY expensive.  You could get rid of the awk fork/exec by changing the script to ksh or bash and using the built-in string handling to capture the output of "ls -sL" - which leaves just the ls command.

The nice thing about Perl is it has a stat call which in the end produces a system call in Solaris to collect the file metadata like file size.

Now you've really got me thinking.
 
jhartzenCommented:
Hi Nukfror,
Thank you for the compliment.  Last night after I wrote this script I was awake for a long time thinking of ways to improve it, exactly because of the inefficiency you describe above!

One way would be to check the log file size only every 300 lines (or something like that).

Another would be to remove the check and add the following:

trap NewLog HUP

Then set up something in cron to check the file size every 10 minutes and, if necessary, send a HUP to the script.
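For the cron variant, the crontab entry might look something like this (the handler's name, the 2048-block threshold, and the link path are all assumptions; `*/10` is Vixie-cron syntax, so on older Solaris cron you would list the minutes explicitly):

```shell
# min hour dom mon dow  command
# Every 10 minutes: if the current log exceeds ~1 MB (2048 512-byte blocks),
# send HUP to the log handler so its trap runs NewLog.
*/10 * * * * [ `ls -sL /tmp/current_log | awk '{print $1}'` -gt 2048 ] && pkill -HUP -f stdin_log_handler
```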

The third method:  I have seen something in the past where there is something like a shell built-in which evaluates the size of a file.

For the second solution one can further improve the NewLog function with a mutex like this:
LOCKDIR=/var/locks/log_handler.$$
NewLog() {
until mkdir $LOCKDIR 2>/dev/null; do sleep 1; done
... do log switching here...
... Put gzip into background
rmdir $LOCKDIR
}
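mkdir works as the mutex here because it is atomic: of two processes trying to create the same directory, exactly one succeeds.  A minimal runnable sketch (the lock path is illustrative):

```shell
#!/bin/sh
# mkdir-based mutual exclusion: the directory itself is the lock.
LOCKDIR=/tmp/log_handler_lock.$$
until mkdir "$LOCKDIR" 2>/dev/null; do
    sleep 1                  # another instance holds the lock; wait
done
# ... critical section: switch log files here ...
rmdir "$LOCKDIR"             # release the lock
```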

And in all cases I would use at least the following to allow multiple simultaneous instances: TEMPLINK=/tmp/log_handler.current.$$

A Perl script would be much more efficient, but I still have not gotten around to learning Perl!
 
NukfrorCommented:
Actually, I was continuing to think about this last night myself.  For something like this Perl would work just fine, if a little heavy on memory.  ksh and bash could do this without any fork/exec traffic at all if you use the features of each shell.

Both ksh and bash can easily determine the length of a string with ${#VARIABLE_NAME}.  So you can track the size of the file by keeping a running total of the length of each line you write, like this (I haven't tested this):

#!/bin/ksh
## Gotta use ksh or bash for this
# Define max file size (in bytes) before switching to new file.
MAXBYTES=1000000
TEMPLINK=/tmp/current_log

NewLog() {
      [ -f $TEMPLINK ] && rm -f $TEMPLINK
      LOGFILE=$(printf "$1\n" $( date "+%Y %m %d %H %M" )|head -1 )
      date "+New log file started at %c" > "$LOGFILE" && ln -s "$LOGFILE" $TEMPLINK || exit
      LOGSIZE=0
      LOGLINES=0
}

while getopts "l:" OPT; do
        case $OPT in
                l  ) MAXLINES=$OPTARG;;
                \? ) print 'option usage error'
                     return 1
        esac
done
shift $(($OPTIND - 1))

NewLog "$1"

while read LINE
do
   echo "$LINE" >> $TEMPLINK
   # LOGSIZE - always add 1 to total for each line to account for EOL character.  This obviously only applies to
   # UNIX EOL environments.  Windows would be +2.
   LOGSIZE=$(( ${LOGSIZE} + ${#LINE} + 1 ))
   LOGLINES=$(( ${LOGLINES} + 1 ))
   if [ -n "$MAXLINES" ]; then
      [ $LOGLINES -gt $MAXLINES ] && NewLog "$1"
   else
      [ $LOGSIZE -gt $MAXBYTES ] && NewLog "$1"
   fi
done
 
NukfrorCommented:
And why I'm putting this much effort into a question worth 50 points I'll never know ...
 
NukfrorCommented:
Forgot to add a switch for changing MAXBYTES:

#!/bin/ksh
## Gotta use ksh or bash for this
# Define max file size (in bytes) before switching to new file.
MAXBYTES=1000000
TEMPLINK=/tmp/current_log

NewLog() {
      [ -f $TEMPLINK ] && rm -f $TEMPLINK
      LOGFILE=$(printf "$1\n" $( date "+%Y %m %d %H %M" )|head -1 )
      date "+New log file started at %c" > "$LOGFILE" && ln -s "$LOGFILE" $TEMPLINK || exit
      LOGSIZE=0
      LOGLINES=0
}

while getopts "l:" OPT; do
        case $OPT in
                b ) MAXBYTES=$OPTARG;;
                l  ) MAXLINES=$OPTARG;;
                \? ) print 'option usage error'
                     return 1
        esac
done
shift $(($OPTIND - 1))

NewLog "$1"

while read LINE
do
   echo "$LINE" >> $TEMPLINK
   # LOGSIZE - always add 1 to total for each line to account for EOL character.  This obviously only applies to
   # UNIX EOL environments.  Windows would be +2.
   LOGSIZE=$(( ${LOGSIZE} + ${#LINE} + 1 ))
   LOGLINES=$(( ${LOGLINES} + 1 ))
   if [ -n "$MAXLINES" ]; then
      [ $LOGLINES -gt $MAXLINES ] && NewLog "$1"
   else
      [ $LOGSIZE -gt $MAXBYTES ] && NewLog "$1"
   fi
done
 
NukfrorCommented:
Damn, missed something else:

#!/bin/ksh
## Gotta use ksh or bash for this
# Define max file size (in bytes) before switching to new file.
MAXBYTES=1000000
TEMPLINK=/tmp/current_log

NewLog() {
      [ -f $TEMPLINK ] && rm -f $TEMPLINK
      LOGFILE=$(printf "$1\n" $( date "+%Y %m %d %H %M" )|head -1 )
      date "+New log file started at %c" > "$LOGFILE" && ln -s "$LOGFILE" $TEMPLINK || exit
      LOGSIZE=0
      LOGLINES=0
}

while getopts "bl:" OPT; do
        case $OPT in
                b ) MAXBYTES=$OPTARG;;
                l  ) MAXLINES=$OPTARG;;
                \? ) print 'option usage error'
                     return 1
        esac
done
shift $(($OPTIND - 1))

NewLog "$1"

while read LINE
do
   echo "$LINE" >> $TEMPLINK
   # LOGSIZE - always add 1 to total for each line to account for EOL character.  This obviously only applies to
   # UNIX EOL environments.  Windows would be +2.
   LOGSIZE=$(( ${LOGSIZE} + ${#LINE} + 1 ))
   LOGLINES=$(( ${LOGLINES} + 1 ))
   if [ -n "$MAXLINES" ]; then
      [ $LOGLINES -gt $MAXLINES ] && NewLog "$1"
   else
      [ $LOGSIZE -gt $MAXBYTES ] && NewLog "$1"
   fi
done
 
jhartzenCommented:
He he he.  This 50-point question could easily turn into something quite advanced.

I don't like my original printf/date construct any more - I think it is unwieldy.  I'd like to change the function definition as follows (based on your last version):

#!/bin/ksh
## Gotta use ksh or bash for this
# Define max file size (in bytes) before switching to new file.
MAXBYTES=1000000
TEMPLINK=/tmp/current_log.$$

LOCKDIR=/var/locks/log_handler.$$
NewLog() {
      # Remember the file-name mask from the first call; later calls and the
      # HUP trap can then omit the argument (LOGFILE itself is overwritten below)
      LOGMASK=${LOGMASK:-$1}

      # Remember the real filename of the old (current) log file, if any
      OLDLOG=""
      [ -L $TEMPLINK ] && OLDLOG=$(ls -l $TEMPLINK | sed 's/^.*-> //')

      # Prevent multiple instances from entering this part of the script
      until mkdir $LOCKDIR 2>/dev/null; do sleep 1; done
      [ -f $TEMPLINK ] && rm -f $TEMPLINK
      LOGFILE=$(date "+$LOGMASK")
      date "+New log file started at %c" > "$LOGFILE" && ln -s "$LOGFILE" $TEMPLINK || exit
      rmdir $LOCKDIR

      # Reset counters
      LOGSIZE=0
      LOGLINES=0
      # Compress the old log file in the background
      [ -n "$OLDLOG" ] && gzip $OLDLOG &
}

MANUAL_LOGROTATING=0
while getopts "bl:" OPT; do
        case $OPT in
                b ) MAXBYTES=$OPTARG;;
                l  ) MAXLINES=$OPTARG;;
                m ) MANUAL_LOGROTATING=1;;
                \? ) print "$0 usage error
Usage:
$0 [ -b | -l | -m ] Logfilemask

Logfilemask is a string containing %H %M %S %Y %m %d etc.  This string will
be passed to "date" to build up a file name.
Note:
  $0 reads log lines from standard output, and writes the lines to a new file.  The
  file name is generated based on the mask.  See man date(1) for more info. On
  log switch, the previous log file will be compressed using gzip."
                     return 1
        esac
done
shift $(($OPTIND - 1))

# A Log file must be specified on the command line
LOGFILE="$*"
[ ${#LOGFILE} -gt 0 ] || exit 1

# Allow the user to schedule Log file cycling through CRON - LOG will be cycled when a HUP is sent to this script.
[ $MANUAL_LOGROTATING -eq 1 ] && trap NewLog HUP

NewLog "$LOGFILE"

while read LINE
do
   echo "$LINE" >> $TEMPLINK
   if [ $MANUAL_LOGROTATING -eq 0 ]
   then
      # LOGSIZE - always add 1 to total for each line to account for EOL character.  This obviously only applies to
      # UNIX EOL environments.  Windows would be +2.
      LOGSIZE=$(( ${LOGSIZE} + ${#LINE} + 1 ))
      LOGLINES=$(( ${LOGLINES} + 1 ))
      if [ -n "$MAXLINES" ]; then
         [ $LOGLINES -gt $MAXLINES ] && NewLog "$LOGFILE"
      else
         [ $LOGSIZE -gt $MAXBYTES ] && NewLog "$LOGFILE"
      fi
   fi
done
 
NukfrorCommented:
Change the while loop to this (I forgot to add colons to mark the options as expecting arguments, and you forgot to add a switch indicator for the "m" switch you added):

<snip>
MANUAL_LOGROTATING=0
while getopts "b:l:m" OPT; do
        case $OPT in
                b ) MAXBYTES=$OPTARG;;
                l  ) MAXLINES=$OPTARG;;
                m ) MANUAL_LOGROTATING=1;;
                \? ) print "$0 usage error
<snip>
 
jhartzenCommented:
Serves us right for not testing :-)
 
integrosysAuthor Commented:
>> And why I'm putting this much effort into a question worth 50 points I'll never know ...

I am quite impressed by the enthusiasm you guys have shown.  But I don't need something that complicated; I just need something simple, and it does not need to be perfect.  I am not even going to try out your log-file-management solutions, because I have more important tasks to attend to.  Sorry if that disappoints you!  I would love to give everyone a lot of points, but my available points are quite limited now because I have already stopped my subscription, and whatever points I have will be saved for future questions.  But I will still increase the points to 100 and split them between jhartzen and Nukfror for your enthusiasm and commitment to this question.  Thank you guys, and Tintin, for your participation.