Disk Cleanup - with Active files

There is a procedure that has stirred some debate amongst our team.

We agree here:  If a file is removed while a process still has it open, the space is not freed.
   Example of a large file held open by a process:
unix178:# fuser BIG.log
BIG.log:   454734
#
unix178:# ls -l BIG.log
-rw-r--r--    1 ivmgr    ivmgr    1255292755 Sep 11 10:15 BIG.log
#
unix178:# df -Pm .
Filesystem    MB blocks      Used Available Capacity Mounted on
/dev/hd9var     1536.00   1372.28    163.72      90% /var


If BIG.log is removed, the inode is still allocated, and thus no disk space is freed until the process ends or closes its files.
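
The effect can be reproduced safely in a scratch directory. This is only a minimal sketch: the shell itself plays the role of the application by holding the file open on descriptor 3, and the file contents are made up for the demonstration.

```shell
#!/bin/sh
# Demonstration: a removed file stays allocated while a descriptor is open.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf 'record 1\nrecord 2\n' > BIG.log

exec 3< BIG.log      # hold the file open, as a running application would
rm BIG.log           # removes the directory entry only

if [ -e BIG.log ]; then name_exists=yes; else name_exists=no; fi

# The data is still reachable through the open descriptor,
# so its blocks have not been returned to the filesystem.
still_readable=$(cat <&3)

exec 3<&-            # closing the last descriptor releases the inode

echo "name exists after rm: $name_exists"
echo "data read after rm:"
echo "$still_readable"

cd / && rm -rf "$workdir"
```

The name is gone after the rm, yet both records can still be read through descriptor 3, which is why df shows no change until the holder closes the file or exits.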

Here's the question:  
    We have two methods discussed, to free the disk space.  
    Group 1 does not like touching "active" files with a redirect(>),  
    Group 2 says "it appears to work fine".

Here are the steps for the two methods.  
Method 1:
     mv BIG.log  BIG.log-YYMMDD.HHMMSS
     touch BIG.log             #   (optionally chmod/chown if required to make match original)
     Recycle application  
     mv BIG.log-YYMMDD.HHMMSS   /archive
### the Filesystem space is truly freed.
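
As a rough sketch, Method 1 could be wrapped in a shell function like the one below. The function name rotate_by_rename and its argument order are made up for illustration, and the recycle command is whatever the application team normally runs; it is passed in by the caller.

```shell
#!/bin/sh
# rotate_by_rename LOG ARCHIVE_DIR RECYCLE_COMMAND [ARGS...]
# Hypothetical helper illustrating Method 1; not a tested production script.
rotate_by_rename() {
    log=$1
    archive_dir=$2
    shift 2
    stamp=$(date +%y%m%d.%H%M%S)
    renamed="$log-$stamp"

    # Rename in place: the application keeps writing to the same inode.
    mv "$log" "$renamed" || return 1
    # New empty file for the restarted application (chmod/chown here if needed).
    touch "$log" || return 1
    # Recycle the application using the caller-supplied command.
    "$@" || return 1
    # Only now move the old log away; nothing should hold it open any more.
    mv "$renamed" "$archive_dir/"
}
```

A call might look like `rotate_by_rename /path/to/BIG.log /archive /command/to/recycle/application`. Stopping as soon as any step fails is a deliberate choice, so the old log is never moved to another filesystem while the application may still have it open.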

Method 2:
    cat BIG.log > /archive/BIG.log-YYMMDD.HHMMSS
    > BIG.log          
### The Filesystem space does appear to be freed.
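
Method 2 can be sketched the same way. The function name copy_truncate is hypothetical; the one thing it adds to the two commands above is that the log is truncated only if the copy succeeded, so a full /archive does not cost you the whole log.

```shell
#!/bin/sh
# copy_truncate LOG ARCHIVE_DIR
# Hypothetical helper illustrating Method 2; not a tested production script.
copy_truncate() {
    log=$1
    archive_dir=$2
    stamp=$(date +%y%m%d.%H%M%S)

    # Copy first; give up without truncating if the copy fails.
    cp -p "$log" "$archive_dir/$(basename "$log")-$stamp" || return 1
    # Truncate in place: same inode, so the application's descriptor stays valid.
    : > "$log"
}
```

Records written between the cp and the truncate are still lost, exactly as with the manual commands.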

Assume the risk of losing a couple of log records with method 2 (records written between the end of the cat command and the redirect) is acceptable.

    Pro for Method 1:  * Absolutely no records are lost
                       * No risk of orphaned inodes (if steps are followed properly)
    Con for Method 1:  * Causes an application outage
                       * More commands, so more risk of a new admin orphaning an inode

Is there any reason to avoid method 2?

Is there a risk of "confusing" a program that has a file allocated, by issuing a redirect to its file?

Thanks,
Tom



Tomunique Asked:
 
woolmilkporc Commented:
Hi,

If you can afford recycling an application but can't afford losing log data, method 1 is the one to choose.

But your command sequence seems a bit strange. Why not:

- stop application
cp -p BIG.log  /archive/BIG.log-YYMMDD.HHMMSS
>BIG.log
- start application

I used "cp -p" and ">" instead of "mv" and "touch" to preserve the permissions and timestamps (source and destination).
I prepended /archive to the target of the "cp" step to avoid the final "mv".

Or are you using "mv" to the current dir to make the whole thing as fast as possible? If speed is really a concern you should keep your "mv" version! But splitting "recycle" into "start" and "stop" still seems more practical to me.


The second method has to be chosen if restarting an application is not an option, but losing log data can be tolerated.

Instead of "cat" you can also use "cp -p" here, to preserve permissions and timestamps on the target.

cp -p BIG.log > /archive/BIG.log-YYMMDD.HHMMSS
>BIG.log

And No, the writing program will not be confused by your emptying of the target file.
I also don't see a risk of orphaning inodes, as long as you don't use "rm" anywhere.
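
One caveat worth checking per application: what the writer does after the truncate depends on how it opened the log. The sketch below is a scratch-directory demonstration in which the shell stands in for the application; the file names and records are made up. With append mode (what ">>" gives you) the next write lands at the new end of file. Without it, the writer keeps its old offset and the file comes back at its former apparent size, with a hole at the front.

```shell
#!/bin/sh
# Demonstration: truncating a log under an append-mode vs. a plain writer.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Writer 1: log opened in append mode (O_APPEND), as ">>" does.
exec 3>> append.log
printf 'old record 1\nold record 2\n' >&3
: > append.log                    # truncate while descriptor 3 is open
printf 'new record\n' >&3         # lands at offset 0, the new end of file
append_size=$(( $(wc -c < append.log) ))

# Writer 2: log opened without append mode.
exec 4> plain.log
printf 'old record 1\nold record 2\n' >&4
: > plain.log                     # truncate while descriptor 4 is open
printf 'new record\n' >&4         # lands at the old offset (26), leaving a hole
plain_size=$(( $(wc -c < plain.log) ))

exec 3>&- 4>&-

echo "append-mode log size after truncate + write: $append_size"   # 11
echo "plain-mode log size after truncate + write:  $plain_size"    # 37

cd / && rm -rf "$workdir"
```

In the second case ls -l reports the large size again even though few blocks are actually in use (compare with du). Whether a given application opens its log in append mode is something to verify for each application before relying on method 2.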

wmp
 
woolmilkporc Commented:
... the ">" in the last "cp" step is of course wrong (copy-and-paste error)! Sorry!
 
noci (Software Engineer) Commented:
It depends....

- Do logs get used regularly?
- Is your business regulated? [Banking, government...]
In that case you DON'T want to lose records.


- If you're a hobbyist,
- never (really never) look at logs,
- and the software cannot suffer any downtime:
Who cares.

If you cannot suffer downtime then you obviously have a big UPS standing by, with longer-term solutions to keep running too...

There is another solution: use some tool to read from pipelines and create syslog records,
then send the syslog records to something like metalog or syslog-ng. Those loggers can change files at intervals, so you have the best of all worlds...
 
woolmilkporc Commented:
... or you can use logrotate, which is available for AIX here:

http://www.perzl.org/aix/index.php?n=Main.Logrotate

Install it using the rpm tool which comes with AIX.

There are some dependencies. All required packages are available at the above site http://www.perzl.org/aix/index.php?n=Main.HomePage
as well.

Please note that logrotate can be set up to use either of the methods you mentioned; it's just a matter of configuration.

wmp
 
Tomunique (Author) Commented:
We're not regulated, but certainly not a hobby (Fortune 500). The method would have to be selected case by case; applications that can't lose any records clearly fall to method 1.

It's more concept than specifics. We have some annoying offending apps, but some will be resolved with version upgrades. My concern with method 2 was that something that appears to work could turn into a nightmare if not used properly (orphaned inodes are our biggest problem).

WMP:
   The reason we do the mv/recycle is that two teams are involved in the process: the admin team and the application team. Historically, we help manage the logs, as the app team isn't Unix literate, but we need them to recycle the app. So we prep the environment and ask them to bounce the app. When fuser shows the logs are free, we move them off to archive.

If using redirect (>) to truncate a file that another process is writing to is generally acceptable, then I'd say that would make the most sense, and the admin team can put a step in the disk management monitors to do this without the assistance of the app teams.




 
woolmilkporc Commented:
Yes, redirect will never do any harm. I (and my logrotate) use it all the time without any issues.

By the way, logrotate is able to recycle applications (i.e. issue arbitrary commands during the rotation process).
Maybe this feature could rid you of the need to involve your application team, should you decide to use method 1 anyway?

If switching to a particular userid is required for this, "su -" is an option, since logrotate runs with root credentials:

/path/to/BIG.log {
rotate 5
size=100M
olddir /archive
create 644 appl_user appl_group
postrotate
   su - appl_user -c "/command/to/recycle/application"
endscript
}

Or, with a split "recycle":

/path/to/BIG.log {
rotate 5
size=100M
olddir /archive
create 644 appl_user appl_group
prerotate
   su - appl_user -c "/command/to/stop/application"
endscript
postrotate
   su - appl_user -c "/command/to/start/application"
endscript
}


Method 2:

/path/to/BIG.log {
rotate 5
size=100M
olddir /archive
copytruncate
}

Of course, when you say that there are disk monitors in place and that there is some scripting skill - no need for logrotate.

wmp
 
noci (Software Engineer) Commented:
And finally, make sure a requirement for built-in log rotation gets written into the specification for any newly developed or purchased application.
 
Tomunique (Author) Commented:
Thanks guys...  appreciate the input.
Question has a verified solution.
