Solved

Script or method to get "du . -s " with outputs for sizes above certain size & later than certain date

Posted on 2011-09-10
Last Modified: 2012-05-12

When I'm doing housekeeping, issuing " du . -s " or " du . " often
produces volumes of output that I have to sift through manually.

Is there a shell script or method to pick out only those entries
above a certain size (which I can specify, say 200 kblocks) and
newer than a certain date (say the last week)? I don't want to
install any software/rpm.

I'm running HP-UX B11.11 and RHES 4.x/5.x.

Question by:sunhux
12 Comments
 
LVL 21

Expert Comment

by:Papertrip
ID: 36518161
Find all files over 200k blocks:
find . -size +200000b

Find all files modified more than 7 days ago
find . -mtime +7

If you want to delete files that match either of those finds, do this:
find . -size +200000b -print0 |xargs -0 rm -f

or
find . -size +200000b -exec rm -f {} +

The difference between the two boils down, for the most part, to how they handle "weird" characters in filenames (spaces fall into that category too).  This also depends on the versions of find and xargs you have, so I suggest testing both safely.

Try each of those commands with 'ls -l' in place of 'rm -f' first, to see which of them (if not both) works for the files you are dealing with.
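
As a rough illustration of that dry-run idea, here is a minimal sketch. The sandbox directory and file names are made up for the example, and it assumes a system where `mktemp -d` is available. It uses the suffix-less form of -size, which counts 512-byte blocks, so +400 means "more than 200 KB".

```shell
# Throwaway sandbox so nothing real is touched (hypothetical names;
# assumes mktemp -d exists on this system).
sandbox=$(mktemp -d)
dd if=/dev/zero of="$sandbox/big file.log" bs=1024 count=300 2>/dev/null
printf 'small\n' > "$sandbox/small.log"

# Dry run: list what would be removed instead of removing it.
# -size +400 = more than 400 blocks of 512 bytes, i.e. more than 200 KB.
matches=$(find "$sandbox" -type f -size +400 -exec ls -l {} +)
printf '%s\n' "$matches"
```

Only once the listing looks right would 'ls -l' be swapped back for 'rm -f'.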
 
LVL 21

Expert Comment

by:Papertrip
ID: 36518171
Actually after looking at my syntax again, I see something that might hold you up --

add '-type f' to the find command to match regular files only, so that you don't accidentally act on directories that match your criteria but contain files that do not.  'rm -f' without -r would print an error for those directories rather than remove them anyway, but better safe than sorry.
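
A small sketch of the effect of '-type f', using a made-up sandbox in which a directory happens to match the same name pattern as a file (the names are hypothetical, and `mktemp -d` is assumed to be available):

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/archive.log"     # a directory whose name matches *.log
: > "$sandbox/app.log"           # a regular file

# Count matches with and without the -type f restriction.
all=$(find "$sandbox" -name '*.log' | wc -l)
files_only=$(find "$sandbox" -type f -name '*.log' | wc -l)
echo "without -type f: $all match(es), with -type f: $files_only match(es)"
```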

 
LVL 21

Expert Comment

by:Papertrip
ID: 36518190
Ah yet again, some info that could prove useful to you.

You probably aren't going to be searching for files by blocks used (are you?).  Here are the other suffixes that can be used with '-size'... keep in mind the note at the end regarding if you are really going to search by block size.

 -size n[cwbkMG]
              File uses n units of space.  The following suffixes can be used:

              `b'    for 512-byte blocks (this is the default if no suffix is used)

              `c'    for bytes

              `w'    for two-byte words

              `k'    for Kilobytes (units of 1024 bytes)

              `M'    for Megabytes (units of 1048576 bytes)

              `G'    for Gigabytes (units of 1073741824 bytes)

              The  size  does  not count indirect blocks, but it does count blocks in sparse files that are not actually allocated.
              Bear in mind that the `%k' and `%b' format specifiers of -printf handle sparse files  differently.   The  `b'  suffix
              always denotes 512-byte blocks and never 1 Kilobyte blocks, which is different to the behaviour of -ls.

As for the time-based criterion you asked about, there are several options: -atime, -ctime and -mtime.  Check the 'find' man page for the details, but -mtime, which tests the file's modification time, is probably the most commonly used of the three.

Long story short, now that you have an idea of how to do what you want, I recommend using the 'find' man page to fine tune it to your needs :)
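
To make the unit arithmetic concrete, here is a sketch of three ways to say "larger than 100 KB" (102400 bytes). The sandbox and file names are invented for the example. The 'c' form and the suffix-less block form are the ones specified by POSIX; the 'k' suffix is taken from the GNU man page excerpt above and may not exist in other finds.

```shell
sandbox=$(mktemp -d)             # assumes mktemp -d is available
dd if=/dev/zero of="$sandbox/large.log" bs=1024 count=150 2>/dev/null
dd if=/dev/zero of="$sandbox/small.log" bs=1024 count=50  2>/dev/null

by_bytes=$(find "$sandbox" -type f -size +102400c)   # 100 * 1024 bytes
by_blocks=$(find "$sandbox" -type f -size +200)      # 102400 / 512 = 200 blocks
by_kib=$(find "$sandbox" -type f -size +100k)        # 'k' suffix: GNU find
printf 'bytes:  %s\nblocks: %s\nKiB:    %s\n' "$by_bytes" "$by_blocks" "$by_kib"
```

By the same arithmetic, a test such as -size +100000b asks for more than 100000 x 512 bytes, roughly 49 MB, which is a much higher threshold than 100 KB.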



 

Author Comment

by:sunhux
ID: 36518316

Can you combine both criteria (plus one more) in a single command:

list files newer (NOT older) than 7 days, larger than 100 kBytes, and text files only
(I don't want binaries, as they should be quite static)?
 

Author Comment

by:sunhux
ID: 36518318


Would something like the following work?  Correct my syntax if it's wrong :

find . -mtime -7  -size +100000b -name *.log -exec rm -f {} +

 
LVL 21

Accepted Solution

by:
Papertrip earned 375 total points
ID: 36518322
Absolutely!

Test all the combos you want, just substitute something like 'ls -l' for 'rm -f'.
 
LVL 21

Assisted Solution

by:Papertrip
Papertrip earned 375 total points
ID: 36518324
Oh wait I misread your syntax... maybe that was a typo on your side?

find . -mtime -7

should be
find . -mtime +7

 
LVL 48

Assisted Solution

by:Tintin
Tintin earned 125 total points
ID: 36518555
You can't easily detect binaries, but -type f will match files only.  So your syntax should be:

find . -type f -mtime -7  -size +100000b -name "*.log" | xargs rm

Note it's more efficient to use xargs
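
On telling text from binaries: one possible approximation, sketched below, is to let grep do the filtering. This assumes GNU grep, whose -I option treats a binary file as containing no match; where GNU grep is not available, testing the output of the `file` command might be an alternative. The sandbox and file names are hypothetical.

```shell
sandbox=$(mktemp -d)             # assumes mktemp -d is available
printf 'plain text line\n' > "$sandbox/app.log"
printf 'bin\0ary\n' > "$sandbox/core.log"    # contains a NUL byte

# -I: skip files grep considers binary (GNU grep assumption)
# -l: print only the names of files with at least one matching line
text_files=$(find "$sandbox" -type f -name '*.log' -exec grep -Il . {} +)
printf '%s\n' "$text_files"
```

This is a heuristic: empty files are not listed, and what counts as "binary" is grep's judgement, not a guarantee.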
 
LVL 21

Assisted Solution

by:Papertrip
Papertrip earned 375 total points
ID: 36518581
> You can't easily detect binaries, but -type f will match files only
Apologies, I overlooked the comment that was posted about the binaries.
> Note it's more efficient to use xargs
That is true if you are using an old or non-GNU version of find.  If you have a current GNU find, then the following operate the same way:
xargs rm

-exec rm {} +
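
A quick way to see the batching, sketched with `echo` standing in for `rm` on a made-up sandbox (hypothetical names; `mktemp -d` assumed available). Both `xargs` and `-exec ... {} +` pass many file names to one command invocation, while `-exec ... {} \;` starts the command once per file:

```shell
sandbox=$(mktemp -d)
for n in 1 2 3; do : > "$sandbox/f$n.log"; done

# Each echo invocation prints one line, so the line count is the
# number of times the command was started.
via_xargs=$(find "$sandbox" -type f -name '*.log' | xargs echo | wc -l)
via_plus=$(find "$sandbox" -type f -name '*.log' -exec echo {} + | wc -l)
via_semicolon=$(find "$sandbox" -type f -name '*.log' -exec echo {} \; | wc -l)
echo "xargs: $via_xargs, -exec +: $via_plus, -exec \\;: $via_semicolon"
```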

 
LVL 21

Expert Comment

by:Papertrip
ID: 36518594
@Tintin
find . -type f -mtime -7  -size +100000b -name "*.log" | xargs rm

That syntax is fine (except for the mtime typo) if your file names do not contain any characters that would need escaping.

The syntax I posted, however, will handle such names:
find . -size +200000b -print0 |xargs -0 rm -f

That syntax can of course be used in conjunction with the other arguments posted in this thread such as -type  and -name.
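
To show why the NUL-separated form matters, here is a sketch with a file name containing a space. It uses `echo` instead of `rm` so nothing is deleted, and it assumes a find and xargs that support -print0 and -0 (GNU versions do). The sandbox and file name are invented.

```shell
sandbox=$(mktemp -d)             # assumes mktemp -d is available
: > "$sandbox/my report.log"

# xargs -n 1 runs echo once per argument, so the line count shows
# how many arguments the single file name was split into.
plain=$(find "$sandbox" -type f -name '*.log' | xargs -n 1 echo | wc -l)
nul=$(find "$sandbox" -type f -name '*.log' -print0 | xargs -0 -n 1 echo | wc -l)
echo "newline-separated: $plain argument(s), NUL-separated: $nul argument(s)"
```

With plain xargs the one name is split at the space into two arguments, which is how 'rm' could end up being pointed at the wrong paths.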
 
LVL 48

Expert Comment

by:Tintin
ID: 36518705
>mtime typo

What typo?


Note that HP/UX does not have GNU find.
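
On the -mtime sign: -mtime -7 selects files modified less than 7 days ago, which is what the question asks for, while -mtime +7 selects files older than that. The sketch below combines the criteria using only options specified by POSIX (-type, -mtime, -size with the 'c' suffix, -name, -print), which seems the safer bet for a non-GNU find; it has not been tested on HP-UX B11.11. The sandbox, file names and the backdated timestamp are made up for the example.

```shell
sandbox=$(mktemp -d)             # assumes mktemp -d is available
dd if=/dev/zero of="$sandbox/recent.log" bs=1024 count=150 2>/dev/null
dd if=/dev/zero of="$sandbox/old.log"    bs=1024 count=150 2>/dev/null
touch -t 200001010000 "$sandbox/old.log"     # backdate to 1 Jan 2000

# Larger than 100 KB (102400 bytes) and modified within the last 7 days.
recent=$(find "$sandbox" -type f -mtime -7 -size +102400c -name '*.log' -print)
# Same size test, but modified more than 7 days ago.
older=$(find "$sandbox" -type f -mtime +7 -size +102400c -name '*.log' -print)
printf 'recent: %s\nolder:  %s\n' "$recent" "$older"
```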
 

Author Closing Comment

by:sunhux
ID: 36598706
ok