texasreddog

asked on

shell script search for one or more whitespace regular expression

I have a shell script that runs du -sh on a directory. To list only the entries whose size is reported in megabytes, I run this line in my script:

Mb)
      du -sh $DIR/* 2>/dev/null|grep -v [0-9][Mm]|sort -nr
      ;;
 
The problem with this line is that it also matches entries that have a digit followed by M (or m) in the file name, like this:

146M    /tmp/file4ZURF4
117M    /tmp/fileoVe4mF
85M     /tmp/filezN0eoA
84K     /tmp/fileT8mZQr.csv
84K     /tmp/file4MbmT3.csv

So I want to match the whitespace that occurs after the M in 146M.  But I can't find the exact regular expression to do this in a shell script.  Can someone pass this along?  Thanks!
nasirbest

Try this:
      du -sh $DIR/* 2>/dev/null|grep -v '[0-9][Mm][[:space:]]'|sort -nr
The combination '<digit>M<space>' may occur in a file name, too...

You need to find the lines that begin with digits. In a regexp, the start of the line is the symbol '^'.
Also, there may be more than one digit, and a decimal point. For that we need the symbol '*'.
Besides that, the option '-v' (or --invert-match) inverts the sense of matching, to select non-matching lines (according to man grep), so it has to be dropped here.
So, the final expression:

du -sh $DIR/* | grep '^[.0-9]*[Mm]' | sort -nr

The '.' in the expression '[.0-9]' stands for the decimal point, which may be a different character in other locales.
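
As a rough sketch of how the anchored pattern behaves, the snippet below runs it against the sample lines from the question instead of live du output. The function names filter_mb and sample_du_output are made up for this illustration, and the trailing [[:space:]] is an addition to the expression above, there to require the whitespace the asker wanted to match.

```shell
#!/bin/sh
# filter_mb: hypothetical helper, not from the thread.
# Keeps only "du -sh" lines whose size column is in megabytes.
filter_mb() {
    # ^            start of line, i.e. the size column
    # [.0-9]*      digits, possibly with a decimal point
    # [Mm]         the megabyte suffix
    # [[:space:]]  the whitespace between size and path (assumed addition)
    grep '^[.0-9]*[Mm][[:space:]]' | sort -nr
}

# sample_du_output: hypothetical stand-in for "du -sh $DIR/*",
# using the lines shown in the question (du separates columns with a tab).
sample_du_output() {
    printf '146M\t/tmp/file4ZURF4\n'
    printf '117M\t/tmp/fileoVe4mF\n'
    printf '85M\t/tmp/filezN0eoA\n'
    printf '84K\t/tmp/fileT8mZQr.csv\n'
    printf '84K\t/tmp/file4MbmT3.csv\n'
}

sample_du_output | filter_mb
```

In a real script the last line would be something like du -sh "$DIR"/* 2>/dev/null | filter_mb. Note that sort -n only compares the leading number, which is fine here because every line that survives the filter uses the same unit.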
ASKER CERTIFIED SOLUTION
woolmilkporc
