
Could not extract file contents using awk and regex

My file is /exports/tmp/ip789 and its contents are pasted below:

# ipbackup
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=""
HOME=/var/lib/backup

14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini
14 11 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_114013.ini
*/2 * * * * /exports/tmp/gamer.sh
-----------------------------------------------------------------------
As you can see, the file has several lines.
I get the following output when I execute "awk ' $0 ~ /^(14) /' /exports/tmp/ip789":

14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini
14 11 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_114013.ini


But when I execute "awk ' $0 ~ /^(1) /' /exports/tmp/ip789"
the output is empty.

Any idea why?

SILVER EXPERT
Most Valuable Expert 2013
Top Expert 2013
Commented:
It seems that in your second version you're searching for "1 " (i.e. a "1" followed by a space), which does not occur at the start of any line in your file. Drop the space:

"awk ' $0 ~ /^(1)/' /exports/tmp/ip789"

wmp
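
To see the difference side by side, here is a minimal sketch that copies the three job lines from the question into a temporary file (the `sample` variable and the temp file are for illustration only) and counts the matches for both patterns. In every job line the leading "1" is followed by "4", never by a space, so the first pattern cannot match.

```shell
# Temporary copy of the job lines from the question (illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini
14 11 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_114013.ini
*/2 * * * * /exports/tmp/gamer.sh
EOF

# Count lines starting with "1" followed by a space.
with_space=$(awk '/^(1) / { n++ } END { print n + 0 }' "$sample")
# Count lines starting with "1", whatever comes next.
without_space=$(awk '/^(1)/ { n++ } END { print n + 0 }' "$sample")

echo "with space: $with_space, without space: $without_space"
rm -f "$sample"
```

On this sample it prints "with space: 0, without space: 2".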

Author

Commented:
awk ' $0 ~ /^([0-9]{2,2}|[*]).*ini$/ ' /exports/tmp/ip789

is returning nothing.

But the regex looks correct when I check it in a Linux regex editor.
I need it to match lines that start with either two digits or a * and that end with ini.
wmp

Commented:
The latter statement should indeed work perfectly on the data you posted in your Q.

Could it be that there are whitespace characters following ".ini" ?

In this case the regex would fail.

Better this way then:

awk ' $0 ~ /^([0-9]{2,2}|[*]).*ini[ ]{0,}$/ ' ...

Author

Commented:
I opened the file in vi and used :set list; there is a $ right after ini, so there is no space after it.

What command would you use to filter the lines that start with digits or * and end with .ini?
wmp

Commented:
Strange.

As I already said - your command is perfect for that kind of filtering, so I don't think I could give you a better one.

I just tested it here, and indeed, it works for me.
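
One possible cause that nobody in this thread confirmed: some older awk builds (for example gawk before 4.0 unless started with --re-interval or --posix, and older mawk releases) do not treat {2,2} as a repetition count, so a pattern that works elsewhere silently matches nothing. Assuming that is what happened here, the sketch below avoids interval expressions by writing the two digits out. The temporary file is only a stand-in for /exports/tmp/ip789.

```shell
# Temporary stand-in for /exports/tmp/ip789 (illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# ipbackup
SHELL=/bin/bash
14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini
14 11 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_114013.ini
*/2 * * * * /exports/tmp/gamer.sh
EOF

# [0-9][0-9] spells out "two digits" without relying on {2,2} support.
matches=$(awk '/^([0-9][0-9]|[*]).*ini$/' "$sample")
printf '%s\n' "$matches"
rm -f "$sample"
```

If this version prints the two backup lines while the {2,2} version prints nothing, the awk build is the likely culprit rather than the file.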

Author

Commented:
Still, do you have any other solution?

Any help with grep or sed
would be great.
wmp
Commented:
"grep -E" or "egrep" work just the same way.

grep -E  "^([0-9]{2,2}|[*]).*ini$" /exports/tmp/ip789

With standard "grep" it's a bit more complicated:

grep -e "^[0-9]\{2,2\}" -e "^[*]"  /exports/tmp/ip789 | grep "ini$"

There is no real "and" in grep.

And with "sed" it's almost the same. The problem is always the "and" conjunction:

sed -n "/^[0-9]\{2,2\}/p;/^[*]/p"  /exports/tmp/ip789 | sed -n "/ini$/p"

All the above work (tested).
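
For comparison, awk can express the "and" in a single process by joining two patterns with &&. A minimal sketch on a temporary copy of the sample lines (the file and variable names are illustration only, and [0-9][0-9] stands in for [0-9]{2,2} so the example does not depend on interval support):

```shell
# Temporary copy of the job lines from the question (illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini
14 11 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_114013.ini
*/2 * * * * /exports/tmp/gamer.sh
EOF

# One process: the line must start with two digits or "*" AND end with "ini".
awk_out=$(awk '/^([0-9][0-9]|[*])/ && /ini$/' "$sample")

# Two processes: the pipe plays the role of "and".
grep_out=$(grep -e '^[0-9][0-9]' -e '^[*]' "$sample" | grep 'ini$')

[ "$awk_out" = "$grep_out" ] && echo "same lines selected"
printf '%s\n' "$awk_out"
rm -f "$sample"
```

Both variants select the two backup lines; the gamer.sh line starts with * but fails the second condition.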

Author

Commented:
All the commands you sent give the same output:
14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini
14 11 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_114013.ini

but they do not show the line:

*/2 * * * * /exports/tmp/gamer.sh

Author

Commented:
Ignore my last post.

I need a filter that returns only the task definitions from crontab -l.
grep -E  "^([0-9]{2,2}|[*])( )([0-9]{2,2}|[*])" /exports/tmp/ip789

does not match the line starting with */2.
ozo
SILVER EXPERT
Most Valuable Expert 2014
Top Expert 2015

Commented:
*/2 * * * * /exports/tmp/gamer.sh
does not end with "ini"
ozo
Commented:
grep -E  "^([0-9]{2,2}|[*])( )([0-9]{2,2}|[*])" /exports/tmp/ip789
*/2 * * * * /exports/tmp/gamer.sh
has a / after the *, not a space
wmp
Commented:
As I wrote,

my "egrep" version works even with a slash following the asterisk, but of course (as ozo pointed out) only for lines ending with "ini".

grep -E  "^([0-9]{2,2}|[*]).*ini$" /exports/tmp/ip789

In case you want to see all job entries:

grep -E  "^([0-9]{1,2}|[*]).*" /exports/tmp/ip789

Please note that in this latter version the "minute" part can consist of one or two digits (or start with an asterisk, of course), which better reflects what a real crontab contains.
ozo
Commented:
grep -E  "^([0-9]{1,2}|[*]).*"
matches exactly the same lines as does
grep -E  "^[0-9*]"

If you want to see all cron job entries, some versions of cron also allow strings like
@daily
or
@hourly
for the time and date fields, and entries may have leading spaces and tabs
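
A small sketch of both points, using a temporary file with two lines from the question plus two made-up entries (the @daily job and the TAB-indented job are hypothetical). It first checks that the long and the short pattern select the same lines, then tries a broader pattern that also tolerates leading blanks and an @ keyword.

```shell
sample=$(mktemp)
{
  printf '%s\n' '# ipbackup' 'SHELL=/bin/bash'
  printf '%s\n' '14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini'
  printf '%s\n' '*/2 * * * * /exports/tmp/gamer.sh'
  printf '%s\n' '@daily /exports/tmp/daily.sh'          # hypothetical entry
  printf '\t%s\n' '5 4 * * * /exports/tmp/indented.sh'  # hypothetical, leading TAB
} > "$sample"

# The trailing .* and the {1,2} add nothing: only the first character decides.
long_form=$(grep -E '^([0-9]{1,2}|[*]).*' "$sample")
short_form=$(grep -E '^[0-9*]' "$sample")
[ "$long_form" = "$short_form" ] && echo "both patterns select the same lines"

# Also allow leading spaces/TABs and an "@" keyword.
broad_count=$(grep -cE '^[[:blank:]]*[0-9*@]' "$sample")
echo "broader pattern selects $broad_count lines"
rm -f "$sample"
```

On this sample the first two patterns select two lines each, and the broader one selects all four job lines while still skipping the comment and the SHELL= assignment.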
wmp

Commented:
Yep,

with this new font here at EE it's a bit hard (for writers as well as for readers) to verify whether there's a space somewhere or not.

grep -E  "^([0-9]{1,2} |[*]).*"

Now it's there.
ozo

Commented:
But now it fails on
14/2
or
14,16
or
14-16
wmp
Commented:
grep -E  "^([   ]{0,}[0-9]{1,2}[ ,/-]|[*@]).*" /exports/tmp/ip789

Please note that inside the first square bracket pair there is a space and a TAB.

This might work as well somewhere, but not in my shell:

grep -E  "^([ \t]{0,}[0-9]{1,2}[ ,/-]|[*@]).*" /exports/tmp/ip789
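
If typing a literal TAB into the command line is awkward, one portable workaround is to let printf produce the TAB and splice it into the pattern through a shell variable. This is only a sketch; the `tab` variable and the three sample job lines are made up for illustration.

```shell
tab=$(printf '\t')   # a real TAB character, produced by printf

sample=$(mktemp)
printf '14 00 * * * /tmp/job-a\n\t15 01 * * * /tmp/job-b\n  16 02 * * * /tmp/job-c\n# a comment\n' > "$sample"

# Double quotes let the shell insert the TAB before grep sees the pattern.
count=$(grep -cE "^([ ${tab}]{0,}[0-9]{1,2}[ ,/-]|[*@]).*" "$sample")
echo "matched $count of 3 job lines"
rm -f "$sample"
```

The line indented with a TAB and the line indented with spaces are both counted, and the comment line is not.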
wmp

Commented:
ozo,

please don't mention fcrontab. Please!

Author

Commented:
So if the user enters @daily or @monthly, then my pattern will fail to get those tasks.
Is there a pattern that fetches all valid tasks from crontab -l output?
wmp

Commented:
Did you try my very last suggestion?

For crontab -l :

crontab -l | grep -E  "^([   ]{0,}[0-9]{1,2}[ ,/-]|[*@]).*"

Author

Commented:
Yes, it works.
But I am not yet sure how to check whether it catches all valid task definitions.
wmp

Commented:
See man 5 crontab.

Create a test file containing all the formats mentioned there, and check.
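
A rough sketch of such a check. The job lines below are made up to cover the forms described in the man page (numbers, steps, ranges, lists, @ keywords, leading whitespace) and are not an exhaustive list; the pattern is the [[:space:]] form of the filter from this thread. The script prints valid lines the pattern misses and non-job lines it wrongly accepts.

```shell
pattern='^([[:space:]]{0,}[0-9]{1,2}[ ,/-]|[*@]).*'

valid=$(mktemp)      # lines cron would run (the commands are made up)
invalid=$(mktemp)    # lines that are not jobs
{
  printf '%s\n' '5 0 * * * /usr/bin/job1'           # plain numbers
  printf '%s\n' '*/2 * * * * /usr/bin/job2'         # step
  printf '%s\n' '0-30/10 * * * * /usr/bin/job3'     # range with step
  printf '%s\n' '0,15,30,45 * * * * /usr/bin/job4'  # list
  printf '%s\n' '@daily /usr/bin/job5'              # @ keyword
  printf '\t%s\n' '7 3 * * mon /usr/bin/job6'       # leading TAB
  printf '%s\n' '  */5 * * * * /usr/bin/job7'       # leading spaces before *
} > "$valid"
printf '%s\n' '# a comment' 'SHELL=/bin/bash' '' > "$invalid"

# "|| true" keeps the script going when grep selects nothing.
missed=$(grep -vE "$pattern" "$valid" || true)
false_hits=$(grep -E "$pattern" "$invalid" || true)

echo "valid lines the pattern misses:"
printf '%s\n' "$missed"
echo "non-job lines the pattern matches:"
printf '%s\n' "$false_hits"
rm -f "$valid" "$invalid"
```

With these samples the only miss is the entry with spaces in front of the asterisk, because the pattern allows leading whitespace only before digits. Adding more lines to either file extends the check.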

Author

Commented:
My final filter command is:
crontab -l | grep -E  '^([   ]{0,}[0-9]{1,2}[ ,/-]|[*@]).*' | awk ' $0 !~ / backup \/usr\/lib\/backup / '

I call this inside a Perl file:
execute_command(crontab -l | grep -E  '^([   ]{0,}[0-9]{1,2}[ ,/-]|[*@]).*' | awk ' $0 !~ / backup \/usr\/lib\/backup / ');

But because of the $0 in the awk filter, the output is not what I expect. How do I proceed?
wmp

Commented:
Use grep instead of awk:

crontab -l | grep -E  '^([   ]{0,}[0-9]{1,2}[ ,/-]|[*@]).*' |grep -v "backup /usr/lib/backup"
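
A likely reason the awk version misbehaves, assuming the command is handed to Perl inside a double-quoted string: Perl replaces $0 with the script's own name before the shell ever runs awk. The grep -v form contains no $ at all, which sidesteps the problem. The sketch below uses a few sample lines in a temporary file in place of crontab -l (the @daily entry is made up) and checks that both filters keep the same lines.

```shell
# Stand-in for the output of "crontab -l" (illustration only).
cron_lines=$(mktemp)
cat > "$cron_lines" <<'EOF'
14 00 * * * backup /usr/lib/backup --batch /var/lib/schedule/backup.SR2.1.0.13Mar12_113854.ini
*/2 * * * * /exports/tmp/gamer.sh
@daily /exports/tmp/report.sh
EOF

# awk: keep lines that do NOT contain the backup command.
awk_out=$(awk '$0 !~ / backup \/usr\/lib\/backup /' "$cron_lines")
# grep -v: same idea, no "$" anywhere in the command.
grep_out=$(grep -v 'backup /usr/lib/backup' "$cron_lines")

[ "$awk_out" = "$grep_out" ] && echo "both filters keep the same lines"
printf '%s\n' "$grep_out"
rm -f "$cron_lines"
```

If awk has to stay, escaping the dollar sign or using a single-quoted Perl string should have the same effect, though that depends on how execute_command receives its argument.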

Author

Commented:
@wmp
In the command: crontab -l | grep -E  "^([   ]{0,}[0-9]{1,2}[ ,/-]|[*@]).*"

What is between [ and ]: is it a space or a TAB?
wmp

Commented:
See comment #37719195 above.

It's a TAB.

Author

Commented:
In your comment you said "Please note that inside the first square bracket pair there is a space and a TAB."

What is the purpose of having both tab and space?
And why are you not advising to use [\s] rather?
ozo
Commented:
Your grep may interpret [\s] as matching either the character \ or the character s,
but it may recognize [[:space:]].
wmp
Commented:
Correct,

[[:space:]] or [[:blank:]] should work!

crontab -l | grep -E  "^([ [:space:]]{0,}[0-9]{1,2}[ ,/-]|[*@]).*"

"[[:space:]]" means all "whitespace" characters, which includes TABs, and other "invisible" characters, like vertical TAB, LF etc. [[:blank:]] means just space and TAB.

>> What is the purpose of having both tab and space? <<

That's because a crontab entry might well have leading tabs, not only leading spaces.

>> why are you not advising to use [\s] rather? <<

That's because it doesn't work with my grep. Try it!
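
A quick way to try the POSIX classes on your own grep is to feed it one line that starts with a TAB, one that starts with a space, and one with no indentation. This is only a sketch with made-up job lines; what [\s] does varies between grep builds, so only the POSIX classes are counted here.

```shell
# Three made-up job lines: TAB-indented, space-indented, not indented.
lines=$(printf '\t5 * * * * /tmp/tab-job\n 6 * * * * /tmp/space-job\n7 * * * * /tmp/plain-job\n')

# Count lines that begin with at least one blank followed by a digit.
blank_hits=$(printf '%s\n' "$lines" | grep -cE '^[[:blank:]]+[0-9]')
space_hits=$(printf '%s\n' "$lines" | grep -cE '^[[:space:]]+[0-9]')

echo "[[:blank:]] matched $blank_hits indented lines"
echo "[[:space:]] matched $space_hits indented lines"
```

Both classes count the TAB-indented and the space-indented line, so each reports 2.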

Author

Commented:
Thanks a lot for your input.
