Solved

Grep filename strings from log files and find the matching files, including names with null spaces

Posted on 2010-11-24
Last Modified: 2012-05-10
Hi Experts,

OS: Red Hat

How can I grep the string "*.xls" from a .log file, then find the files with those names throughout the ./Documents folder and its subfolders, and store those files in a separate folder (extractedfiles)?
Some filenames also include null spaces.
Commands would be preferable, as they are easy to execute right away without my having to deal with permissions.

Code:
for file in $(grep -rhZ ".xls"  ./thelogfiles/) ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done

So now I am getting the output in the "extractedfiles" folder, except for filenames with null spaces, e.g. this that.xls.
I want files like "this that.xls" to be copied to ./extractedfiles as well.
Question by:mail2vijay1982
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34205386
Should be simple:

for file in "$(grep -rhZ ".xls"  ./thelogfiles/)" ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done

Note the quotes ( "  " ) around the $(grep .. ) expression!

wmp
 
LVL 4

Author Comment

by:mail2vijay1982
ID: 34206036
woolmilkporc:

No, it's not working; the filenames with null spaces are not getting copied to ./extractedfiles.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34206061
I tested with spaces in filenames, and it works for me.

What do you mean by "null space", then? Space? Or null? Or both?
 
LVL 4

Author Comment

by:mail2vijay1982
ID: 34206063
woolmilkporc:

Because of the quotes ( "  " ), no files are copied to the output folder ./extractedfiles.
(At least it copied the filenames without spaces before.)

for file in "$(grep -rhZ ".xls"  ./thelogfiles/)" ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done
 
LVL 4

Author Comment

by:mail2vijay1982
ID: 34206080

woolmilkporc:


I mean a filename like this that.xls (files containing a space).

I want files like "this that.xls" to be copied to ./extractedfiles as well.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34206091
I just saw I did my tests with single quotes around '.xls'. Not sure if this is the reason - can't repeat my tests right now, sorry!

for file in "$(grep -rhZ '.xls'  ./thelogfiles/)" ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34206299
OK,

I see the problem now. I didn't have the time to set up a big test environment, so I used only one file - and that was misleading, because my version works with one file, but not with many.

Try this:

grep -rhZ '.xls'  ./thelogfiles/ | while read file ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done
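
For readers following along, here is a minimal sketch of why the quoted for loop and the while read version behave differently. The three sample names are made up, and the counters exist only for illustration. A here-document feeds the while loop instead of a pipe, so that the counter is still visible after the loop ends.

```shell
# Hypothetical sample data: three names, one per line, two containing a space.
names='123.xls
2 4.xls
this that.xls'

# Quoted command substitution: the loop body runs once, with all lines in one string.
quoted_count=0
for file in "$(printf '%s\n' "$names")"; do
    quoted_count=$((quoted_count + 1))
done

# Unquoted: the shell splits on spaces as well as newlines, so "2 4.xls" falls apart.
unquoted_count=0
for file in $(printf '%s\n' "$names"); do
    unquoted_count=$((unquoted_count + 1))
done

# while read: one iteration per line, embedded spaces preserved.
read_count=0
while IFS= read -r file; do
    read_count=$((read_count + 1))
done <<EOF
$names
EOF

echo "quoted=$quoted_count unquoted=$unquoted_count read=$read_count"
# prints: quoted=1 unquoted=5 read=3
```

Only the while read form sees exactly three names, which matches the observation that the quoted for loop works with a single file but not with many.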
 
LVL 4

Author Comment

by:mail2vijay1982
ID: 34206319
woolmilkporc:

nope, sorry

Because of the quotes ( "  " ), no files are copied to the output folder ./extractedfiles.
(At least it copied the filenames without spaces before.)


Tried this,
for file in "$(grep -rhZ '.xls'  ./thelogfiles/)" ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done
 
LVL 4

Author Comment

by:mail2vijay1982
ID: 34206494

woolmilkporc:

Tried this,
for file in "$(grep -rhZ '.xls'  ./thelogfiles/)" ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done

It's working, but another issue has come up.
Inside ./thelogfiles I have .log files, so we are searching for every string containing "*.xls" and looking for it as a filename in ./Documents.

So now, all files like
123.xls
2_3_4.xls
2 4.xls
are copied to the ./extractedfiles/

But in certain .log files the string looks like this:

file name   123.xls
file name    567.xls

These files are not copied to ./extractedfiles/.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34206651
The command in your last post is not what I suggested in my last post.
 
LVL 4

Author Comment

by:mail2vijay1982
ID: 34206840
woolmilkporc:

sorry tried this,

grep -rhZ '.xls'  ./thelogfiles/ | while read file ; do find ./Documents/ -iname "$file" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done

It's working, but another issue has come up.
Inside ./thelogfiles I have .log files, so we are searching for every string containing "*.xls" and looking for it as a filename in ./Documents.

So now, all files like
123.xls
2_3_4.xls
2 4.xls
are copied to the ./extractedfiles/

But in certain .log files the string looks like this:

file name   123.xls
file name    567.xls

These files are not copied to ./extractedfiles/.
 
LVL 68

Accepted Solution

by:woolmilkporc
woolmilkporc earned 500 total points
ID: 34207292
I assume "file name" in your last sample is not part of the filename itself but some ambiguous text?

If so, addressing such a problem would only be possible if there were no spaces in the actual filenames.
Without those spaces we could simply extract the last word in the line, but how should we know how many of those "last words" would actually compose the filename?

Imagine this

file name 123.xls
file name this that.xls
abc def.xls
567.xls

Let's take the first line! What should we take as the filename? "123.xls"? "name 123.xls"?
Or the second line! You want to see "this that.xls", but it could be "that.xls" or even "name this that.xls"
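
To make the ambiguity concrete, here is a small sketch that lists every trailing run of words a line could contribute as a filename. The helper name candidates is hypothetical and not part of the commands in this thread. Note that awk collapses runs of spaces between fields, so the candidates are rejoined with single spaces.

```shell
# candidates: hypothetical helper that prints every trailing run of words
# in a line, shortest first. Each one is a possible filename.
candidates() {
    printf '%s\n' "$1" | awk '{
        name = $NF
        print name
        for (i = NF - 1; i >= 1; i--) {
            name = $i " " name
            print name
        }
    }'
}

candidates 'file name this that.xls'
# prints:
# that.xls
# this that.xls
# name this that.xls
# file name this that.xls
```

Nothing in the log line itself says which of these four candidates is the real filename.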

We could take the last word only and let "find" do some kind of wildcard search, but I doubt if that's what you desire!


grep -rhZ '.xls'  ./thelogfiles/ | awk '{print $NF}' | while read file ; do find ./Documents/ -iname "*${file}*" -type f -print0 | xargs -0 -i cp '{}' ./extractedfiles ; done

 
LVL 4

Author Comment

by:mail2vijay1982
ID: 34207476
woolmilkporc:

That just worked like a charm.
Can you explain how adding awk and "*${file}*" made the code work?
 
LVL 68

Assisted Solution

by:woolmilkporc
woolmilkporc earned 500 total points
ID: 34207622
I use "awk" to extract the last space-delimited field of each line containing ".xls".
NF is the number of fields in a line; $NF is consequently the content of the last field.
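
A quick sketch of what $NF returns, using the four sample lines from my earlier post. The variable name last_fields is only for illustration.

```shell
# Feed the sample log lines to awk and keep only the last field of each.
last_fields=$(printf '%s\n' \
    'file name 123.xls' \
    'file name this that.xls' \
    'abc def.xls' \
    '567.xls' | awk '{print $NF}')

printf '%s\n' "$last_fields"
# prints:
# 123.xls
# that.xls
# def.xls
# 567.xls
```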

This might or might not be a complete filename (remember the null spaces?) so we must tell
"find" not to search for an exact match, but for a wildcard match, which is achieved
by the asterisks ("*") in *${file}*

Let's take my example from above

file name this that.xls

"grep" finds this line due to ".xls". awk extracts "that.xls" so the final "find" command is

find ./Documents/ -iname "*that.xls*" -type f .. .. ..

A file named "that.xls" will be found, but a file named "this that.xls" or even "name this that.xls" will be found as well!
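
This can be checked in a scratch directory. The sketch below assumes a find that supports -iname (GNU find does); the file names, including other.xls, are made up for the test.

```shell
# Scratch directory with hypothetical file names; removed at the end.
tmp=$(mktemp -d)
touch "$tmp/that.xls" "$tmp/this that.xls" "$tmp/name this that.xls" "$tmp/other.xls"

# Same wildcard pattern as in the find command above.
matches=$(find "$tmp" -type f -iname '*that.xls*' | sort)
match_count=$(printf '%s\n' "$matches" | grep -c .)

printf '%s\n' "$matches"
echo "matched $match_count of 4 files"
# prints the three paths ending in "that.xls", then: matched 3 of 4 files
rm -rf "$tmp"
```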

The whole search is now vague and imprecise (some people call this "fuzzy"), but will yield its results, as we can see.

But be careful: the lines in question should always contain ".xls" somewhere inside their last space-delimited field, or the whole thing will become just too "fuzzy"!
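
For anyone reusing this, below is a slightly more defensive sketch of the same idea, not a replacement for the accepted command. It rests on a few assumptions: -Z is dropped because with -h grep prints no file names for it to terminate; -F makes grep match ".xls" literally; and find's -exec replaces xargs -i, which GNU xargs documents as deprecated in favor of -I. The function name extract_xls and all paths in the check are hypothetical.

```shell
# extract_xls LOGDIR DOCDIR OUTDIR (hypothetical helper)
extract_xls() {
    logdir=$1 docdir=$2 outdir=$3
    mkdir -p "$outdir"
    # -F treats ".xls" as a literal string; as a regex the dot matches any character.
    # sort -u avoids searching for the same last word twice.
    grep -rhF '.xls' "$logdir" | awk '{print $NF}' | sort -u |
    while IFS= read -r file; do
        # Still the fuzzy wildcard match; -exec copes with spaces in matched paths.
        find "$docdir" -type f -iname "*${file}*" -exec cp {} "$outdir" \;
    done
}

# Scratch data for a quick check; every name here is made up.
work=$(mktemp -d)
mkdir -p "$work/logs" "$work/Documents/sub"
printf 'file name   123.xls\nfile name this that.xls\n' > "$work/logs/a.log"
touch "$work/Documents/123.xls" "$work/Documents/sub/this that.xls" "$work/Documents/skip.xls"

extract_xls "$work/logs" "$work/Documents" "$work/extractedfiles"
ls "$work/extractedfiles"
```

In this check, 123.xls and "this that.xls" are copied while skip.xls is left alone, because no log line ends in a word matching it.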

Glad I could help!

Cheers

wmp

