Bash: Faster grep across many files for several strings stored in a file

I have the following working script that greps a directory of many files for specific strings previously saved in a file.

I select the files by extension because their names are random. Note that every string in the strings file should be searched for in all of the files.

I also cut the grep output, because it returns 2 or 3 lines of the matched file and I only want the specific part that shows the file name.

I might be doing something redundant. How could it be faster?

#!/bin/bash
#working but slow
cd /var/FILES_DIRECTORY
while read line
do
    LC_ALL=C fgrep "$line" *.cps | cut -c1-27 >> /var/tmp/test_OUT.txt
done < "/var/tmp/test_STRINGS.txt"
joaotelles Asked:

Anthony Garcia (Devops Staff) Commented:
If you are running on a multicore machine, you can take advantage of GNU parallel.

http://www.gnu.org/software/parallel/man.html#example__parallel_grep

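#!/bin/bash
# untested: runs one grep per file in parallel for each search string
cd /var/FILES_DIRECTORY
while IFS= read -r line
do
    # -q keeps the quoted search string intact when parallel builds the command
    find . -type f | parallel -q -k -j150% grep -H -F "$line" {} | cut -c1-27 >> /var/tmp/test_OUT.txt
done < "/var/tmp/test_STRINGS.txt"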

Something like this might work. I haven't had a chance to test it yet, since I am not sure what the files look like.
Tintin Commented:
What about just doing

fgrep -lf /var/tmp/test_STRINGS.txt /var/FILES_DIRECTORY/*.cps >/var/tmp/test_OUT.txt
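The one-liner above is faster because grep reads every search string once (-f) and scans each file a single time, instead of rescanning all files for each string. With -l it prints only the names of matching files, so no cut is needed to isolate the name. A minimal sketch of the same idea wrapped in a function follows; the name list_matching_files and the sed step that strips the directory prefix are illustrative assumptions, not part of the original suggestion.

```bash
#!/bin/bash
# Sketch only: list_matching_files is a hypothetical helper name.
# Prints the base names of the *.cps files in a directory that contain
# at least one of the fixed strings listed (one per line) in a file.
list_matching_files() {
    local strings_file="$1"
    local dir="$2"
    # -F: treat patterns as fixed strings (what fgrep does)
    # -l: print only the names of files with at least one match
    # -f: read the patterns from strings_file
    # LC_ALL=C is kept from the original script to avoid locale overhead.
    # sed removes everything up to the last slash, leaving the file name.
    LC_ALL=C grep -F -l -f "$strings_file" -- "$dir"/*.cps | sed 's|.*/||'
}

# Example call with the paths used in the thread (commented out here):
# list_matching_files /var/tmp/test_STRINGS.txt /var/FILES_DIRECTORY > /var/tmp/test_OUT.txt
```

Note that an empty line in the strings file matches every line, so every file would be listed; it may be worth checking the file for blank lines first.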

joaotelles (Author) Commented:
Thanks, it's really faster.

I am running:

fgrep -lf /var/tmp/$var1/test_STRINGS.txt /var/FILES_DIRECTORY/*var2*.cps |cut -c1-27 > /var/tmp/test_OUT.txt

This runs inside a for loop, but something strange is happening.

It works for the first iteration, but the following iterations result in:

fgrep: can't open /var/FILES_DIRECTORY/*var2*.cps

var1 and var2 are being substituted correctly; I can see that in other parts of the script.

But it shouldn't be trying to open this file as input; it should open "/var/tmp/$var1/test_STRINGS.txt"

Do you have any insight?
Tintin Commented:
What does the for loop look like?
joaotelles (Author) Commented:
The following code works for the previous day's log and for the log from the day before that, but fails on the third one with:

fgrep: can't open *20130929*.file

#!/bin/bash
logs_dir=/var/XXX 
files_dir=/var/XXX 

cd $logs_dir;  
ls -t | head -n 4 | grep 2013 > logs_names.txt    #Current name and 3 days before.

for ((i=2; i<=4; i+=1)); do 

cd $logs_dir
daylog=$(sed -n ''$i'p' logs_names.txt)   #read from file 
day_info=$(sed -n ''$i'p' logs_names.txt | cut -c10-17)  # e.g 20131002
arquivo=`ls | grep "$daylog"`  #file name to be manipulated 

more $arquivo | grep "MATCH STRING IN LOG FILE" | cut -c184-202 > /var/tmp/GREP_result_$day_info.log

cd $files_dir
fgrep -lf /var/tmp/GREP_result_$day_info.log *$day_info*.file | cut -c56-82 > /var/tmp/FILES_name_result_$day_info.txt

done
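One possible explanation for the "can't open" message, offered as an assumption since the directory contents are not shown: when no file in the current directory matches a glob such as *20130929*.file, the shell by default passes the pattern to the command unchanged, so fgrep tries to open a file literally named *20130929*.file. The sketch below checks that the glob matched something before calling grep. The function name grep_if_files_exist and its arguments are hypothetical.

```bash
#!/bin/bash
# Sketch only: grep_if_files_exist is a hypothetical helper name.
# Runs the fixed-string search only when the glob matches at least one file.
grep_if_files_exist() {
    local patterns_file="$1"
    local dir="$2"
    local day_info="$3"

    # Expand the glob into the function's positional parameters.
    # If nothing matches, $1 holds the unexpanded pattern itself.
    set -- "$dir"/*"$day_info"*.file
    if [ ! -e "$1" ]; then
        echo "no files matching *${day_info}*.file in $dir" >&2
        return 1
    fi

    # Same search as in the loop: fixed strings, file names only.
    LC_ALL=C grep -F -l -f "$patterns_file" -- "$@"
}
```

Inside the loop, a failed check could be used to skip that day (for example with continue) instead of letting fgrep report the unmatched pattern.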
