shell script to read line by line and search for a string

Hi,

I would like a shell script that reads a huge file line by line and searches for two strings. If it finds either string, it should write the line to another file.

1. The shell script should accept the source file name as an argument.
2. Search for two strings. If string1 is in the current line, write the line to a file. (No need to search for string2 on the same line.)
3. If string2 is in the current line, append the line to the same file.

string1 and string2 cannot be on the same line.

Thanks in advance.
enthuguyAsked:

farzanjCommented:
Try this:

USAGE: ./scriptname keyword1 keyword2 sourceFile targetFile

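#!/bin/bash

KEY1=$1
KEY2=$2
FILE1=$3   # source file
FILE2=$4   # target file

# Read the source file line by line; IFS= and -r keep each line intact
while IFS= read -r line
do
     # -E enables alternation; -q suppresses output, so only the exit status is used
     if echo "$line" | grep -Eq -- "$KEY1|$KEY2"
     then
             echo "$line" >> "$FILE2"
     fi
done < "$FILE1"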


ozoCommented:
grep 'string1\|string2' "$1" >> file
farzanjCommented:
Nice and quick solution Ozo except that it doesn't really read the file line by line :(
ozoCommented:
grep works line by line
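A minimal sketch of what "line by line" means from the caller's side: whatever search algorithm grep uses internally, its output is the set of matching lines. The sample file and its contents are made up for illustration, and the two `-e` options are simply another way of giving grep two patterns.

```shell
#!/bin/sh
# Hypothetical sample data, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'alpha string1 here' 'nothing to see' 'beta string2 here' > "$sample"

# -n prefixes each matching line with its line number;
# each -e supplies one pattern, and a line matching either is printed.
matches=$(grep -n -e 'string1' -e 'string2' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Lines 1 and 3 are reported, each as a whole line, which is the behaviour the question asks for.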
simon3270Commented:
Strictly following requirements:

Usage: findstrings.sh Input_file

where findstrings.sh is:
#!/bin/sh

infile=$1
match1="string1_to_match"
match2="2nd string"
outfile="output.log"

while IFS= read -r line
do
  # grep prints the line when it matches, so a hit on match1 needs no further action
  if echo "$line" | grep -- "$match1"
  then
    :
  else
    echo "$line" | grep -- "$match2"
  fi
done < "$infile" > "$outfile"
farzanjCommented:
@Ozo:  Here is what I understand.  grep uses the Boyer-Moore algorithm, which doesn't work line by line.
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

The original writer of GNU grep utility says that it does NOT work line by line.
Here is the post.
http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

Do you have a reference that shows that grep actually works line by line?
ozoCommented:
man grep

DESCRIPTION
       Grep  searches  the named input FILEs (or standard input if no files are named, or the file
       name - is given) for lines containing a match to  the  given  PATTERN.
farzanjCommented:
Yes, but that does not say anything about the actual algorithm.  You are a big expert.  I don't have to tell you that you can find the keywords first and then print the lines that contain them.  This statement simply describes the output, not the algorithm.  The references I gave describe the actual algorithms, and one contains a statement from the person who actually programmed GNU grep.
ozoCommented:
I thought the question was a request for a specified output, not a request for a specified string matching algorithm.
(which none of the other answers has supplied either)
farzanjCommented:
@Ozo:  We look up to you.  Could you then show what the right solution would be to satisfy the requirements?  Many thanks.
ozoCommented:
It looks to me like all the answers satisfy the requirements.
The requirements do not say that the two strings and output file should be accepted as arguments,
but it does not explicitly forbid it either.

I do see two potential ambiguities in the question.
Are we meant to make a distinction between a "write" when string1 is found, and an "append" when string2 is found?
It is also not entirely clear whether "string1 and string2 cannot be on the same line." is a statement about the source file, or about the desired output.
If it is about the output, then how to deal with string1 and string2 on the same line in the input is unclear.
enthuguyAuthor Commented:
Thanks all for your suggestions and input.

I will try this tomorrow at work and post an update.

Thanks again
enthuguyAuthor Commented:
Thanks all

To clarify: string1 and string2 cannot be on the same line. Sorry if I confused you.

For easier reading, I would like two separate log files, one for each string found.

Below is what I extended from the script given by ozo above, but for some reason it does not get past "while read line" and hangs.

Please help

==================
#!/bin/sh

infile=$1
match1="success for login id"
match2="Incorrect password for login id"
outfile1="success_login.log"
outfile2="Incorrect_pwd.log"

echo $match1
echo $match2

while read line
do
if  echo $line | grep "$match1"
then
echo $line
echo $line >> $outfile1
elif  echo $line | grep "$match2"
then
echo $line
#echo $line >> $outfile2
else
echo "Did not find matching string"
fi
done

echo "Finished"
==================
simon3270Commented:
It was my script, rather than ozo's, but never mind.

The near-last line should be

    done < $infile

to read from that file.
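A small sketch of why the redirection matters: without it, "read" waits on standard input (normally the terminal), which looks like a hang. The input file here is a made-up temporary file, and the counter is only there to show that the loop really consumed it.

```shell
#!/bin/sh
# Hypothetical input file, created only for this demonstration.
infile=$(mktemp)
printf '%s\n' 'first line' 'second line' > "$infile"

count=0
# The "< $infile" after "done" feeds the file to the loop's standard input.
while IFS= read -r line
do
    count=$((count + 1))
    printf 'line %s: %s\n' "$count" "$line"
done < "$infile"

echo "Read $count lines"
rm -f "$infile"
```

Because the loop is fed by a redirection rather than a pipe, it runs in the current shell and `count` keeps its value after `done`.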

Also, the hidden trick in mine was that the grep lines did two things: grep returned success if it found the string, and it output the actual line.  You should change the if lines to be
    if  echo $line | grep -q "$match1"
The "-q" means that grep does the search, but doesn't output any matching lines.  Your "echo" a couple of lines lower does that.
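To see the difference, here is a minimal sketch that runs the same search with and without "-q". The sample line is invented; the variable names `plain_out`, `quiet_out` and `quiet_status` are only for this illustration.

```shell
#!/bin/sh
line='success for login id 42'   # made-up sample line
match1='success for login id'

# Plain grep: prints the matching line and returns exit status 0.
plain_out=$(echo "$line" | grep "$match1")

# grep -q: prints nothing, but still returns exit status 0 on a match.
quiet_out=$(echo "$line" | grep -q "$match1")
quiet_status=$?

echo "plain output: [$plain_out]"
echo "quiet output: [$quiet_out], status: $quiet_status"
```

The "if" only looks at the exit status, so "-q" changes what is printed without changing which branch is taken.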

enthuguyAuthor Commented:
Apologies for using the wrong name.

But thanks so much, I think I'm good for now.

One last request: I would like to add a counter showing how many lines were found for string1 and string2 once the end of the file is reached. How can I achieve this?

By the way, I will close this question anyway, but if you could help with the counter, that would be great.

thanks again
simon3270Commented:
You can keep count of lines several ways, but one is to add:
    m1count=0
    m2count=0
before the "while" line, then around the lines where you echo the matching lines to the output files, add this (the spaces round the "+" are important)
    m1count=$(expr $m1count + 1)
immediately after
    echo $line >> $outfile1
(i.e. between that line and the "elif" line)
That's an old-fashioned way of doing maths - later shells (e.g. bash) allow things like:
    ((m2count=m2count+1))
(unlike the "expr" line, you don't need spaces round the symbols).

Then after the "done < $infile" line, have something like:

    echo Found $m1count lines with \"$match1\"
    echo Found $m2count lines with \"$match2\"
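The two counting styles can be tried on their own, outside the loop. This is only a sketch: the `$(( ))` form shown here is the POSIX arithmetic expansion, which is not the bash-only `(( ))` form mentioned above but does the same job and also works under a plain /bin/sh.

```shell
#!/bin/sh
m1count=0
m2count=0

# Old style: expr is an external command and needs spaces around "+".
m1count=$(expr "$m1count" + 1)
m1count=$(expr "$m1count" + 1)

# POSIX arithmetic expansion: no external process, spaces are optional.
m2count=$((m2count+1))

echo "m1count=$m1count m2count=$m2count"
```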

One last thing - if your "match" string may start with a hyphen (I just tried searching for the string "-v"), change the grep lines to:

    if  echo $line | grep -q -- "$match1"

The "--" tells grep (and almost all GNU programs) that you have stopped giving options, and everything after the "--" is to be treated as text arguments to the command.
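A quick sketch of the hyphen case, using an invented sample line. Without the "--", grep would read "-v" as its invert-match option and then have no pattern to search for.

```shell
#!/bin/sh
line='run with -v for verbose output'   # made-up sample line
match='-v'

# "--" ends option parsing, so "-v" is taken as the pattern to search for.
if echo "$line" | grep -q -- "$match"
then
    result='found'
else
    result='not found'
fi
echo "$result"
```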

I've also added "> $outfile1" etc to empty out the files - otherwise they will just get bigger every time you run the script.  If you don't mind that, and want to keep old records too, just omit those lines.  The script here will report the number of lines it has added in this run.

So, the final script looks like:
#!/bin/sh

infile=$1
match1="success for login id"
match2="Incorrect password for login id"
outfile1="success_login.log"
outfile2="Incorrect_pwd.log"

m1count=0
m2count=0

# Empty the output files so each run starts fresh
> "$outfile1"
> "$outfile2"

while IFS= read -r line
do
    if  echo "$line" | grep -q -- "$match1"
    then
        echo "$line" >> "$outfile1"
        m1count=$(expr $m1count + 1)
    elif  echo "$line" | grep -q -- "$match2"
    then
        echo "$line" >> "$outfile2"
        # POSIX arithmetic; the bash-only ((m2count = m2count + 1)) fails under a strict /bin/sh
        m2count=$((m2count + 1))
    fi
done < "$infile"

echo "Found $m1count instances of \"$match1\""
echo "Found $m2count instances of \"$match2\""

echo "Finished"


One thing about this code: it will be painfully slow on big files, because it starts a new grep process for every line. ozo's version will be *much* quicker!  You can still get things like line counts: just do "wc -l < $outfile1".
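As a rough sketch of that faster approach, the script below runs grep once per string over the whole file and counts the results with "wc -l". The sample log and the temporary directory are made up for the demonstration (in real use infile would be "$1"), and the "-F" option, which treats the pattern as a fixed string rather than a regular expression, is an addition of this sketch.

```shell
#!/bin/sh
# Hypothetical log created for the demo; in real use infile would be "$1".
workdir=$(mktemp -d)
infile="$workdir/app.log"
printf '%s\n' \
    'success for login id 1' \
    'Incorrect password for login id 2' \
    'unrelated line' \
    'success for login id 3' > "$infile"

match1="success for login id"
match2="Incorrect password for login id"
outfile1="$workdir/success_login.log"
outfile2="$workdir/Incorrect_pwd.log"

# One grep per string over the whole file, instead of one grep per line.
grep -F -- "$match1" "$infile" > "$outfile1"
grep -F -- "$match2" "$infile" > "$outfile2"

# $(( )) strips the leading spaces that some wc implementations print.
m1count=$(( $(wc -l < "$outfile1") ))
m2count=$(( $(wc -l < "$outfile2") ))

echo "Found $m1count instances of \"$match1\""
echo "Found $m2count instances of \"$match2\""

rm -rf "$workdir"
```

Unlike the if/elif loop, a line containing both strings would land in both files here; that does not matter for this question, since the two strings never share a line.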
simon3270Commented:
BTW, I had a quick look at the GNU grep source - it will normally search on large buffers, but some options (e.g. "-i" to ignore case) will force it to search line by line.  I think it uses Boyer-Moore in both cases.

Other greps (BSD, UNIX, Solaris) may well still have line-by-line searches.