  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 630

shell script to read line by line and search for a string

Hi,

I would like to have a shell script that reads a huge file line by line and searches for two strings. If it finds either one of the strings, it has to write the line to another file.

1. The shell script should accept the source file name as an argument.
2. Search for two strings. If string1 is in the current line, write the line to a file. (No need to search for string2 on the same line.)
3. If string2 is in the current line, append the line to the same file as above.

string1 and string2 cannot be on the same line.

Thanks in advance.
Asked by: enthuguy
1 Solution
 
farzanjCommented:
Try this:

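USAGE: ./scriptname keyword1 keyword2 sourceFile targetFile

#!/bin/bash

FILE1=$3   # source file
FILE2=$4   # target file
KEY1=$1
KEY2=$2

# Read the source file line by line and append every line
# that matches either keyword to the target file.
while IFS= read -r line
do
     if echo "$line" | grep -Eq -- "$KEY1|$KEY2"
     then
             echo "$line" >> "$FILE2"
     fi
done < "$FILE1"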

 
ozoCommented:
grep 'string1\|string2' "$1" >> file
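
A portable variant of this one-liner, as a minimal sketch: the `\|` alternation is a GNU extension to basic regular expressions, while repeating `-e` is standard. The function name `find_either` is hypothetical, and `-F` (treat the strings as fixed text rather than regular expressions) is an assumption about what the asker wants.

```shell
#!/bin/sh
# find_either FILE STRING1 STRING2   (hypothetical helper name)
# Print every line of FILE that contains STRING1 or STRING2.
# -F matches the strings literally; each -e adds one pattern;
# "--" ends the options so FILE is never mistaken for an option.
find_either() {
    grep -F -e "$2" -e "$3" -- "$1"
}
```

Called as `find_either source.log string1 string2 >> file`, it appends the matching lines to `file` just like the one-liner above.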
 
farzanjCommented:
Nice and quick solution Ozo except that it doesn't really read the file line by line :(
 
ozoCommented:
grep works line by line
 
simon3270Commented:
Strictly following requirements:

Usage: findstrings.sh Input_file

where findstrings.sh is:
#!/bin/sh

infile=$1
match1="string1_to_match"
match2="2nd string"
outfile="output.log"

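while IFS= read -r line
do
  # grep prints the line itself when it matches, so no separate echo is needed
  if echo "$line" | grep -- "$match1"
  then
    :
  else
    echo "$line" | grep -- "$match2"
  fi
done < "$infile" > "$outfile"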
 
farzanjCommented:
@Ozo:  Here is what I understand.  grep is built on the Boyer-Moore algorithm, which doesn't work line by line.
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

The original author of the GNU grep utility says that it does NOT work line by line.
Here is the post.
http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

Do you have a reference that shows that grep actually works line by line?
 
ozoCommented:
man grep

DESCRIPTION
       Grep  searches  the named input FILEs (or standard input if no files are named, or the file
       name - is given) for lines containing a match to  the  given  PATTERN.
 
farzanjCommented:
Yes, but that does not tell us about the actual algorithm.  You are a big expert.  I don't have to tell you that you can find the keywords first and then print the lines that contain them.  This statement is simply describing the output, not the algorithm.  The references I gave you describe the actual algorithms, and one contains a statement from the person who actually programmed GNU grep.
 
ozoCommented:
I thought the question was a request for a specified output, not a request for a specified string matching algorithm.
(which none of the other answers has supplied either)
 
farzanjCommented:
@Ozo:  We look up to you.  Could you then show what the right solution would be, one that satisfies the requirements?  Many thanks.
 
ozoCommented:
It looks to me like all the answers satisfy the requirements.
The requirements do not say that the two strings and output file should be accepted as arguments,
but they do not explicitly forbid it either.

I do see two potential ambiguities in the question.
Are we meant to make a distinction between a "write" when string1 is found, and an "append" when string2 is found?
It is also not entirely clear whether "string1 and string2 cannot be on the same line." is a statement about the source file, or about the desired output.
If it is about the output, then how to deal with string1 and string2 on the same line in the input is unclear.
 
enthuguyAuthor Commented:
Thanks all for your suggestions and input.

Will try this tomorrow at work and update.

Thanks again
 
enthuguyAuthor Commented:
Thanks all

To clarify: string1 and string2 cannot be on the same line. Sorry if I confused you guys.

For easier readability, I would like to have 2 separate log files, one for each string found.

Below is what I extended from the above script given by ozo, but for some reason it doesn't proceed past "while read line" and it hangs.

Please help

==================
#!/bin/sh

infile=$1
match1="success for login id"
match2="Incorrect password for login id"
outfile1="success_login.log"
outfile2="Incorrect_pwd.log"

echo $match1
echo $match2

while read line
do
if  echo $line | grep "$match1"
then
echo $line
echo $line >> $outfile1
elif  echo $line | grep "$match2"
then
echo $line
#echo $line >> $outfile2
else
echo "Did not find matching string"
fi
done

echo "Finished"
==================
 
simon3270Commented:
It was my script, rather than ozo's, but never mind.

The near-last line should be

    done < $infile

to read from that file.

Also, the hidden trick in mine was that the grep lines did two things: they returned success if they found the line, and they output the actual line.  You should change the if lines to be
    if  echo $line | grep -q "$match1"
The "-q" means that grep does the search but doesn't output any matching lines.  Your "echo" a couple of lines lower does that.
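
Since only the success/failure result is needed at this point, the same test can also be done with the shell's own `case` pattern matching, which avoids starting a grep process for every line. This is only a sketch: the helper name `classify_line` is hypothetical, and it matches the strings as literal text rather than as regular expressions.

```shell
#!/bin/sh
# Match strings taken from the script above.
match1="success for login id"
match2="Incorrect password for login id"

# classify_line LINE   (hypothetical helper name)
# Prints "match1", "match2" or "none" depending on which string LINE contains.
# Quoting "$match1" inside the pattern makes the shell match it literally.
classify_line() {
    case $1 in
        *"$match1"*) echo match1 ;;
        *"$match2"*) echo match2 ;;
        *)           echo none ;;
    esac
}
```

Inside the loop this would be used as `case $line in *"$match1"*) echo "$line" >> "$outfile1" ;; esac`, with a second branch for `$match2`.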
 
enthuguyAuthor Commented:
Apologies for referring to the wrong name.

But thanks so much, I think I'm good for now.

One last request: I would like to add a counter, say how many lines were found for string1 and string2, reported at the end of the run. How can I achieve this?

btw, I will close this question anyway but if you could help on last counter thing....that would be great.

thanks again
 
simon3270Commented:
You can keep count of lines several ways, but one is to add:
    m1count=0
    m2count=0
before the "while" line, then around the lines where you echo the matching lines to the output files, add this (the spaces round the "+" are important)
    m1count=$(expr $m1count + 1)
immediately after
    echo $line >> $outfile1
(i.e. between that line and the "elif" line)
That's an old-fashioned way of doing maths - later shells (e.g. bash) allow things like:
    ((m2count=m2count+1))
(unlike the "expr" line, you don't need spaces round the symbols).

Then after the "done < $infile" line, have something like:

    echo Found $m1count lines with \"$match1\"
    echo Found $m2count lines with \"$match2\"

One last thing - if your "match" string may start with a hyphen (I just tried searching for the string "-v"), change the grep lines to:

    if  echo $line | grep -q -- "$match1"

The "--" tells grep (and almost all GNU programs) that you have stopped giving options, and everything after the "--" is to be treated as text arguments to the command.
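
A small sketch of that "--" behaviour, wrapped in a hypothetical helper named `has_text`. The `-F` option (literal matching) is an assumption added here so that a search string such as "-v" is not also interpreted as a regular expression.

```shell
#!/bin/sh
# has_text PATTERN   (hypothetical helper name)
# Succeeds if any line on standard input contains PATTERN.
# Without "--", a PATTERN such as "-v" would be parsed as a grep option.
has_text() {
    grep -q -F -- "$1"
}
```

For example, `echo "run with -v flag" | has_text "-v"` succeeds, while `echo "quiet run" | has_text "-v"` fails.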

I've also added "> $outfile1" etc to empty out the files - otherwise they will just get bigger every time you run the script.  If you don't mind that, and want to keep old records too, just omit those lines.  The script here will report the number of lines it has added in this run.

So, the final script looks like:
#!/bin/bash
# bash is needed for the (( ... )) arithmetic used below

infile=$1
match1="success for login id"
match2="Incorrect password for login id"
outfile1="success_login.log"
outfile2="Incorrect_pwd.log"

m1count=0
m2count=0

# Empty the output files before each run
> "$outfile1"
> "$outfile2"

while IFS= read -r line
do
    if echo "$line" | grep -q -- "$match1"
    then
        echo "$line" >> "$outfile1"
        m1count=$(expr $m1count + 1)
    elif echo "$line" | grep -q -- "$match2"
    then
        echo "$line" >> "$outfile2"
        ((m2count = m2count + 1))
    fi
done < "$infile"

echo "Found $m1count instances of \"$match1\""
echo "Found $m2count instances of \"$match2\""

echo "Finished"


One thing about this code: it will be painfully slow on big files, and ozo's version will be *much* quicker!  You can still get things like line counts: just do "wc -l < $outfile1".
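
A rough sketch of that quicker approach, using the same match strings and output file names as the script above. The function name `split_logins` is hypothetical, and `-F` (literal matching) is an assumption. The second pipeline first drops lines containing match1, which mirrors the if/elif order of the loop.

```shell
#!/bin/sh
# split_logins INFILE   (hypothetical helper name)
# Let grep scan the whole file instead of looping in the shell.
split_logins() {
    infile=$1
    match1="success for login id"
    match2="Incorrect password for login id"
    outfile1="success_login.log"
    outfile2="Incorrect_pwd.log"

    # Lines containing match1 go to the first log.
    grep -F -- "$match1" "$infile" > "$outfile1"
    # Lines containing match2 (but not match1) go to the second log.
    grep -F -v -- "$match1" "$infile" | grep -F -- "$match2" > "$outfile2"

    # $(( ... )) strips the padding that some wc implementations print.
    m1count=$(( $(wc -l < "$outfile1") ))
    m2count=$(( $(wc -l < "$outfile2") ))
    echo "Found $m1count lines with \"$match1\""
    echo "Found $m2count lines with \"$match2\""
}
```

Like the loop version, this overwrites both log files on every run.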
 
simon3270Commented:
BTW, I had a quick look at the GNU grep source - it will normally search on large buffers, but some options (e.g. "-i" to ignore case) will force it to search line by line.  I think it uses Boyer-Moore in both cases.

Other greps (BSD, UNIX, Solaris) may well still have line-by-line searches.