Grep Limit

Is there a way to use grep and limit how far into each line it searches for the string you want?

Example:
I want to match a string only if it appears in the first 200 characters of the line.

tfewster

How about:
cut -c1-200 filename |grep pattern

The obvious problem is that it will only return the truncated line. But you could use a first pass to generate a list of matching line numbers and a second pass to extract those (complete) lines, e.g.

for LINE in `cut -c1-200 filename | grep -n pattern | awk -F: '{print $1}'`
do
sed -n "${LINE}p" filename
done

But that would be inefficient for large files, so maybe use awk instead...

awk '{if ( substr($0,1,200) ~ "PATTERN" ) {print $0}}' filename
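A minimal sketch of the same idea wrapped in a shell function, so the limit and pattern are not hard-coded. The name first_n_grep and its argument order are made up for illustration, and it assumes an awk that accepts -v (nawk, gawk or a POSIX awk, not the old Solaris /usr/bin/awk):

```shell
# first_n_grep LIMIT PATTERN FILE...
# Hypothetical helper: prints each complete line whose first LIMIT
# characters match PATTERN (treated as an awk regular expression).
first_n_grep() {
    limit=$1
    pattern=$2
    shift 2
    # substr($0, 1, n) is the first n characters of the line; only that
    # part is tested, but the whole line is printed on a match.
    awk -v n="$limit" -v pat="$pattern" 'substr($0, 1, n) ~ pat { print }' "$@"
}
```

Usage would look like: first_n_grep 200 'PATTERN' filename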

bt707

ASKER

That looks like it could do just what I need. I don't care about the line being truncated, but I'm getting this error when I try to run it:

awk '{if ( substr($0,1,200) ~ "@my.domain.com rfc822" ) {print $0}}' logfile1
awk: syntax error near line 1
awk: illegal statement near line 1
awk: syntax error near line 1
awk: bailing out near line 1

brettmjohnson

You could use the cut command first:

#!/bin/bash
# Must have at least two args: pattern file...

if [ $# -lt 2 ] ; then
cat <<END
usage: $0 pattern files
Looks for regex pattern in first 200 chars of each line of each file.
END

else
pattern=$1
shift 1

for f in "$@" ; do
tempf=`basename $f`.$$
grep -H -n ".*" $f | cut -d":" -f1,2 > $tempf
cut -c1-20 $f | paste -d":" $tempf - | grep -e "$pattern"
rm -f $tempf
done
fi

brettmjohnson

Oops, that should be cut -c1-200

tfewster

bt707, if I recall correctly, you're running Solaris; I've just tested it on Solaris 8 using nawk instead of basic awk, and nawk executes without errors:

nawk '{if ( substr($0,1,200) ~ "@my.domain.com rfc822" ) {print $0}}' logfile1

tfewster

This simplified version might work with standard Solaris awk, but I won't be able to test it until tomorrow:
awk ' substr($0,1,200) ~ "@my.domain.com rfc822" {print $0}' logfile1
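One thing to be aware of: the quoted string is used as a regular expression, so each "." in @my.domain.com matches any character. If a literal match is wanted, a possible variation (a sketch only, not one of the commands tested above) is to use index(), which does a plain substring search. The function name fixed_prefix_match is hypothetical:

```shell
# Hypothetical helper: prints complete lines whose first 200 characters
# contain the literal string "@my.domain.com rfc822".
# index() returns 0 when the string is absent, so the dots are not wildcards.
fixed_prefix_match() {
    awk 'index(substr($0, 1, 200), "@my.domain.com rfc822") > 0' "$@"
}
```

Usage would look like: fixed_prefix_match logfile1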

brettmjohnson

Posted again with minor corrections:

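#!/bin/bash
# Must have at least two args: pattern file...
if [ $# -lt 2 ] ; then
    cat <<END
usage: $0 pattern files
Looks for regex pattern in first 200 chars of each line of each file.
END

else
    pattern=$1
    shift 1

    for f in "$@" ; do
        # Temp file holding a "filename:lineno" prefix for every line of $f.
        tempf=/tmp/`basename "$f"`.$$
        grep -H -n ".*" "$f" | cut -d":" -f1,2 > "$tempf"
        # Glue each prefix onto the truncated line, then search the result.
        # Note: the pattern is matched against the prefix as well as the text.
        cut -c1-200 "$f" | paste -d":" "$tempf" - | grep -h -e "$pattern"
        rm -f "$tempf"
    done
fi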
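For comparison, here is a sketch that produces similar filename:lineno:text output in a single awk pass, with no temporary file. It tests the pattern against the truncated text only, so the pattern cannot accidentally match the filename or line number. The function name prefix_grep_report is hypothetical, and it assumes an awk that accepts -v (nawk, gawk or a POSIX awk):

```shell
# prefix_grep_report PATTERN FILE...
# Hypothetical helper: for every line whose first 200 characters match
# PATTERN, prints "filename:lineno:" followed by those 200 characters.
prefix_grep_report() {
    pattern=$1
    shift
    awk -v pat="$pattern" '
        { head = substr($0, 1, 200) }                    # truncated copy of the line
        head ~ pat { print FILENAME ":" FNR ":" head }   # FNR restarts for each file
    ' "$@"
}
```

Usage would look like: prefix_grep_report 'PATTERN' logfile1 logfile2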

ASKER CERTIFIED SOLUTION

tfewster

bt707

ASKER

tfewster, thanks for all of your great answers. I used this last one and it works great, just what I needed. I haven't had time to try the other ones, but I'm sure they are all good.

Thanks again to all.