bt707
asked on
Grep Limit
Is there a way to use grep and set a limit on how far into each line it will search for the string you want?
Ex:
I want to search for a string only if it is in the first 200 characters of the line.
ASKER
That looks like it could do just what I need, and I don't care about the line being truncated, but I'm getting this error when trying to run it:
awk '{if ( substr($0,1,200) ~ "@my.domain.com rfc822" ) {print $0}}' logfile1
awk: syntax error near line 1
awk: illegal statement near line 1
awk: syntax error near line 1
awk: bailing out near line 1
You could use the cut command first:
#!/bin/bash
# Must have at least two args: pattern file...
if [ $# -lt 2 ] ; then
    cat <<END
usage: $0 pattern files
Looks for regex pattern in the first 200 chars of each line.
END
else
    pattern=$1
    shift 1
    for f in "$@" ; do
        tempf=`basename "$f"`.$$
        grep -H -n ".*" "$f" | cut -d":" -f1,2 > "$tempf"
        cut -c1-20 "$f" | paste -d":" "$tempf" - | grep -e "$pattern"
        rm -f "$tempf"
    done
fi
Oops, that should be cut -c1-200
bt707, if I recall correctly, you're running Solaris; I've just tested it on Solaris 8 using nawk instead of basic awk, and nawk executes without errors:
nawk '{if ( substr($0,1,200) ~ "@my.domain.com rfc822" ) {print $0}}' logfile1
This simplified version might work with standard Solaris awk, but I won't be able to test it until tomorrow...
awk ' substr($0,1,200) ~ "@my.domain.com rfc822" {print $0}' logfile1
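As a quick sanity check of the substr() approach, here it is run on two sample lines (the "needle" pattern and filler data are illustrative only, standing in for the real pattern and log file):

```shell
# Two test lines: one with the pattern early, one where it only appears
# after 250 filler characters (i.e. past the 200-char limit).
printf 'short line with needle\n%0250dneedle too late\n' 0 |
    awk 'substr($0, 1, 200) ~ "needle"'
# Prints only: short line with needle
```

Note that a bare pattern in awk prints the matching line by default, which is why no explicit `{print $0}` action is needed here.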
Posted again with minor corrections:
#!/bin/bash
# Must have at least two args: pattern file...
if [ $# -lt 2 ] ; then
    cat <<END
usage: $0 pattern files
Looks for regex pattern in the first 200 chars of each line.
END
else
    pattern=$1
    shift 1
    for f in "$@" ; do
        tempf=/tmp/`basename "$f"`.$$
        grep -H -n ".*" "$f" | cut -d":" -f1,2 > "$tempf"
        cut -c1-200 "$f" | paste -d":" "$tempf" - | grep -h -e "$pattern"
        rm -f "$tempf"
    done
fi
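To see what the pipeline produces, here is its core run on a throwaway two-line file (the file name and the "needle" pattern are hypothetical stand-ins): `grep -H -n ".*"` builds file:lineno pairs, `cut -c1-200` truncates each line, and `paste` stitches them back together before the final grep.

```shell
f=/tmp/grep_limit_demo.$$                 # hypothetical sample file
printf 'needle here\nno match\n' > "$f"

tempf=/tmp/`basename "$f"`.idx.$$
grep -H -n ".*" "$f" | cut -d":" -f1,2 > "$tempf"    # file:lineno per line
cut -c1-200 "$f" | paste -d":" "$tempf" - | grep -h -e "needle"
rm -f "$f" "$tempf"
# Prints the match as: /tmp/grep_limit_demo.<pid>:1:needle here
```

The matched text shown is the truncated line, but the file name and line number let you pull the full line afterwards if needed.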
ASKER CERTIFIED SOLUTION
ASKER
tfewster, thanks for all of your great answers. I used this last one and it works great, just what I needed. I haven't had time to try the other ones, but I'm sure they're all good.
Thanks again to All:
cut -c1-200 filename | grep pattern
The obvious problem is that it will only return the truncated line... but you could use the first pass to generate a list of matching line numbers and a second pass to extract those (complete) lines, e.g.
for LINE in `cut -c1-200 filename | grep -n pattern | awk -F: '{printf $1 " "}'`
do
    sed -n "${LINE}p" filename
done
But that would be inefficient for large files, so maybe use awk instead...
awk '{if ( substr($0,1,200) ~ "PATTERN" ) {print $0}}' filename
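For completeness, plain grep can also enforce the limit on its own with a BRE interval, provided the grep in use supports the POSIX \{m,n\} syntax ("needle" here is a stand-in for the real pattern):

```shell
# At most 199 characters may precede the match, so "needle" must start
# within the first 200 columns of the line.
printf 'early needle\n%0250dneedle too late\n' 0 |
    grep '^.\{0,199\}needle'
# Prints only: early needle
```

Unlike the cut-based approach, this prints the full untruncated line, though interval upper bounds are capped on some older greps.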