help using grep with a fixed string

heya guys:

I've got a logfile full of lines with varying regular expressions, and I need to grep through the file, looking for specific patterns

this is part of a bash script that searches through an ircd spam filter log and shows various details - here are the lines I'm having trouble with:

pattern="^FREE .+ pics and movies (www.siteA.da.ru|wWw.siteB.oRg)$"
grep -F "$pattern" spamfilter.log

spamfilter.log is filled with lines similar to this:

[Sun Jul  3 18:19:27 2005] - [Spamfilter] [|didosch|]!~wkngfhdvs@49f837.w82-123.abo.wanadoo.fr matches filter '^FREE .+ pics and movies (www\.siteA\.da\.ru|wWw\.siteB\.oRg)$': [PRIVMSG DarkReal: 'FREE Porn pics and movies www.siteA.da.ru'] [Spamming a porn url to users. Scan your pc for viruses.]

the various logfile lines have many different regex patterns, but that's an example of one of them

so, since what I want to search for is itself a regular expression, I can't easily use a regular expression to search for it, so I use the -F switch to tell grep that it's a fixed string, but I can't get the correct results.. any ideas?



HiT5698 Asked:

ozo Commented:
Your pattern does not match the line in the file you show above

pattern="^FREE .+ pics and movies (www\.siteA\.da\.ru|wWw\.siteB\.oRg)$"
grep -F "$pattern" spamfilter.log

works on that example
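
To illustrate the point above, here is a minimal sketch (not ozo's own test) that greps a one-line stand-in log with and without the backslashes. The temp file and the shortened log line are assumptions made up for the demo; the pattern text is the one from the question. With -F the pattern is compared character by character, so the backslashes have to be in it.

```shell
#!/bin/sh
# Hypothetical one-line log, shortened from the example in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
[Spamfilter] nick!user@host matches filter '^FREE .+ pics and movies (www\.siteA\.da\.ru|wWw\.siteB\.oRg)$': [PRIVMSG ...]
EOF

# Single quotes keep the backslashes and the trailing $ exactly as typed.
escaped='^FREE .+ pics and movies (www\.siteA\.da\.ru|wWw\.siteB\.oRg)$'
unescaped='^FREE .+ pics and movies (www.siteA.da.ru|wWw.siteB.oRg)$'

# grep -c prints the number of matching lines; it exits non-zero when
# that number is 0, hence the "|| true".
with_escapes=$(grep -Fc "$escaped" "$log" || true)
without_escapes=$(grep -Fc "$unescaped" "$log" || true)

echo "with backslashes:    $with_escapes"
echo "without backslashes: $without_escapes"
rm -f "$log"
```

The first count should come out as 1 and the second as 0, since the log line contains the backslashes literally.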

HiT5698 (Author) Commented:
well I changed the url name in the pattern, so people don't click it here (it has a virus).. and I forgot to put in the escapes..

but you've brought up a great point.. the script is stripping out backslashes on the regex patterns, preventing correct matches in many cases..

here's the code that handles that part: first I grep through the whole spamfilter logfile, and copy only the needed regex patterns into a temp file, and that temp file does have the backslashes preserved.. but then when the script gets to this:

($TEMPF is the temp file with only regex patterns in it, one per line):

{
cat $TEMPF|while read pattern ; do
  PCOUNT=$(grep -Fc "$pattern" $LOGF)
  LDATE=$(grep -F "$pattern" $LOGF|tail -n1|awk '{print $2" "$3" "$5}')
  PACTIVE=$(grep -Fm1 "$pattern" $CONF)
  [ "$PACTIVE" ] && TAG=[*] || TAG=[!]
  echo ""$n"a"$TAG". pattern: $pattern"    
  echo ""$n"b"$TAG". hits: ${PCOUNT:-0} [last: ${LDATE:-n/a}]"
  echo
  n=$(($n + 1))
done
} >> $OUTF

now the whole problem with that loop is that it strips the backslashes out of $pattern for some reason, so when I grep for $pattern, it usually is the wrong one (unless the original pattern had no backslashes to begin with, then those will work).. but why is that loop stripping the backslashes from $pattern?

to make things easier, here is the entire script:

#!/bin/sh
# makes a report on spamfilter hits

CONF=spamfilter.conf
LOGF=spamfilter.log
TEMPF=$(basename $0).tmp
URC=unrealircd.conf
OUTF=$(basename $0).log
STAMP=$(date '+%F %r %z')
SNAME=$(grep -Eom1 "name \".+\"" $URC|awk '{print $2}'|awk -F\" '{print $2}')
n=1

if [ ! -e "$LOGF" ] ; then
  echo "ERROR: Cannot find $LOGF"
  exit 1
fi

SDATE=$(grep -m1 "\- \[Spamfilter\]" $LOGF|awk '{print $2" "$3}')

# total hits in logfile (all filters)
HC=$(grep -Ec "\- \[Spamfilter\]" $LOGF)

# dump unique patterns from log into temp file
grep -E "\- \[Spamfilter\]" $LOGF|grep -Eo "matches filter '.*':"|awk -F\' '{print $2}'|sort|uniq > $TEMPF

# unique patterns found in log
UCOUNT=$(sed -n '$=' $TEMPF)

echo "# Created: $STAMP" > $OUTF
echo "# IRCd: $SNAME" >> $OUTF
echo "" >> $OUTF
echo "- $LOGF has $HC total spamfilter hits," >> $OUTF
echo "- from $UCOUNT unique patterns, since $SDATE" >> $OUTF
echo "- Prefix legend: [*] = active, [!] = not active" >> $OUTF
echo "" >> $OUTF

{
cat $TEMPF|while read pattern ; do
  PCOUNT=$(grep -Fc "$pattern" $LOGF)
  LDATE=$(grep -F "$pattern" $LOGF|tail -n1|awk '{print $2" "$3" "$5}')
  PACTIVE=$(grep -Fm1 "$pattern" $CONF)
  [ "$PACTIVE" ] && TAG=[*] || TAG=[!]
  echo ""$n"a"$TAG". pattern: $pattern"
  echo ""$n"b"$TAG". hits: ${PCOUNT:-0} [last: ${LDATE:-n/a}]"
  echo
  n=$(($n + 1))
done
} >> $OUTF

cat $OUTF

rm -f $OUTF
rm -f $TEMPF
exit 0

HiT5698 (Author) Commented:
using that code above, I ran that with this exact logfile (spamfilter.log):

[Sun Jul  3 18:19:27 2005] - [Spamfilter] [|didosch|]!~wkngfhdvs@d8475.w82-123.abo.wanadoo.fr matches filter '^FREE .+ pics and movies (www\.pornsites\.da\.ru|wWw\.aLmoRa\.oRg)$': [PRIVMSG DarkReal: 'FREE Porn pics and movies www.pornsites.da.ru'] [Spamming a porn url to users. Scan your pc for viruses.]
[Mon Jul  4 07:49:30 2005] - [Spamfilter] GurBeT!kedoozwrkq@23.135.11.3 matches filter '^For Hot Girl & Crazy PØrn Movîes & HardcØre PØrn MØviés Click Red Box : (http://)?.+\.almora\.org$': [PRIVMSG SixPacK: 'For Hot Girl & Crazy PØrn Movîes & HardcØre PØrn MØviés Click Red Box : www.almora.org'] [Spamming a porn url, scan your pc for viruses]
[Mon Jul  4 17:36:14 2005] - [Spamfilter] ATeS!~unicfvukg@12.186.170.89 matches filter '^For Hot Girl & Crazy PØrn Movîes & HardcØre PØrn MØviés Click Red Box : (http://)?.+\.almora\.org$': [PRIVMSG MERR50: 'For Hot Girl & Crazy PØrn Movîes & HardcØre PØrn MØviés Click Red Box : www.almora.org'] [Spamming a porn url, scan your pc for viruses]
[Mon Jul  4 20:48:35 2005] - [Spamfilter] ^Linda!~wxenolilr@3qwes-152-1-28-236.w82-123.abo.wanadoo.fr matches filter '^FREE .+ pics and movies (www\.pornsites\.da\.ru|wWw\.aLmoRa\.oRg)$': [PRIVMSG VOH|out: 'FREE Porn pics and movies www.pornsites.da.ru'] [Spamming a porn url to users. Scan your pc for viruses.]

and using that exact logfile, here is the script's output:

ircd@drt:~/urleaf$ ./wnsflist
# Created: 2005-07-05 05:26:28 AM +0200
# IRCd: someserver.testnet.org

- spamfilter.log has 4 total spamfilter hits,
- from 2 unique patterns, since Jul 3
- Prefix legend: [*] = active, [!] = not active

1a[!]. pattern: ^FREE .+ pics and movies (www.pornsites.da.ru|wWw.aLmoRa.oRg)$
1b[!]. hits: 0 [last: n/a]

2a[!]. pattern: ^For Hot Girl & Crazy PØrn Movîes & HardcØre PØrn MØviés Click Red Box : (http://)?.+.almora.org$
2b[!]. hits: 0 [last: n/a]


it's not supposed to be possible for there to be 0 hits (every pattern has at least 1 hit of course)
HiT5698 (Author) Commented:
nevermind I found the answer.. just had to use read -r instead of read.. but ozo, your observation was very helpful (don't know how I missed it before), so it looks like you get the points ;)
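
For anyone landing here later, a small sketch of the difference, assuming a POSIX sh. Without -r, read treats a backslash as an escape character and drops it; with -r the line comes through untouched. The sample string is a fragment of the pattern from the question, and the variable names are just for the demo.

```shell
#!/bin/sh
# Fragment of the pattern from the question, backslashes included.
line='(www\.siteA\.da\.ru|wWw\.siteB\.oRg)$'

# Plain read: each backslash escapes the next character and is removed.
plain=$(printf '%s\n' "$line" | { read pattern; printf '%s' "$pattern"; })

# read -r: backslashes are kept. IFS= additionally stops read from
# trimming leading and trailing whitespace.
raw=$(printf '%s\n' "$line" | { IFS= read -r pattern; printf '%s' "$pattern"; })

echo "read:    $plain"
echo "read -r: $raw"
```

In the script above, that amounts to changing `while read pattern` to `while IFS= read -r pattern` in the loop that reads $TEMPF.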