Solved

Shell Script To Monitor Webpage For specific word

Posted on 2010-08-26
Last Modified: 2013-12-21
I want to have a shell script monitor a webpage for the word "Offline" and have it e-mail me if the word appears on the page.  How can I do this? Thanks
Question by:newmacguy
19 Comments
 
LVL 40

Expert Comment

by:omarfarid
You can fetch the page with wget (see man wget), then grep it for that word, and if it is there, use mail or mailx to send mail.
 

Author Comment

by:newmacguy
Any chance you can provide some sample code? I'm not very good at shell scripting. Thanks
 
LVL 68

Expert Comment

by:woolmilkporc
lynx might be easier:

lynx -dump http://my.url.com | grep -q "Offline" && echo "$(date) Offline Alert!" | mailx -s 'String "Offline" found!' newmacguy@domain.tld

wmp
 

Author Comment

by:newmacguy
I tried that but I get this result:
  Not Acceptable

   An appropriate representation of the requested resource / could not be
   found on this server.

   Additionally, a 404 Not Found error was encountered while trying to use
   an ErrorDocument to handle the request.

This command works on other websites, but not this one for some reason.
 
LVL 18

Expert Comment

by:TobiasHolm
406 Error (Not Acceptable)

Are you seeing this error when trying to access a page? This can be caused by Apache mod_security being turned on. You can use the following directive to diagnose the problem (turning the filter off should resolve the issue):
SecFilterEngine off

However, it's important to leave the filter on in normal use, as it helps prevent spam and injection attacks. What you need to do is figure out (ask your hosting provider if you need help) which phrases are triggering the filter. Typically these include possible commands such as "ftp," "telnet," "ssh," etc. Once you know which ones they are, you can modify the filter to permit the page that is not loading (scan your logs or Google sitemap for 406 errors to find all the pages experiencing this problem).

Ref: http://community.contractwebdevelopment.com/406-error-not-acceptable-an-appropriate-representation-of-the-requested-resource-could-not-be-found-on-this-server
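
If you do not control the server, another thing to try is sending a different User-Agent header, since some filters reject requests based on it. This is only a guess about the cause of the 406 here, not something confirmed in this thread. A minimal sketch, where the function name fetch_page, the URL, and the User-Agent string are all placeholders:

```shell
#!/bin/bash
# Hypothetical helper: fetch a page to stdout with a browser-like User-Agent.
# The User-Agent value is only an example; whether it avoids the 406 depends
# on the server's rules, which are not known here.
fetch_page() {
    wget --quiet --tries=1 --timeout=10 \
         --user-agent="Mozilla/5.0 (X11; Linux x86_64)" \
         -O - "$1"
}

# Example call (placeholder URL):
#   fetch_page "http://my.url.com" | grep -q "Offline" && echo "found"
```

lynx has a similar -useragent option if you would rather stay with the lynx -dump approach.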
 

Author Comment

by:newmacguy
I don't control that particular web server.  I'm trying the wget method as that seems to work.  How can I make it send the e-mail if the grep returns results from the wgetted file?
 
LVL 40

Expert Comment

by:omarfarid
try this

wget http://domain.com/file.html
grep -w Offline file.html
if [ $? -eq 0 ]
then
      echo "$(date) Offline found" | mailx -s "Offline found" user@yourdomain
fi
rm file.html
 

Author Comment

by:newmacguy
I'm getting an error on the if statement line.
 

Author Comment

by:newmacguy
line 3: [1: command not found
 
LVL 18

Expert Comment

by:TobiasHolm
Try omarfarid's code in a bash-script. But I suppose you know how to make such script-file?

In case you don't:

Put the code in a file. Make the file executable with: chmod a+x yourfilename.sh

Then run the script with: ./yourfilename.sh

#!/bin/bash
wget http://domain.com/file.html
grep -w Offline file.html
if [ $? -eq 0 ]
then
      echo "$(date) Offline found" | mailx -s "Offline found" user@yourdomain
fi
rm file.html

 
LVL 40

Expert Comment

by:omarfarid
Did you copy the script exactly? What shell are you using?
 
LVL 40

Expert Comment

by:omarfarid
You need to leave a space before $?
 

Author Comment

by:newmacguy
I copied it exactly. I think the problem is that there are some $ symbols in the grepped text
 
LVL 40

Expert Comment

by:omarfarid
the error is related to space before $?
can you post the script you run?
 
LVL 16

Expert Comment

by:gelonida
The '[' and ']' need to be separated by a space from the preceding and following words,
as omarfarid mentioned.
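
To illustrate the spacing rule: "[" is itself a command name, so "[1" or "[$?" is looked up as a different, nonexistent command, which is what "command not found" means. A small sketch with made-up helper names (page_has_word, check_with_brackets, check_direct); the last variant tests grep's exit status directly and avoids the brackets altogether:

```shell
#!/bin/bash
# Hypothetical helper: succeed if FILE ($1) contains WORD ($2) as a whole word.
page_has_word() {
    grep -qw -- "$2" "$1"
}

# Variant that uses [ ]: note the spaces after "[" and before "]".
check_with_brackets() {
    grep -qw -- "$2" "$1"
    if [ $? -eq 0 ]
    then
        echo "found"
    else
        echo "not found"
    fi
}

# Variant that lets "if" test grep's exit status directly.
check_direct() {
    if page_has_word "$1" "$2"; then
        echo "found"
    else
        echo "not found"
    fi
}
```

Either function can be called as, for example, check_direct file.html Offline once the page has been downloaded.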


Please also note that you have to remove file.html after grepping it;
otherwise the next wget command would save the result as file.html.1.

If you want to keep the html file, then you could add the following command
between lines 6 and 7 of TobiasHolm's script:

cp file.html saved_file.html
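
An alternative that sidesteps the file.html.1 problem is to tell wget where to write with -O and to use a temporary file that is removed after the check. A sketch under those assumptions; the function name check_page and the return codes are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: download URL ($1) to a temp file, then check for WORD ($2).
# Returns 0 if the word is found, 1 if not found, 2 if the download failed.
check_page() {
    local url=$1
    local word=$2
    local tmp
    local status
    tmp=$(mktemp) || return 2
    if wget --quiet -O "$tmp" "$url"; then
        if grep -qw -- "$word" "$tmp"; then status=0; else status=1; fi
    else
        status=2
    fi
    rm -f "$tmp"    # always clean up, so no numbered copies pile up
    return "$status"
}

# Example call (placeholder URL and address):
#   check_page "http://domain.com/file.html" Offline &&
#       echo "Offline found" | mailx -s "Offline found" user@yourdomain
```

Keeping "download failed" separate from "word not found" means an unreachable page is not silently treated as "everything is fine".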


 
LVL 3

Expert Comment

by:egarciat
You may want to try:


#!/bin/bash

alert_string="offline"

alertme() {
echo "$(date) Offline found" | mailx -s "Offline found" user@yourdomain
}

wget -t 1 -T 5 --quiet -O - "http://yourdomain.com" 2>/dev/null | grep -i "$alert_string" && alertme

 
LVL 3

Expert Comment

by:egarciat
The "-t" argument to wget is the number of tries it makes before giving up; if you do not set this parameter, wget retries up to 20 times by default (which may be desired). The "-T" argument is the timeout in seconds; if you do not specify it and for some reason the page is not reachable, wget may block your script for a long time (the default read timeout is 900 seconds).

You can put whatever you want in the function "alertme". Personally I use sendmail for such operations, but any other program may work:

alertme() {
printf 'Subject: Offline Alert\n\n' | sendmail -falert@yourdomain.com youruser@youremaildomain &
}
 
LVL 3

Accepted Solution

by:
egarciat earned 500 total points
I apologize for posting again.

I forgot to tell you that, although not necessary, it would be a good idea to set the content type of the web page you are downloading to "text/plain"; you may want to use an HTML META tag.
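
If you cannot change how the page is served, a rough alternative is to strip the HTML tags yourself before searching, so the word is matched only in the visible text and not inside markup. A crude sketch, assuming tags do not span multiple lines; the function name strip_tags is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: remove anything that looks like <...> from stdin.
# This is a crude filter, not an HTML parser: it misses tags split across
# lines and does not decode entities such as &amp;.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# Example call (placeholder URL; alertme is the function from the script above):
#   wget -t 1 -T 5 --quiet -O - "http://yourdomain.com" | strip_tags | grep -qi "offline" && alertme
```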
 

Author Closing Comment

by:newmacguy
Sorry for being away so long. This appears to have fixed it though.
