Monitor application and report hang or failure

I have a critical application running on a Windows XP machine that sometimes crashes.  The application is coyote.exe; it locks up at random times, and I would like the system to run a blat command to email me when this happens.  I need a script or batch file that can detect when the application hangs or fails and then trigger blat to send the email.  Windows XP records the hang in the application event log, so maybe that is where we could detect it?  I am not sure.
murryc Asked:

Bill Prew (IT / Software Engineering Consultant) Commented:
Since you indicate it "hangs" and stays in memory, it could be a little hard to accurately detect a hang condition.  Is there any routine activity that it performs that could be an indicator, such as a file not being processed within a certain amount of time?

What specifically does it log to the Windows event log, and when does it do that, as soon as it hangs, or some time after, etc?

~bp
murryc (Author) Commented:
Not sure about the process that we could monitor, but I might be able to track down something if we can't act on the event log entry.  It records the following in the application log in Windows.

Event Type:      Error
Event Source:      Application Error
Event Category:      None
Event ID:      1000
Date:            5/11/2015
Time:            1:45:14 PM
User:            N/A
Computer:      DRR
Description:
Faulting application coyote.exe, version 2.14.1.10, faulting module ntdll.dll, version 5.1.2600.6055, fault address 0x00011689.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 41 70 70 6c 69 63 61 74   Applicat
0008: 69 6f 6e 20 46 61 69 6c   ion Fail
0010: 75 72 65 20 20 63 6f 79   ure  coy
0018: 6f 74 65 2e 65 78 65 20   ote.exe
0020: 32 2e 31 34 2e 31 2e 31   2.14.1.1
0028: 30 20 69 6e 20 6e 74 64   0 in ntd
0030: 6c 6c 2e 64 6c 6c 20 35   ll.dll 5
0038: 2e 31 2e 32 36 30 30 2e   .1.2600.
0040: 36 30 35 35 20 61 74 20   6055 at
0048: 6f 66 66 73 65 74 20 30   offset 0
0050: 30 30 31 31 36 38 39 0d   0011689.
0058: 0a                        .
Bill Prew (IT / Software Engineering Consultant) Commented:
The problem I see with checking the event log is due to the fact that these never get deleted.  So how far back should the "watchdog" process look?  You would only be able to look back as far as the last time the watchdog ran, otherwise you could pick up an earlier failure that has already been recovered from.

How often were you thinking of having the watchdog run?

It sounds like you are already skilled in BLAT usage?

~bp
murryc (Author) Commented:
Let me think about that. I see your point. Let me see if we can monitor a file change. I do understand blat.
murryc (Author) Commented:
Ok, it appears that coyote.exe stops running and disappears from the running processes when it hangs.  So can we monitor the existence of that process and execute blat if it does not exist?  I would like to keep a batch or script running 24/7 that checks every 1-2 mins for the process.  Either that or run a scheduled task.  Which would be best?
Bill Prew (IT / Software Engineering Consultant) Commented:
Here is a simple BAT script that will check if the process is running, and if not can invoke BLAT to send an email.

I would recommend the Task Scheduler approach.  There is no true "SLEEP" command in DOS batch scripts, so anything that we did to make the BAT script spin would consume resources.  It is better to just run it every few minutes out of Task Scheduler.

Something to keep in mind with this approach: once the task errors out, you will keep getting an email every minute (or whatever frequency you schedule it to run) until it is restarted.  That might be a problem?  We could add some additional logic, probably setting a flag in a control file on disk, where once we send the email we update the control file, and don't send subsequent emails until the flag is reset.  The same script could reset the flag the first time it runs and the task is found again.  Let me know if you think that is needed.

@echo off
setlocal

REM Define task to check for
set Taskname=coyote.exe

REM Check for the task still running
REM Parentheses keep a trailing space out of the value of Stopped
tasklist /FI "imagename eq %TaskName%" | find /I "%TaskName%" >NUL 2>&1 && (set Stopped=N) || (set Stopped=Y)

REM If stopped then send email
if "%Stopped%" EQU "Y" (
  REM Add logic here for BLAT email send
)
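
As a rough sketch of the Task Scheduler approach, the script could be registered from the command line with `schtasks`, which ships with Windows XP Professional. The task name `CoyoteWatchdog` and the path `C:\Scripts\watchdog.bat` are assumptions made for illustration and do not come from this thread; adjust both to wherever the script is saved.

```batch
REM Assumed: the script above is saved as C:\Scripts\watchdog.bat (hypothetical path)
REM Create a task that runs the script every 2 minutes under the SYSTEM account
schtasks /create /tn "CoyoteWatchdog" /tr "C:\Scripts\watchdog.bat" /sc minute /mo 2 /ru SYSTEM

REM List the scheduled tasks to confirm the new one is registered
schtasks /query
```

Running under SYSTEM is only one option; if BLAT was installed or configured under a particular user account, scheduling the task under that account may be the safer choice.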

~bp
Bill Prew (IT / Software Engineering Consultant) Commented:
Here is an example of an approach to suppress multiple emails for a single outage.  It uses a control file to know if an email has already been sent for the current outage.  See if this makes sense.  It places the file in the same folder as the BAT file, with the same base name as the script, but with an extension of CTL.  When it sees the EXE not running, it sends an email and creates the file.  Then if it runs again and sees the EXE missing, but the CTL file exists, it doesn't email.  When the EXE is next seen by this script, it deletes the CTL file so that an email will be sent if a new outage occurs.

@echo off
setlocal

REM Define task to check for, and name of control file to suppress duplicate emails
set Taskname=coyote.exe
set Control=%~dpn0.ctl

REM Check for the task still running
set SendEmail=N
tasklist /FI "imagename eq %TaskName%" | find /I "%TaskName%" >NUL 2>&1 && (
  if exist "%Control%" del "%Control%">NUL 2>NUL
) || (
  REM Only send one email per outage
  if not exist "%Control%" (
    echo.>"%Control%"
    set SendEmail=Y
  ) 
)

REM If stopped then send email
if "%SendEmail%" EQU "Y" (
  REM Add logic here for BLAT email send
)
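
As a minimal sketch, the placeholder in the final `if` block could be filled in with a BLAT call along these lines. The SMTP server and both email addresses are made-up placeholders, and the sketch assumes `blat.exe` is on the PATH and that the `TaskName` and `SendEmail` variables were set earlier in the same script; check the switches against the help output of your own BLAT version.

```batch
REM Hypothetical values: replace with your own SMTP server and addresses
set MailServer=smtp.example.com
set MailFrom=watchdog@example.com
set MailTo=admin@example.com

REM If stopped then send a one-line alert email through BLAT
if "%SendEmail%" EQU "Y" (
  blat -to %MailTo% -f %MailFrom% -server %MailServer% -subject "ALERT: %TaskName% stopped" -body "%TaskName% is not running on %COMPUTERNAME%"
)
```

Keeping the subject and body inside quotes matters here, because an unquoted closing parenthesis in the text would end the `if` block early.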

~bp
murryc (Author) Commented:
Sorry for the delay, Bill.  I didn't realize that I had left this open.
Bill Prew (IT / Software Engineering Consultant) Commented:
Thanks for getting back to this and for the feedback, glad my input was useful.

~bp