I have set up Nagios under Linux to monitor all our devices.
I need to monitor our Windows servers' CPU load over SNMP.
I can do this; however, Windows only reports the *current* CPU load, not a load average over time.
The problem with this is that a process may happen to be using 100% CPU at the instant the SNMP poll runs, whereas I would like to take an average over, say, 5 minutes.
This would give a better indication that a runaway process is hogging the CPU.
I have written a script that polls the server over x periods at y intervals, and this works fine. However, Nagios times out when it runs the script, because normal checks return their results within 1 minute and I may need this one to wait for around 15 minutes.
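For reference, here is a minimal sketch of the kind of polling script described above, not the exact script. It assumes the net-snmp `snmpwalk` command is installed on the Nagios host, that the servers expose hrProcessorLoad from HOST-RESOURCES-MIB, and that the community string is "public". The function names, thresholds and defaults are all hypothetical.

```python
import subprocess
import time

# Assumption: HOST-RESOURCES-MIB hrProcessorLoad, one row per logical CPU.
HR_PROCESSOR_LOAD = "1.3.6.1.2.1.25.3.3.1.2"


def read_cpu_load(host, community="public"):
    """Poll once over SNMP and return the load in percent, averaged across CPUs."""
    out = subprocess.run(
        ["snmpwalk", "-v2c", "-c", community, "-Oqv", host, HR_PROCESSOR_LOAD],
        capture_output=True, text=True, check=True, timeout=10,
    ).stdout
    values = [int(token) for token in out.split()]
    if not values:
        raise ValueError("no hrProcessorLoad rows returned by " + host)
    return sum(values) / len(values)


def average_load(sample, periods, interval, sleep=time.sleep):
    """Call sample() `periods` times, `interval` seconds apart, and return the mean."""
    total = 0.0
    for i in range(periods):
        total += sample()
        if i < periods - 1:  # no need to wait after the last sample
            sleep(interval)
    return total / periods


def nagios_status(avg, warn, crit):
    """Map an average load to a Nagios plugin exit code and label."""
    if avg >= crit:
        return 2, "CRITICAL"
    if avg >= warn:
        return 1, "WARNING"
    return 0, "OK"
```

A check would then call something like `average_load(lambda: read_cpu_load("winserver01"), periods=5, interval=60)` (the host name is made up), print a one-line status, and exit with the code from `nagios_status`. The process stays alive for the whole sampling window, which is exactly what causes the timeout.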
I can raise this limit by setting the service_check_timeout variable to more than 60 seconds.
That solves the problem of Nagios timing out the check.
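For anyone following along, the change is a single line in the main Nagios configuration file. This is a sketch that assumes the file is nagios.cfg and that 900 seconds (15 minutes) is enough; the value is only illustrative. Note that the setting is global, so it raises the ceiling for every service check, not just this one.

```
# nagios.cfg (main configuration file)
# Maximum time, in seconds, that any service check may run before it is killed.
# 900 is an example value chosen to cover a 15 minute sampling window.
service_check_timeout=900
```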
My problem now is that Nagios check processes normally return quickly, whereas mine stays active, sleeping and waiting between polls in order to build up an average.
I know this is a complicated problem, but I was wondering whether Nagios can somehow be configured to work with checks over time, or whether there is a better way of doing this.
I DO NOT want to use the Nagios service on our Windows servers, ONLY SNMP.