I've been using counter logs in Performance Logs and Alerts to gather statistics on our Windows Server 2003 servers. The specific metrics that I'm tracking are listed below.
I don't need any help with setting up the logs. That part is working well and I've got a few weeks' worth of data. Network performance has been fairly typical over this period.
My question is, what now? How do I analyze this information in a meaningful way so that I know what thresholds to set on alerts? I can calculate the maximum, minimum, and average values for each counter but that doesn't really tell me much.
Are there any good online guides or books which go into detail on this subject? I know the most common answers are "It depends," and "Every network is different," but I'm going to need a little more to go on.
All servers:
Memory\Available Bytes
Memory\Pages/sec
Memory\Transition Faults/sec
Network Interface(interface name)\Bytes Total/sec
PhysicalDisk(0 C:)\% Idle Time
PhysicalDisk(0 C:)\Avg. Disk Queue Length
PhysicalDisk(0 C:)\Disk Transfers/sec
Processor(_Total)\% Processor Time
Processor(_Total)\Interrupts/sec
System\Context Switches/sec
System\Processor Queue Length
Apache web servers:
Process(Apache#1)\% Processor Time
Domain Controllers:
Server\Errors Logon
Server\Errors System
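
For reference, here is a minimal sketch of the maximum/minimum/average summary mentioned above. It assumes the counter log has been saved or converted to CSV, with the timestamp in the first column and one column per counter. The function name `summarize_counters`, the server name `SRV1`, and the sample values are all hypothetical and only there for illustration; they are not real data from these servers.

```python
import csv
import io


def summarize_counters(csv_text):
    """Return {counter_name: (minimum, maximum, average)} for each counter column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    names = header[1:]  # assumption: the first column holds the timestamp
    samples = {name: [] for name in names}
    for row in reader:
        for name, cell in zip(names, row[1:]):
            try:
                samples[name].append(float(cell))
            except ValueError:
                continue  # skip blank or non-numeric samples
    return {
        name: (min(values), max(values), sum(values) / len(values))
        for name, values in samples.items()
        if values  # leave out counters with no usable samples
    }


# Hypothetical sample log; the values are made up for illustration.
SAMPLE_LOG = r'''"Time","\\SRV1\Memory\Pages/sec","\\SRV1\System\Processor Queue Length"
"01/01/2000 00:00:00","10","2"
"01/01/2000 00:00:15"," ","4"
"01/01/2000 00:00:30","30","6"
'''

for counter, (low, high, mean) in summarize_counters(SAMPLE_LOG).items():
    print(f"{counter}: min={low} max={high} avg={mean}")
```

Blank samples are skipped rather than counted as zero, so a missed collection interval does not drag the average down.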