We already have a fairly thorough statistical tracking system in place. My issue is: what do I do with all the data at my disposal?
I need to come up with 4-6 metrics (charts, essentially) that will help our organization monitor the "health" of a system that processes large amounts of data. This system is mission critical, and issues on it tend to cascade out to connected systems, so metrics that would indicate an imminent failure of some kind or a significant trend toward a resource bottleneck would be a big plus. I've found some metrics examples online by digging through a lot of irrelevant chaff, but I'm uncertain how informative they would be. The system is composed of servers arranged in complementary pairs (#1's load fails over to #2, #3 fails over to #4, etc.) with some minor load balancing, and the system as a whole pulls data in through queues, processes it, then sends it out to other systems.
Any suggestions?
Please indicate what type of charts or metrics you use, how long you've used them, and how informative or helpful you feel they have been.
This question has been solved and asker verified.
I don't need to know how to get the data; that part is largely taken care of. I need suggestions on what I should track, not how. I'm looking for something like: "It's not necessary to track CPU, disk I/O, and memory separately. In my experience, combining the three values with equal weighting applied to all three is just as informative and simpler to follow."
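To make the "equal weighting" idea concrete, here is a minimal sketch of such a combined score. The function name `composite_load` and the sample readings are hypothetical, and it assumes all three inputs are already expressed as utilization percentages from 0 to 100.

```python
def composite_load(cpu_pct, disk_pct, mem_pct, weights=(1.0, 1.0, 1.0)):
    """Combine three utilization percentages (0-100) into one weighted average.

    The default weights are equal, as described in the post above.
    """
    values = (cpu_pct, disk_pct, mem_pct)
    for value in values:
        if not 0 <= value <= 100:
            raise ValueError(f"utilization must be between 0 and 100, got {value}")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted average, normalized so the result stays on the 0-100 scale.
    return sum(w * v for w, v in zip(weights, values)) / total_weight


# Hypothetical readings: disk is nearly saturated, CPU and memory are idle.
score = composite_load(cpu_pct=30, disk_pct=95, mem_pct=25)
print(round(score, 1))  # 50.0
```

Note that the example readings average out to 50 even though disk I/O sits at 95%, so a single combined number can mask one saturated resource.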
I would track all three independently, but this is very application-dependent.
I normally start with a baseline set of measurements, and then after I am sure how things are working, I start to simplify.
We monitor CPU, reboots, specific processes and services, RAM, disk space, and applications.
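As a rough illustration of the baseline approach, the sketch below records a mean and standard deviation for one metric and then flags later readings that stray far from it. The function names, the sample values, and the three-standard-deviation cutoff are all assumptions for the example, not something taken from a specific tool.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Return (mean, standard deviation) for a list of baseline readings."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(samples), stdev(samples)


def deviates(value, baseline, n_sigmas=3.0):
    """True if value is more than n_sigmas standard deviations from the mean."""
    avg, sd = baseline
    return abs(value - avg) > n_sigmas * sd


# Hypothetical CPU readings (percent) collected while the system was healthy.
cpu_baseline = build_baseline([40, 42, 38, 41, 39])

print(deviates(41, cpu_baseline))  # False: within normal variation
print(deviates(55, cpu_baseline))  # True: far outside the baseline
```

Once each metric has a baseline like this, it becomes easier to see which ones never move and can be dropped or merged.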
I hope this helps!
by: SysExpert Posted on 2009-01-09 at 11:54:57 ID: 23339578
Perfmon can be used for this in real-time or logging mode, with alerts.
Read the help file for best practices, but you will need to define specific counters if you are using special software.
It can monitor just about anything: CPU usage per core, disk I/O, RAM, paging, the works.
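If the counters are logged to a file rather than watched live, a small script can scan the log for threshold breaches. The sketch below assumes the log has been exported as CSV with a timestamp in the first column and one column per counter; the column names `cpu_pct` and `disk_queue` and the threshold values are hypothetical placeholders for whatever counters and limits you actually use.

```python
import csv
import io


def find_alerts(csv_text, thresholds):
    """Return (timestamp, counter, value) for every sample above its threshold.

    thresholds maps a column name to its upper limit; columns without an
    entry are ignored.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    alerts = []
    for row in reader:
        if not row:
            continue  # skip empty lines
        timestamp = row[0]
        for name, cell in zip(header[1:], row[1:]):
            limit = thresholds.get(name)
            if limit is None:
                continue
            try:
                value = float(cell)
            except ValueError:
                continue  # blank or non-numeric sample
            if value > limit:
                alerts.append((timestamp, name, value))
    return alerts


# Hypothetical three-sample log; the last disk_queue reading is missing.
sample_log = (
    "time,cpu_pct,disk_queue\n"
    "10:00,45.0,1.0\n"
    "10:01,97.5,0.5\n"
    "10:02,50.0,\n"
)
print(find_alerts(sample_log, {"cpu_pct": 90.0, "disk_queue": 2.0}))
# [('10:01', 'cpu_pct', 97.5)]
```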
I hope this helps!