Malevolo

asked on

Best practices for a NMS solution

I've been using SolarWinds ipMonitor 8.5 for a while now to remotely monitor some servers via SNMP. However, the system isn't very efficient and in some cases doesn't give me what I'm looking for. I am going to roll out a new instance of ipMonitor 9.x shortly and am currently tinkering with SolarWinds Orion NPM 9. See below for additional details, and eventually my question =)

I have ~35 clients that I manage. An average client has between 3 and 5 servers. We currently have one-way VPN connections to all clients so we can remote in for work. We could set up rules for two-way communication if required.

All clients are running HP & Dell systems, and all systems have their vendor management utilities installed. It's Windows servers across the board. About 20% of our servers are stand-alone VMware ESX hosts.

I'm looking for monitoring that will alert me when any physical hardware has issues (e.g. a dead physical disk or a predictive failure) in addition to Windows/software alerting (e.g. partitions running low on space, services not running).

Not to point out the obvious, but the key is to achieve a reliable system in the most network-efficient manner possible. I assume this would involve setting up some SNMP traps instead of polling every server every 5 minutes, but I'm not sure exactly how. I'm completely open to any and all ideas. Any references and/or documentation are much appreciated as well. Ultimately, if someone could draft a model for me that says "for every client you should do X, Y, Z, and then A, B, C from my end to get this up and running", it would be great. Thanks in advance.
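
To put rough numbers on the polling cost described above, here is a back-of-envelope sketch in Python. The figure of about 4 servers per client is taken from the 3 to 5 range in the question; the 20 OIDs per poll is a made-up figure for illustration only, not something measured from ipMonitor or Orion NPM.

```python
# Back-of-envelope estimate of SNMP polling volume.
# Assumptions (hypothetical): ~4 servers per client, 20 OIDs per poll.

MINUTES_PER_DAY = 24 * 60


def polls_per_day(servers: int, interval_minutes: int) -> int:
    """Poll cycles per day across all servers.

    Uses integer division, so it assumes the interval divides a day evenly.
    """
    return servers * (MINUTES_PER_DAY // interval_minutes)


def requests_per_day(servers: int, interval_minutes: int, oids_per_poll: int) -> int:
    """SNMP variable requests per day if each poll cycle reads oids_per_poll OIDs."""
    return polls_per_day(servers, interval_minutes) * oids_per_poll


if __name__ == "__main__":
    servers = 35 * 4  # ~35 clients, assuming about 4 servers each
    print(polls_per_day(servers, 5))         # 40320 poll cycles per day
    print(requests_per_day(servers, 5, 20))  # 806400 OID reads per day
```

Traps only cross the VPN when something actually happens, so the comparison is this steady polling volume against a handful of event messages. A slow heartbeat poll is usually still kept alongside traps, since a server that is down cannot send one.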
Steve Jennings

Take a look at Zenoss. I think it has most of what you are looking for. Since you are running Windows boxes, I'd look at some of the free utilities that convert your event logs into either syslog messages or SNMP traps. I'd also look at some of the low-cost syslog message parsers that can take a flat log file and send messages in syslog format . . . which you can then parse and create alarms for using MySQL or some other inexpensive syslog-to-SNMP trap facility.
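
As a minimal sketch of the "parse and create alarms" step, the Python below decodes the standard syslog priority header, where the number in angle brackets equals facility * 8 + severity, and raises an alarm for anything at "err" or worse. The function names, the sample hostnames, and the alarm threshold are illustrative assumptions, not part of Zenoss or any particular event-log converter.

```python
import re

# Syslog severity names, index 0 (most severe) to 7 (least severe).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# A syslog message starts with "<PRI>", where PRI = facility * 8 + severity.
_PRI = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_syslog(line: str) -> dict:
    """Split a raw syslog line into facility, severity, and message text."""
    match = _PRI.match(line)
    if match is None:
        raise ValueError("line has no <PRI> header")
    pri = int(match.group(1))
    if pri > 191:  # 23 * 8 + 7 is the largest valid priority value
        raise ValueError("priority out of range")
    severity = pri % 8
    return {
        "facility": pri // 8,
        "severity": severity,
        "severity_name": SEVERITIES[severity],
        "message": match.group(2),
    }


def needs_alarm(line: str, max_severity: int = 3) -> bool:
    """True when the message is at least as severe as max_severity (3 = err)."""
    return parse_syslog(line)["severity"] <= max_severity


if __name__ == "__main__":
    # Hypothetical sample lines; hostnames and text are made up.
    print(needs_alarm("<34>Oct 11 22:14:15 srv01 disk: predictive failure"))  # True
    print(needs_alarm("<14>Oct 11 22:14:16 srv01 backup: job finished"))      # False
```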

There's a lot of work to do to put a complete system together like you are looking for. I have done this and I charged the clients lots of money to a) set it up and b) keep it running and up to date. So I charge an upfront fee and a monthly recurring fee.

I have not found anything that does everything I want it to do. Zenoss is free (OpenNMS is also free, but its SNMP trap integration is a little chaotic) and beats the heck out of paying for HP OpenView or HP OpenView Operations. HP OVO is a fairly good product and can let you know if certain processes aren't running and can restart them if you want . . . but it is pricey and requires a LOT of attention. I used it when I worked for GTE/Verizon, and the Oracle and WebLogic integration eventually defeated me, so I wrote my own Java apps that did precisely what I wanted.

One of the attractive features of Zenoss (most NMSs have this feature, but it is very simple in Zenoss) is the ability to monitor scads of devices but only generate alerts on selected devices, and then only on things I care about enough to be paged for in the middle of the night.
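
The "monitor everything, page on a few things" idea can be sketched as a small rule table in Python. This is only an illustration of the concept, not how Zenoss implements it; the device names and event kinds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    device: str
    kind: str


# Hypothetical paging rules: device name -> event kinds worth a page.
PAGE_RULES = {
    "sql01": {"disk_failed", "raid_degraded"},
    "esx01": {"disk_failed"},
}


def should_page(event: Event, rules: dict = PAGE_RULES) -> bool:
    """Page only when both the device and the event kind are on the list."""
    return event.kind in rules.get(event.device, set())


def triage(events: list, rules: dict = PAGE_RULES) -> tuple:
    """Return (logged, paged): every event is logged, only a subset pages."""
    paged = [e for e in events if should_page(e, rules)]
    return list(events), paged


if __name__ == "__main__":
    events = [
        Event("sql01", "raid_degraded"),   # pages
        Event("sql01", "low_disk_space"),  # logged only
        Event("file02", "disk_failed"),    # device not on the paging list
    ]
    logged, paged = triage(events)
    print(len(logged), len(paged))  # 3 1
```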
Hi,

I can recommend a few options for a service-based NMS:

Proprietary NMS

SolarWinds Orion    www.solarwinds.com              Proprietary
Smarts              www.smarts.com                  Proprietary
WhatsUp Gold        www.whatsupgold.com             Proprietary
EM7                 www.sciencelogic.com            Proprietary
CA                  ca.com                          Proprietary
ServerAlive         www.woodstone.nu                Proprietary
Observer            www.netinst.com                 Proprietary

Service Monitors

Hound-Dog           www.hounddogiseasy.com          Proprietary
Level Platforms     www.levelplatforms.com          Proprietary
Kaseya              www.kaseya.com                  Proprietary
N-able              www.n-able.com                  Proprietary

Open-Source NMS

ZenOSS              www.zenoss.com                  LAMP based NMS
Nagios              www.nagios.org                  LAMP based NMS
JFFNMS              www.jffnms.org                  LAMP based NMS
OpenNMS             www.opennms.org                 LAMP based NMS
Zabbix              www.zabbix.com                  LAMP based NMS
Hyperic HQ          www.hyperic.com                 LAMP based NMS
GroundWork          www.groundworkopensource.com    LAMP based NMS

Orion APM provides application monitoring, which you need in order to know what's happening with your applications.

For Orion APM, go to http://www.solarwinds.com/products/orion/application_monitor/
For a demo, go to http://oriondemo.solarwinds.com/Orion/Apm/Summary.aspx
Malevolo

ASKER

Thanks all for the information. I am definitely moving forward with ipMonitor 9 and Orion NPM (with the APM module). My question, which admittedly was all over the place, was along the lines of: how should I structure this for a sound monitoring platform?

How should I handle SNMP traps, and should I integrate syslog servers? To simplify: if I had only two clients with 5 servers each, what would be the best way to monitor the servers at the application level (Windows critical events in the event log) and at the physical level (a physical disk with a predictive failure, or a RAID array in degraded status)? Do I set up one syslog server for each client, have it collect information from all machines, and then have that syslog server report to my Orion NPM? Or do I need to install a syslog server on every server and have them all report to Orion NPM? How about SNMP traps? Do I have every server there report to me directly, or should they all report to one server there, which then reports to my Orion NPM once thresholds or limits are hit? Thanks.
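
The "one collector per client" layout asked about above can be sketched in Python as follows. This is a hypothetical model of the idea only: it does not use any Orion NPM or syslog server API, and the class name, client name, hostnames, and the severity cutoff are all assumptions. Severity follows the usual syslog scale, where 0 is the most severe and 7 the least.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    host: str
    severity: int  # syslog scale: 0 = emergency ... 7 = debug
    message: str


@dataclass
class ClientCollector:
    """One collector per client site; servers at that site report only to it."""
    client: str
    forward_at_or_below: int = 3  # assumed cutoff: forward "err" and worse
    local_log: list = field(default_factory=list)
    outbox: list = field(default_factory=list)

    def receive(self, event: Event) -> None:
        # Everything is kept on site; only severe events are queued for upstream.
        self.local_log.append(event)
        if event.severity <= self.forward_at_or_below:
            self.outbox.append(event)

    def drain(self) -> list:
        """Return queued events tagged with the client name, then clear the queue."""
        batch, self.outbox = self.outbox, []
        return [(self.client, event) for event in batch]


if __name__ == "__main__":
    collector = ClientCollector("client-a")
    for n in range(1, 6):
        collector.receive(Event(f"srv{n}", 6, "service heartbeat"))
    collector.receive(Event("srv2", 2, "RAID array degraded"))
    print(len(collector.local_log))  # 6 events stored locally
    print(len(collector.drain()))    # 1 event crosses the VPN
```

With this layout the central NMS needs one inbound path per client rather than one per server, at the cost of the collector becoming a single point of failure that should itself be polled.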
ASKER CERTIFIED SOLUTION
Steve Jennings

This solution is only available to members.