Solved

Best practices for an NMS solution

Posted on 2009-04-01
Last Modified: 2012-06-27
I've been using SolarWinds ipMonitor 8.5 for a while now to remotely monitor some servers via SNMP; however, the system isn't very efficient and in some cases doesn't give me what I'm looking for. I am going to roll out a new instance of ipMonitor 9.x shortly and am currently tinkering with SolarWinds Orion NPM 9. See below for additional details, and eventually my question =)

I have ~35 clients that I manage. An average client has between 3 and 5 servers. We currently have one-way VPN connections to all clients so we can remote in for work. We could set up rules for two-way communication if required.

All clients are running HP & Dell systems. All systems have their vendor management utilities installed. Windows servers across the board. About 20% of our servers run VMware ESX as stand-alone hosts.

I'm looking for monitoring that will alert me when any physical hardware has issues (e.g. a dead physical disk or a predictive failure) in addition to Windows/software alerting (e.g. partitions running low on space, services not running).

Not to point out the obvious, but the key is to achieve a reliable system in the most network-efficient manner possible. I assume this would involve setting up some SNMP traps instead of polling every server every 5 minutes, but I'm not sure exactly how. I'm completely open to any and all ideas. Any references and/or documentation are much appreciated as well. Ultimately, if someone could draft a model for me that says for every client you should do X, Y, Z and then A, B, C from my end to get this up and running, it would be great. Thanks in advance.
Question by:Malevolo
5 Comments
 
LVL 16

Expert Comment

by:SteveJ
ID: 24051731
Take a look at Zenoss. I think it has most of what you are looking for. Since you are running Windows boxes, I'd look at some of the free utilities that convert your event logs into either syslog messages or SNMP traps. I'd also look at some of the low-cost Syslog message parsers that can take a flat log file and send messages in Syslog format . . . which you can then parse and create alarms for using MySQL or some other inexpensive Syslog-to-SNMP trap facility.
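
To make the event-log-to-syslog idea concrete, here is a minimal Python sketch of what such a converter does when it formats a message. It is not taken from any particular utility: the function name, the level names, and the level-to-severity mapping are illustrative assumptions. The only fixed part is the BSD syslog priority rule, PRI = facility * 8 + severity.

```python
from datetime import datetime

# Assumed mapping from event-log level names to syslog severity codes.
# The syslog codes themselves are standard (2=crit, 3=err, 4=warning, 6=info).
SYSLOG_SEVERITY = {"critical": 2, "error": 3, "warning": 4, "information": 6}
LOCAL0 = 16  # syslog facility code for local0


def to_syslog_line(when, host, source, level, text, facility=LOCAL0):
    """Build a BSD-syslog-style line: <PRI>Mmm dd hh:mm:ss host tag: message."""
    severity = SYSLOG_SEVERITY.get(level.lower(), 5)  # unknown level -> notice (5)
    pri = facility * 8 + severity
    # Day of month is space-padded to two characters, as BSD syslog expects.
    stamp = f"{when:%b} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {source}: {text}"


line = to_syslog_line(datetime(2009, 4, 1, 12, 0, 0), "srv01", "disk",
                      "error", "Predictive failure on disk 2")
print(line)
```

A real converter would read the records from the Windows event log and send each line over UDP to the collector; both steps are left out here.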

There's a lot of work to do to put a complete system together like you are looking for. I have done this and I charged the clients lots of money to a) set it up and b) keep it running and up to date. So I charge an upfront fee and a monthly recurring fee.

I have not found anything that does everything I want it to do. Zenoss (OpenNMS is also free, but the SNMP trap integration is a little chaotic) is free and beats the heck out of paying for HP OpenView or HP OpenView Operations. HP OVO is a fairly good product and can let you know if certain processes aren't running and can restart them if you want . . . but it is pricey and requires a LOT of attention. I used it when I worked for GTE/Verizon and the Oracle and WebLogic integration eventually defeated me, so I wrote my own Java apps that did precisely what I wanted.

One of the attractive features of Zenoss (most NMSs have this feature, but it is very simple in Zenoss) is the ability to monitor scads of devices but only generate alerts on selected devices, and then only on things I care enough about to be paged for in the middle of the night.
 
LVL 32

Expert Comment

by:Kamran Arshad
ID: 24058528
Hi,

I can recommend a few options for a service-based NMS:

Proprietary NMS

SolarWinds Orion      www.solarwinds.com      Proprietary
Smarts      www.smarts.com      Proprietary
WhatsupGold      www.whatsupgold.com      Proprietary
EM7      www.sciencelogic.com      Proprietary
CA      ca.com      Proprietary
ServerAlive      www.woodstone.nu      Proprietary
Observer      www.netinst.com      Proprietary

Service Monitors

Hound-Dog      www.hounddogiseasy.com      Proprietary
Level Platform      www.levelplatforms.com      Proprietary
Kaseya      www.kaseya.com      Proprietary
N-Able      www.n-able.com      Proprietary

Open-Source NMS            

ZenOSS      www.zenoss.com      LAMP based NMS
Nagios      www.nagios.org      LAMP based NMS
JFFNMS      www.jffnms.org      LAMP based NMS
OpenNMS      www.opennms.org      LAMP based NMS
Zabbix      www.zabbix.com      LAMP based NMS
Hyperic HQ      www.hyperic.com      LAMP based NMS
GroundWork      www.groundworkopensource.com      LAMP based NMS

 
LVL 1

Expert Comment

by:rootcoolk
ID: 24078513
Orion APM provides the application monitoring you need to know what's happening with your applications.

For Orion APM, go to http://www.solarwinds.com/products/orion/application_monitor/
For a demo, go to http://oriondemo.solarwinds.com/Orion/Apm/Summary.aspx
 

Author Comment

by:Malevolo
ID: 24108771
Thanks all for the information. I am definitely moving forward with ipMonitor 9 and Orion NPM (with the APM module). My question, which admittedly was all over the place, was along the lines of: how should I structure this for a sound monitoring platform?

How should I handle SNMP traps, and should I integrate Syslog servers? For example, to simplify, if I had only two clients with 5 servers each, what would be the best way to monitor the servers at the application level (Windows critical events in the event log) and at the physical level (a physical disk with a predictive failure or a RAID array in degraded status)? Do I set up one Syslog server for each client and have it collect information from all machines, and then have that Syslog server report to my Orion NPM, or do I need to install the Syslog server on every server and have them report to Orion NPM? How about for SNMP traps? Do I have every server there set to report to me directly, or should I have them all report to one server there and then have that one report to my Orion NPM once thresholds or limits are hit? Thanks.
 
LVL 16

Accepted Solution

by:
SteveJ earned 500 total points
ID: 24108940
"Do I set up one Syslog server for each client and have it collect information from all machines, and then have that syslog server report to my Orion NPM or do I need to install the syslog server on every server and have them report to Orion NPM?"

I think the reason you would have individual syslog servers is simply for backup. On *nix systems you can use syslog-ng, which gives you considerable flexibility in reporting, e.g. keep all syslog events locally and forward "selected" events to a remote syslog master. I suspect there's a similar product for Windows . . .
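
Here is a rough Python sketch of the "keep everything locally, forward selected events" logic, just to show the decision a per-client relay makes. It is not syslog-ng configuration. The function names and the threshold are assumptions; the only standard piece is that the syslog severity is the PRI value modulo 8, with lower numbers being more severe.

```python
import re

PRI_RE = re.compile(r"^<(\d{1,3})>")
FORWARD_AT_OR_BELOW = 3  # assumption: forward emerg (0) through err (3)


def severity_of(line):
    """Return the syslog severity (0-7) from the <PRI> prefix, or None if absent."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    return int(match.group(1)) % 8


def route(lines, threshold=FORWARD_AT_OR_BELOW):
    """Keep every line locally; also forward lines at or above the severity cutoff."""
    local, forward = [], []
    for line in lines:
        local.append(line)  # everything stays at the client site
        severity = severity_of(line)
        if severity is not None and severity <= threshold:
            forward.append(line)  # only these cross the VPN to the master
    return local, forward


events = [
    "<131>srv01 disk: predictive failure on disk 2",  # local0.err
    "<134>srv01 backup: nightly job finished",        # local0.info
]
local, forward = route(events)
print(len(local), len(forward))
```

With this split, routine messages never leave the client network, and only the err-and-worse events are sent on to the central NMS.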

"How about for SNMP traps? Do I have every server there set to report to me directly, or should I have them all report to one server there and then have that one report to my Orion NPM once thresholds or limits are hit?  Thanks."

I think that's a similar issue . . . if all my clients are in the same building, perhaps I'd have a single SNMP trap manager. But if they were spread geographically, I would want to make sure I had local SNMP managers that would simply forward a subset of traps to a master SNMP manager. That said, I don't think I would limit sending traps between sites based on a threshold. I would keep all traps locally and forward those that might require immediate attention . . . which may sound like more or less the same thing. I am thinking in terms of trap severity as opposed to trap volume.
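
As a sketch of what I mean by severity rather than volume, here is the same idea for a local trap manager in Python. The class, the field names, and the severity assigned to each trap are hypothetical, since traps carry no severity of their own and the manager has to assign one. The OIDs are the standard SNMPv2 generic traps. It only models the decision; it does not receive or send real SNMP packets.

```python
from dataclasses import dataclass

# Standard generic trap OIDs; the severity labels are assumptions for illustration.
SEVERITY_BY_TRAP_OID = {
    "1.3.6.1.6.3.1.1.5.1": "warning",   # coldStart
    "1.3.6.1.6.3.1.1.5.3": "critical",  # linkDown
    "1.3.6.1.6.3.1.1.5.4": "info",      # linkUp
    "1.3.6.1.6.3.1.1.5.5": "warning",   # authenticationFailure
}
FORWARD_SEVERITIES = {"critical"}  # assumption: only these go to the master


@dataclass
class Trap:
    source: str
    trap_oid: str


class SiteTrapManager:
    """Hypothetical per-site manager: store every trap, forward by severity."""

    def __init__(self):
        self.local_store = []  # every trap received at this site
        self.outbox = []       # subset queued for the master SNMP manager

    def receive(self, trap):
        self.local_store.append(trap)
        severity = SEVERITY_BY_TRAP_OID.get(trap.trap_oid, "unknown")
        if severity in FORWARD_SEVERITIES:
            self.outbox.append(trap)
        return severity


site = SiteTrapManager()
site.receive(Trap("srv01", "1.3.6.1.6.3.1.1.5.3"))  # linkDown
site.receive(Trap("srv01", "1.3.6.1.6.3.1.1.5.4"))  # linkUp
print(len(site.local_store), len(site.outbox))
```

In practice you would also map the HP and Dell hardware traps (failed disk, degraded array) to a severity, and you may prefer to forward unmapped traps too, so that nothing unexpected gets silently dropped.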

Good luck,
Steve
