Solved

Best practices for a NMS solution

Posted on 2009-04-01
Last Modified: 2012-06-27
I've been using SolarWinds ipMonitor 8.5 for a while now to remotely monitor some servers via SNMP; however, the system isn't very efficient and doesn't give me what I'm looking for in some cases. I am going to roll out a new instance of ipMonitor 9.x shortly and am currently tinkering with SolarWinds Orion NPM 9. See below for additional details, and eventually my question =)

I have ~35 clients that I manage. An average client has between 3 and 5 servers. We currently have one-way VPN connections to all clients so we can remote in for work. We could set up rules for two-way communication if required.

All clients are running HP & Dell systems. All systems have their Vendor Management utilities installed. Windows servers across the board.  About 20% of our servers are running VMware ESX stand-alone servers.

I'm looking for monitoring that will alert me when any physical hardware has issues (e.g. a dead physical disk or a predictive failure) in addition to Windows/software alerting (e.g. partitions running low on space, services not running).

Not to point out the obvious, but the key is to achieve a reliable system in the most network-efficient manner possible. I assume this would involve setting up some SNMP traps instead of polling every server every 5 minutes, but I'm not sure exactly how. I'm completely open to any and all ideas. Any references and/or documentation are much appreciated as well. Ultimately, if someone could draft a model for me that says "for every client you should do X, Y, Z, and then A, B, C from my end" to get this up and running, it would be great. Thanks in advance.
Question by:Malevolo
5 Comments
 
LVL 16

Expert Comment

by:SteveJ
ID: 24051731
Take a look at Zenoss. I think it has most of what you are looking for. Since you are running Windows boxes, I'd look at some of the free utilities that convert your event logs into either syslog messages or SNMP traps. I'd also look at some of the low-cost Syslog message parsers that can take a flat log file and send messages in Syslog format . . . which you can then parse and raise alarms on, using MySQL or some inexpensive Syslog-to-SNMP trap facility.
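To make the event-log-to-syslog idea a bit more concrete, here is a minimal Python sketch of what such a converter does under the hood. It is not any particular utility's implementation: the field names (level, host, source, text), the level-to-severity mapping, the choice of the local0 facility, and the `send_udp` helper are all assumptions made for illustration. The message layout follows the classic BSD syslog format (RFC 3164).

```python
import socket
from datetime import datetime

# Standard syslog severity codes. The mapping from event-log level names
# to these codes is an assumption made for this sketch.
SEVERITY = {"critical": 2, "error": 3, "warning": 4, "information": 6}
FACILITY_LOCAL0 = 16  # local0, assumed here as the facility for custom sources


def to_syslog(level: str, host: str, source: str, text: str, when: datetime) -> str:
    """Format an event-log style record as a BSD (RFC 3164) syslog line."""
    # PRI is facility * 8 + severity, written in angle brackets.
    pri = FACILITY_LOCAL0 * 8 + SEVERITY[level.lower()]
    # RFC 3164 timestamps look like "Apr  1 09:30:00" (day padded with a space).
    stamp = f"{when:%b} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {source}: {text}"


def send_udp(line: str, collector: str, port: int = 514) -> None:
    """Send one syslog line to a collector over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (collector, port))


if __name__ == "__main__":
    # Hypothetical record; print it rather than send it anywhere.
    print(to_syslog("Error", "srv01", "disk", "Predictive failure on disk 2",
                    datetime(2009, 4, 1, 9, 30, 0)))
```

Once events arrive in this shape, whatever sits on the receiving end only has to look at the number in the angle brackets to decide how loudly to complain.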

There's a lot of work to do to put a complete system together like you are looking for. I have done this and I charged the clients lots of money to a) set it up and b) keep it running and up to date. So I charge an upfront fee and a monthly recurring fee.

I have not found anything that does everything I want it to do. Zenoss (OpenNMS is also free, but the SNMP trap integration is a little chaotic) is free and beats the heck out of paying for HP OpenView or HP OpenView Operations. HP OVO is a fairly good product and can let you know if certain processes aren't running and can restart them if you want . . . but it is pricey and requires a LOT of attention. I used it when I worked for GTE/Verizon, and the Oracle and WebLogic integration eventually defeated me, so I wrote my own Java apps that did precisely what I wanted.

One of the attractive features of Zenoss (most NMSs have this feature, but it is very simple in Zenoss) is the ability to monitor scads of devices but only generate alerts on selected devices, and then only on things I care about enough to be paged in the middle of the night.
 
LVL 32

Expert Comment

by:Kamran Arshad
ID: 24058528
Hi,

I can recommend a few options for a service-based NMS:

Proprietary NMS

SolarWinds Orion      www.solarwinds.com      Proprietary
Smarts      www.smarts.com      Proprietary
WhatsupGold      www.whatsupgold.com      Proprietary
EM7      www.sciencelogic.com      Proprietary
CA      ca.com      Proprietary
ServerAlive      www.woodstone.nu      Proprietary
Observer      www.netinst.com      Proprietary

Service Monitors            

Hound-Dog      www.hounddogiseasy.com      Proprietary
Level Platform      www.levelplatforms.com      Proprietary
Kaseya      www.kaseya.com      Proprietary
N-Able      www.n-able.com      Proprietary

Open-Source NMS            

ZenOSS      www.zenoss.com      LAMP based NMS
Nagios      www.nagios.org      LAMP based NMS
JFFNMS      www.jffnms.org      LAMP based NMS
OpenNMS      www.opennms.org      LAMP based NMS
Zabbix      www.zabbix.com      LAMP based NMS
Hyperic HQ      www.hyperic.com      LAMP based NMS
GroundWork      www.groundworkopensource.com      LAMP based NMS

 
LVL 1

Expert Comment

by:rootcoolk
ID: 24078513
Orion APM provides the application monitoring you need to know what's happening with your applications.

For Orion APM, go to http://www.solarwinds.com/products/orion/application_monitor/
For a demo, go to http://oriondemo.solarwinds.com/Orion/Apm/Summary.aspx
 

Author Comment

by:Malevolo
ID: 24108771
Thanks all for the information. I am definitely moving forward with ipMonitor 9 and Orion NPM (with the APM module). My question, which admittedly was all over the place, was along the lines of: how should I structure this for a sound monitoring platform?

How should I handle SNMP traps, and should I integrate Syslog servers? For example, to simplify, if I had only two clients with 5 servers each, what would be the best way to monitor the servers at the application level (Windows critical events in the event log) and at the physical level (a physical disk with a predictive failure or a RAID array in degraded status)? Do I set up one Syslog server for each client and have it collect information from all machines, and then have that syslog server report to my Orion NPM, or do I need to install the syslog server on every server and have them report to Orion NPM? How about for SNMP traps? Do I have every server there set to report to me directly, or should I have them all report to one server there and then have that one report to my Orion NPM once thresholds or limits are hit? Thanks.
 
LVL 16

Accepted Solution

by:
SteveJ earned 1500 total points
ID: 24108940
"Do I set up one Syslog server for each client and have it collect information from all machines, and then have that syslog server report to my Orion NPM or do I need to install the syslog server on every server and have them report to Orion NPM?"

I think the reason you would have individual syslog servers is simply for backup. On *nix systems you can use syslog-ng, where there's considerable flexibility in reporting, e.g. keep all syslog events locally and forward "selected" events to a remote syslog master. I suspect there's a similar product for Windows . . .
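The "keep everything locally, forward only selected events" pattern can be sketched in a few lines of Python. This is not syslog-ng configuration, just an illustration of the logic a per-client relay would apply; the `relay` function, its `archive` and `forward` callbacks, and the default cut-off of severity 3 (error) are assumptions for the example. The only protocol fact it relies on is that syslog severity is the PRI value modulo 8, with lower numbers being more urgent.

```python
import re

# Leading "<PRI>" field of a syslog line, e.g. "<131>".
PRI_RE = re.compile(r"^<(\d{1,3})>")


def severity_of(line: str) -> int:
    """Return the syslog severity (0-7) from the leading <PRI>.

    Lines without a PRI are treated as severity 5 (notice).
    """
    match = PRI_RE.match(line)
    if not match:
        return 5
    return int(match.group(1)) % 8


def relay(lines, archive, forward, max_severity=3):
    """Archive every line locally; forward only the urgent ones upstream.

    archive and forward are caller-supplied callables (hypothetical hooks,
    e.g. "append to a local file" and "send to the central collector").
    """
    for line in lines:
        archive(line)
        if severity_of(line) <= max_severity:
            forward(line)


if __name__ == "__main__":
    kept, sent = [], []
    relay(["<131>disk error", "<134>service started"], kept.append, sent.append)
    print(len(kept), "archived;", len(sent), "forwarded")
```

With this shape, the VPN link to the central server only carries the events someone actually needs to act on, while the full history stays at the client site.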

"How about for SNMP traps? Do I have every server there set to report to me directly, or should I have them all report to one server there and then have that one report to my Orion NPM once thresholds or limits are hit?  Thanks."

I think that's a similar issue . . . if all my clients are in the same building, perhaps I'd have a single SNMP trap manager. But if they were spread geographically, I would want to make sure I had local SNMP managers that would simply forward a subset of traps to a master SNMP manager. That said, I don't think I would limit sending traps between sites based on a threshold. I would keep all traps locally and forward those that might require immediate attention . . . which may sound like more or less the same thing. I am thinking in terms of trap severity as opposed to trap volume.
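To show the difference between filtering on severity and filtering on volume, here is a small Python sketch. The trap records are plain dictionaries, and the trap names, severity labels, and the threshold of 100 are all made up for the example; a real trap manager would derive severity from the trap OID and its own configuration.

```python
from collections import Counter

# Hypothetical severity scale for this sketch; higher rank is more urgent.
SEVERITY_RANK = {"info": 0, "warning": 1, "major": 2, "critical": 3}


def forward_by_severity(traps, minimum="major"):
    """Return the traps a local manager would pass on to the master."""
    floor = SEVERITY_RANK[minimum]
    return [t for t in traps if SEVERITY_RANK[t["severity"]] >= floor]


def forward_by_volume(traps, threshold=100):
    """Return the trap names whose count reached the volume threshold."""
    counts = Counter(t["name"] for t in traps)
    return sorted(name for name, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # 150 low-priority traps plus a single critical one.
    site_traps = [{"name": "linkFlap", "severity": "info"}] * 150
    site_traps.append({"name": "diskPredictiveFailure", "severity": "critical"})

    print([t["name"] for t in forward_by_severity(site_traps)])
    print(forward_by_volume(site_traps))
```

In this made-up data set the severity filter passes on the single disk trap and ignores the noise, while the volume filter reports only the noisy trap and misses the disk entirely, which is why severity is the criterion to forward on.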

Good luck,
Steve


Question has a verified solution.
