Solved

Nagios Monitoring - Windows Services with Spaces or Characters

Posted on 2009-04-06
Last Modified: 2012-05-06
We are using Nagios to monitor system resources and services in our Windows Server environment. Several servers have third-party services installed whose names contain spaces, such as "ColdFusion 8 .NET Service". When Nagios runs the check on one of these, we get the following "Warning" result:

ColdFusion : Unknown

Second issue: services with special characters such as "$" in the name hold up the entire Nagios program, and we are unable to even start Nagios when these characters are in the configuration.

Any workarounds?

Jonathan





Question by:EECOAdministrator
26 Comments
 
LVL 14

Expert Comment

by:Deepak Kosaraju
1. For the space problem, if you can post your service and command config files, we can help better.

2. Yes! Make sure you change use_regexp_matching=0 to 1 and use_true_regexp_matching=0 to 1.

Author Comment

by:EECOAdministrator
Sure thing,

Here are my ColdFusion snippets:

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - ColdFusion 8 Application Server
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l ColdFusion 8 Application Server
        }

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - ColdFusion 8 .NET Service
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l ColdFusion 8 .NET Service
        }

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - ColdFusion 8 ODBC Agent
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l Cold Fusion 8 ODBC Agent
        }

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - ColdFusion 8 ODBC Server
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l ColdFusion 8 ODBC Server
        }

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - ColdFusion 8 Search Server
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l ColdFusion 8 Search Server
        }

And as for your comment on part 2: where does that get set?

LVL 14

Accepted Solution

by:Deepak Kosaraju (earned 250 total points)
Your service definitions look good, but you forgot to post the command definition for check_nt. You have to place the $ARG$ inside double quotes (" ").

LVL 14

Expert Comment

by:Deepak Kosaraju
For the second question, you will find those entries in the nagios.cfg file on the Nagios server.

Author Comment

by:EECOAdministrator
OK, so I have problem 1 solved. For problem 2, I updated the nagios.cfg file to reflect the two changes you mentioned, re-enabled the check for my services with the $ character, and received this error during the pre-flight check:

Checking services...
Error: The description string for service 'Service - MSSQL$MFRAME' on host 'EECOIT02' contains one or more illegal characters.
        Checked 389 services.

Are we close?  Or perhaps I missed something?

Author Comment

by:EECOAdministrator
And here is the config for that service

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - MSSQL$MFRAME
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l "MSSQL$MFRAME"
        }

LVL 14

Expert Comment

by:Deepak Kosaraju
If you had enabled those settings and reloaded Nagios with the new configuration, it should work.
I think the mistake is here:
"MSSQL$MFRAME" in the service definition.
Use " " only inside command definitions. I haven't used "" in any of my service definitions.
One more recommendation: do not use spaces in the check_command of a service definition. Use _ instead of a space, and follow the same naming convention in the nsclient.cfg file on the Windows box.

LVL 14

Expert Comment

by:Deepak Kosaraju
If you can post just your command definition for check_nt, I can help you better.

Author Comment

by:EECOAdministrator
I see. I will mention that I solved problem one with your help, though by accident. This worked for the commands with spaces:

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - ColdFusion 8 ODBC Agent
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l "ColdFusion 8 ODBC Agent"
        }
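
Why the quotes help: the command line that Nagios assembles is handed to a shell, which splits unquoted text on whitespace, so -l would receive only the first word of the service name (which would match the "ColdFusion : Unknown" result in the question). The following is a minimal sketch in Python that uses the standard shlex module to approximate that word splitting. It illustrates POSIX-style tokenizing only, not Nagios internals, and the helper name value_of_l is made up for this example.

```python
import shlex

# Argument text as it appears after "check_nt!SERVICESTATE!" in the thread.
unquoted = '-d SHOWALL -l ColdFusion 8 .NET Service'
quoted = '-d SHOWALL -l "ColdFusion 8 .NET Service"'

def value_of_l(arg_text):
    """Return the word that follows -l after POSIX-style word splitting."""
    words = shlex.split(arg_text)
    return words[words.index('-l') + 1]

print(value_of_l(unquoted))  # only the first word reaches -l
print(value_of_l(quoted))    # the whole service name reaches -l
```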


As for the $ issue, I found that the pre-flight check was erroring on the $ in the service description, not on the actual "MSSQL$MFRAME" entry. However, I still get the following error when Nagios tries to check the service. The message below is from the Nagios web page; that is, Nagios starts up, but the check itself fails.

Service - MSSQL MFRAME    WARNING    04-06-2009 12:06:56    0d 0h 14m 20s    3/3    MSSQL$: Unknown


And this is the code I currently have in the conf file

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - MSSQL MFRAME
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l "MSSQL$MFRAME"
        }



Author Comment

by:EECOAdministrator
Oh, and it just clicked what you were asking for:

# 'check_nt' command definition
define command{
        command_name    check_nt
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
        }


LVL 14

Expert Comment

by:Deepak Kosaraju
You still get the error because Nagios didn't reload, due to the $ issue in the service description.

LVL 14

Expert Comment

by:Deepak Kosaraju
So regular expressions were not enabled for the old instance that was still running.

LVL 14

Expert Comment

by:Deepak Kosaraju
As far as I know, and per my Nagios lead's recommendation, it is best not to use the $ sign in a service description. You can, however, use regular expressions when defining host groups and service groups, like the one I attached to this post.
Make sure your service descriptions are simple and in a form Nagios can deal with ;-).
Your command definition is good. Make sure the same naming conventions (the arguments passed from Nagios) are present in nsclient.cfg.
Nagios-Host-Check-Regexpression.pdf

Author Comment

by:EECOAdministrator
OK, I want to regroup and clarify where I am.

Issue 1: totally resolved by putting quotation marks around the service names that contain spaces.

Issue 2: after updating my nagios.cfg file to reflect the following
use_regexp_matching=1
use_true_regexp_matching=1

I was able to start Nagios with quotes and a $ in the service name.

However, I still get an "MSSQL$: Unknown" error on the service check for the following service (pasted below). I did remove the "$" from the service description, and Nagios starts up with no issue; it's just that I am still not getting past the "$" in the actual service name.

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - MSSQL MFRAME
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l "MSSQL$MFRAME"
        }

So at this point Nagios starts with the $ in the service name, but the actual check fails. I've double-checked and triple-checked the service name on the machine in question to make sure I was typing it correctly, but still no luck.

So at this point I am going to award you half of the points for helping me solve issue one. Will you verify that I am not missing anything from our configs, especially from the check_nt config that I submitted at 12:14pm? I must still be missing something.

At any rate, thanks for checking, and thank you for your help,
Jonathan



LVL 14

Expert Comment

by:Deepak Kosaraju
Can you make sure, on the Windows machine, that the command you are trying to run to fetch the results for MSSQL MFRAME runs without any problem?
The following error is from NSClient++ after executing the command on the Windows box:

WARNING 04-06-2009 12:06:56 0d 0h 14m 20s 3/3 MSSQL$: Unknown

So please cross-check whether you are passing the correct variables to the script on the Windows box.

LVL 13

Expert Comment

by:WizRd-Linux
This is only from memory, but I'm pretty sure that if you use MSSQL$$INSTANCE, this works for Nagios and check_nt. Note the double dollar signs instead of the single one.

LVL 14

Expert Comment

by:Deepak Kosaraju
Did you try my earlier post for the second problem? I don't recommend using regular expressions in a service description, but try the definition below. If you don't see any errors during the validation phase, you are good to go.
If you do see errors regarding the service description, stop and change the service back to normal. If you still see the Unknown error, please cross-check the naming convention of the arguments you are passing to NSClient++.
Manually execute the command on Windows and look at the results. If you get positive output on the Windows box, then try running check_nt manually from the Linux box.

define service{
        use                     generic-service
        host_name               EECOIT02
        service_description     Service - MSSQL_$_$_MFRAME
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l "MSSQL$MFRAME"
        }

Manually running check_nt from the Linux box:

./check_nt -H EECOIT02 -p 12489 -v SERVICESTATE -d SHOWALL -l "MSSQL$MFRAME"

LVL 13

Expert Comment

by:WizRd-Linux
Have you even tried my suggestion?? You didn't respond to my post advising if it was or was not successful.  Requesting the question closed is poor behavior in a community environment.

Author Comment

by:EECOAdministrator
Sorry Guys,
Yes WizRd-Linux I tried your suggestion with no success.

kosarajudeepak, I did try the command line entry and received the same error: MSSQL: Unknown

At this point I am still open to suggestions, but this is not a mission-critical service in our environment, so I have decided to comment out the command on the Linux box.

As for my "poor behavior" in closing the question: I was attempting to award kosarajudeepak half of the points for this question, as he solved half of the problem, while leaving the other half on the table. If I did something wrong by requesting the question closed and awarding half points, then let me know, as I am not trying to cheat anyone out of points.

Jonathan
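
A possible explanation for the "MSSQL: Unknown" result from the manual run: inside double quotes a POSIX shell still expands $MFRAME as a (normally unset) variable, so check_nt would receive only MSSQL. The sketch below, in Python, asks sh what it makes of each quoting style. It assumes a POSIX sh is available on the PATH and that no variable named MFRAME is set; the helper name shell_sees is made up for illustration.

```python
import os
import subprocess

def shell_sees(word):
    """Return the string a POSIX sh passes on after expanding `word`."""
    # Minimal environment, so MFRAME is guaranteed to be unset.
    env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    result = subprocess.run(
        ["sh", "-c", "printf '%s' " + word],
        capture_output=True, text=True, env=env, check=True,
    )
    return result.stdout

print(shell_sees('"MSSQL$MFRAME"'))    # double quotes: $MFRAME is expanded away
print(shell_sees(r'"MSSQL\$MFRAME"'))  # a backslash keeps the dollar sign literal
print(shell_sees("'MSSQL$MFRAME'"))    # single quotes also keep it literal
```

If that is what happened here, typing the service name in single quotes, or writing \$ inside the double quotes, should get the full name to check_nt when the command is run by hand.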

LVL 14

Expert Comment

by:Deepak Kosaraju
Since you tried the command line, something is wrong with the source you are running on the Windows box. Can you cross-validate the code on the Windows box to make sure it is passing the correct arguments at runtime? Also check where in the code the Unknown exit code is defined.

Author Comment

by:EECOAdministrator
I'd love to. How would I go about running that code from the Windows box?

I have other services, processes and health checks running on the server in question through Nagios.  It is just the one service with the "$" that it will not check regardless of what I have tried so far.

LVL 14

Expert Comment

by:Deepak Kosaraju
If you don't mind, can you post the contents of the NSC.ini file on the Windows box?

Author Comment

by:EECOAdministrator
Sure, I don't mind. See the attached code:
[modules]

;# NSCLIENT++ MODULES

;# A list with DLLs to load at startup.

;  You will need to enable some of these for NSClient++ to work.

; ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !

; *                                                               *

; * N O T I C E ! ! ! - Y O U   H A V E   T O   E D I T   T H I S *

; *                                                               *

; ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !

FileLogger.dll

CheckSystem.dll

CheckDisk.dll

NSClientListener.dll

NRPEListener.dll

SysTray.dll

CheckEventLog.dll

CheckHelpers.dll

;CheckWMI.dll

;

; RemoteConfiguration IS AN EXTREM EARLY IDEA SO DONT USE FOR PRODUCTION ENVIROMNEMTS!

;RemoteConfiguration.dll
 

[Settings]

;# OBFUSCATED PASSWORD

;  This is the same as the password option but here you can store the password in an obfuscated manner.

;  *NOTICE* obfuscation is *NOT* the same as encryption, someone with access to this file can still figure out the 

;  password. Its just a bit harder to do it at first glance.

;obfuscated_password=Jw0KAUUdXlAAUwASDAAB

;

;# PASSWORD

;  This is the password (-s) that is required to access NSClient remotely. If you leave this blank everyone will be able to access the daemon remotly.

;password=secret-password

;

;# ALLOWED HOST ADDRESSES

;  This is a comma-delimited list of IP address of hosts that are allowed to talk to the all daemons.

;  If leave this blank anyone can access the deamon remotly (NSClient still requires a valid password).

;  The syntax is host or ip/mask so 192.168.0.0/24 will allow anyone on that subnet access

allowed_hosts=127.0.0.1/32,192.168.16.22/24,192.168.15.24/24

;

;# USE THIS FILE

;  Use the INI file as opposed to the registry if this is 0 and the use_reg in the registry is set to 1 

;  the registry will be used instead.

use_file=1
 

[log]

;# LOG DEBUG

;  Set to 1 if you want debug message printed in the log file (debug messages are always printed to stdout when run with -test)

;debug=1

;

;# LOG FILE

;  The file to print log statements to

;file=NSC.log

;

;# LOG DATE MASK

;  The format to for the date/time part of the log entry written to file.

;date_mask=%Y-%m-%d %H:%M:%S
 
 

[NSClient]

;# ALLOWED HOST ADDRESSES

;  This is a comma-delimited list of IP address of hosts that are allowed to talk to NSClient deamon.

;  If you leave this blank the global version will be used instead.

;allowed_hosts=

;

;# NSCLIENT PORT NUMBER

;  This is the port the NSClientListener.dll will listen to.

port=12489

;

;# BIND TO ADDRESS

;  Allows you to bind server to a specific local address. This has to be a dotted ip adress not a hostname.

;  Leaving this blank will bind to all avalible IP adresses.

;bind_to_address=

;

;# SOCKET TIMEOUT

;  Timeout when reading packets on incoming sockets. If the data has not arrived withint this time we will bail out.

;socket_timeout=30
 
 

[Check System]

;# CPU BUFFER SIZE

;  Can be anything ranging from 1s (for 1 second) to 10w for 10 weeks. Notice that a larger buffer will waste memory 

;  so don't use a larger buffer then you need (ie. the longest check you do +1).

;CPUBufferSize=1h

;

;# CHECK RESOLUTION

;  The resolution to check values (currently only CPU).

;  The value is entered in 1/10:th of a second and the default is 10 (which means ones every second)

;CheckResolution=10

;

;# CHECK ALL SERVICES

;  Configure how to check services when a CheckAll is performed.

;  ...=started means services in that class *has* to be running.

;  ...=stopped means services in that class has to be stopped.

;  ...=ignored means services in this class will be ignored.

;check_all_services[SERVICE_BOOT_START]=ignored

;check_all_services[SERVICE_SYSTEM_START]=ignored

;check_all_services[SERVICE_AUTO_START]=started

;check_all_services[SERVICE_DEMAND_START]=ignored

;check_all_services[SERVICE_DISABLED]=stopped
 
 

[NRPE]

;# NRPE PORT NUMBER

;  This is the port the NRPEListener.dll will listen to.

;port=5666

;

;# COMMAND TIMEOUT

;  This specifies the maximum number of seconds that the NRPE daemon will allow plug-ins to finish executing before killing them off.

;command_timeout=60

;

;# COMMAND ARGUMENT PROCESSING

;  This option determines whether or not the NRPE daemon will allow clients to specify arguments to commands that are executed.

;allow_arguments=0

;

;# COMMAND ALLOW NASTY META CHARS

;  This option determines whether or not the NRPE daemon will allow clients to specify nasty (as in |`&><'"\[]{}) characters in arguments.

;allow_nasty_meta_chars=0

;

;# USE SSL SOCKET

;  This option controls if SSL should be used on the socket.

;use_ssl=1

;

;# BIND TO ADDRESS

;  Allows you to bind server to a specific local address. This has to be a dotted ip adress not a hostname.

;  Leaving this blank will bind to all avalible IP adresses.

; bind_to_address=

;

;# ALLOWED HOST ADDRESSES

;  This is a comma-delimited list of IP address of hosts that are allowed to talk to NRPE deamon.

;  If you leave this blank the global version will be used instead.

;allowed_hosts=

;

;# SCRIPT DIRECTORY

;  All files in this directory will become check commands.

;  *WARNING* This is undoubtedly dangerous so use with care!

;script_dir=scripts\

;

;# SOCKET TIMEOUT

;  Timeout when reading packets on incoming sockets. If the data has not arrived withint this time we will bail out.

;socket_timeout=30
 
 
 

[NRPE Handlers]

;# COMMAND DEFINITIONS

;# Command definitions that this daemon will run.

;# Can be either NRPE syntax:

;command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10

;# Or simplified syntax:

;test=c:\test.bat foo $ARG1$ bar

;check_disk1=/usr/local/nagios/libexec/check_disk -w 5 -c 10

;# Or even loopback (inject) syntax (to run internal commands)

;# This is a way to run "NSClient" commands and other internal module commands such as check eventlog etc.

;check_cpu=inject checkCPU warn=80 crit=90 5 10 15

;check_eventlog=inject CheckEventLog Application warn.require.eventType=error warn.require.eventType=warning critical.require.eventType=error critical.exclude.eventType=info truncate=1024 descriptions

;check_disk_c=inject CheckFileSize ShowAll MaxWarn=1024M MaxCrit=4096M File:WIN=c:\ATI\*.*

;# But be careful:

; dont_check=inject dont_check This will "loop forever" so be careful with the inject command...

;# Check some escapings...

; check_escape=inject CheckFileSize ShowAll MaxWarn=1024M MaxCrit=4096M "File: foo \" WIN=c:\\WINDOWS\\*.*"

;# Some real world samples

;nrpe_cpu=inject checkCPU warn=80 crit=90 5 10 15

;nrpe_ok=scripts\ok.bat

;check_multi_line=scripts\multi_line.bat

;#

;# The sample scripts

;#

;check_long=scripts\long.bat

;check_ok=scripts\ok.bat

;check_nok=scripts\xlong.bat

;check_vbs=cscript.exe //T:30 //NoLogo scripts\check_vb.vbs
 
 

; [includes]

;# The order when used is "reversed" thus the last included file will be "first"

;# Included files can include other files (be carefull only do basic recursive checking)

;

; myotherfile.ini

; real.ini

Author Closing Comment

by:EECOAdministrator
Part one of my question was solved by placing quotes around the Service name for services containing spaces.

Part two of my question was never resolved.  I did try all of the suggestions with no success and will award kosarajudeepak the points originally offered with this question.

Thanks for your help and suggestions.

Jonathan

Expert Comment

by:lonza01
Hi,

I ran into this same issue - again with a SQL Server instance with a "$" character in the service name.  After much head-scratching I was able to resolve it as follows.

I have a command definition "check_nt_service":

define command {
                command_name                          check_nt_service
                command_line                          $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s SecretPassword -v SERVICESTATE -l $ARG1$
}

Then for the SQL Server Host, I have the following service definition:

define service {
                service_description                   check_nt_service
                check_command                         check_nt_service!MSSQL\\$$SHAREPOINT
; etc. etc. etc.
;
}

Hope this helps?

Expert Comment

by:Honaco
Hi,
It will work like this:

check_command                 check_nt!SERVICESTATE! -d SHOWALL -l "MSSQL\\$$SHAREPOINT"

I am using it and it works.
Thank you lonza01!
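
Some background on why this form may work: Nagios treats $ as its macro delimiter, and its documentation says a literal dollar sign on a command line has to be written as $$. The pattern reported above additionally puts \\ in front of it, presumably so that a backslash-escaped \$ is what finally reaches the shell; that reading is an assumption and is not confirmed anywhere in this thread. As a minimal sketch, the hypothetical Python helper below (nagios_service_arg is not part of Nagios) rewrites a Windows service name into the reported form, which may be handy when generating many service definitions.

```python
def nagios_service_arg(service_name):
    """Rewrite each dollar sign in the form reported to work in this thread."""
    # Raw string: two backslashes followed by two dollar signs.
    return service_name.replace("$", r"\\$$")

print(nagios_service_arg("MSSQL$SHAREPOINT"))
print(nagios_service_arg("MSSQL$MFRAME"))
```

Names without a dollar sign pass through unchanged, so the same helper could be applied to every service name; whether the result is accepted should still be verified with the Nagios pre-flight check.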
