td_miles

nagios historical data

Hi,

I'm using nagios to monitor some hosts. At this stage it is very basic, just checking whether the hosts are responding (not checking services yet).

The problem I'm having is that when I run an availability report, it only gives me the current day's worth of info. I've looked through the config files, but can't find any setting that controls this.

I previously had log_rotation_method=w (rotate logs weekly). I changed this to no rotation, but it made no difference.

All I want to do is retain historical information.

I'm using nagios 1.0, as that is what the machine had installed on it. Given the speed of the machine, I don't really want to have to compile 1.2, but if that will fix the problem, then I'll probably have to start the compile late one night, let it go and hope that it finishes by morning.

(500 points as I want to get it sorted asap)
wesly_chen

Hi,

Do you have the following line in the config file?
log_archive_path=/var/log/nagios/archives   <== path may vary
That's where the historical log files are kept.

Regards,

Wesly
td_miles

ASKER

yes, I have:

log_archive_path=/usr/local/nagios/var/archives

If I look in that location, there is a file for each day, e.g. "nagios-11-24-2004-00.log", that is 33 bytes in size and contains only a single line:

[1101218400] LOG ROTATION: DAILY

Which seems strange, as I never had log rotation set to daily (as I said above, it was originally weekly and then I tried none).

So it looks like nagios is rotating the logs daily, but not actually putting anything in the archive logs (which explains why I can't get any historical data). What would be the reason for this, and how can I fix it?
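
For anyone reading along: the bracketed number at the start of a Nagios log line is a Unix epoch timestamp. Here is a minimal sketch for decoding it (the helper name parse_nagios_line is made up for illustration and is not part of Nagios). The result is in UTC, so the local time depends on the server's time zone.

```python
import re
from datetime import datetime, timezone

# Matches lines such as "[1101218400] LOG ROTATION: DAILY".
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")

def parse_nagios_line(line):
    """Split a Nagios log line into (UTC datetime, message). Hypothetical helper."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a Nagios log line: {line!r}")
    stamp = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return stamp, match.group(2)

when, message = parse_nagios_line("[1101218400] LOG ROTATION: DAILY")
print(when.isoformat(), message)
# prints: 2004-11-23T14:00:00+00:00 LOG ROTATION: DAILY
```

14:00 UTC on 23 November is midnight on 24 November for a server running at UTC+10, which would be consistent with the archive file name above.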
Hi,

  Could you post your config file here?

Wesly
the whole nagios.cfg file ?
Or do
grep -i log nagios.cfg
(the whole cfg file will be better)

Wesly
To make it a little smaller, I did a "grep -v \# nagios.cfg" to get rid of all the comments, which make up the bulk of the file. You get all of the parameters, just no comments.



==================

log_file=/usr/local/nagios/var/nagios.log

cfg_file=/usr/local/nagios/etc/checkcommands.cfg

cfg_file=/usr/local/nagios/etc/misccommands.cfg

cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/dependencies.cfg
cfg_file=/usr/local/nagios/etc/escalations.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/services.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg

resource_file=/usr/local/nagios/etc/resource.cfg

status_file=/usr/local/nagios/var/status.log

nagios_user=nagios

nagios_group=nagios

check_external_commands=1

command_check_interval=5s

command_file=/usr/local/nagios/var/rw/nagios.cmd

comment_file=/usr/local/nagios/var/comment.log

downtime_file=/usr/local/nagios/var/downtime.log

lock_file=/usr/local/nagios/var/nagios.lock

temp_file=/usr/local/nagios/var/nagios.tmp

log_rotation_method=n

log_archive_path=/usr/local/nagios/var/archives

use_syslog=1

log_notifications=1

log_service_retries=1

log_host_retries=1

log_event_handlers=1

log_initial_states=0

log_external_commands=1

log_passive_service_checks=1

inter_check_delay_method=15

service_interleave_factor=s

max_concurrent_checks=0

service_reaper_frequency=10

sleep_time=1

service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5

retain_state_information=1

state_retention_file=/usr/local/nagios/var/status.sav

retention_update_interval=60

use_retained_program_state=1

interval_length=60

use_agressive_host_checking=0

execute_service_checks=1

accept_passive_service_checks=1

enable_notifications=1

enable_event_handlers=1

process_performance_data=0

obsess_over_services=0

check_for_orphaned_services=0

check_service_freshness=1

freshness_check_interval=60

aggregate_status_updates=1

status_update_interval=15

enable_flap_detection=0

low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0

date_format=euro

illegal_object_name_chars=`~!$%^&*|'"<>?,()=

illegal_macro_output_chars=`~$&|'"<>

admin_email=root

admin_pager=pagenagios
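
As an aside, the same comment-stripping step can be sketched in Python, which also makes it easy to pull out just the logging directives. This is only an illustration: parse_nagios_cfg is a made-up helper, and the sample text is a shortened copy of the listing above. Unlike "grep -v \#", it only skips lines that start with "#", and it keeps repeated directives such as cfg_file as a list.

```python
from collections import defaultdict

def parse_nagios_cfg(text):
    """Return {directive: [values]} from nagios.cfg-style text, skipping comments."""
    settings = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        settings[key.strip()].append(value.strip())
    return dict(settings)

# Shortened sample based on the config posted above.
sample = """\
# comment
log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/services.cfg
log_rotation_method=n
log_archive_path=/usr/local/nagios/var/archives
"""
cfg = parse_nagios_cfg(sample)
for name in ("log_file", "log_rotation_method", "log_archive_path"):
    print(name, "=", cfg.get(name))
```

Splitting on the first "=" only means values that themselves contain "=" (such as illegal_object_name_chars) are kept whole.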

> use_syslog=1
So it should also log to syslog (/var/log/messages). Do you see any nagios messages in /var/log/messages.X (X could be 0, 1, 2, 3...)?

The nagios.cfg file looks OK to me, unless another program using /usr/local/nagios/var/rw/nagios.cmd or a cron job is moving the log file somewhere else.

Wesly
plenty of nagios notifications are going into /var/log/messages

there is nothing in the file /usr/local/nagios/var/rw/nagios.cmd

nothing related to nagios in cron

Looks like I will have to suffer the pain of compiling a new version of Nagios in the hope it will resolve the issue ?
> use_syslog=1
Set to
use_syslog=0
log_rotation_method=d

Then watch for a couple of days to see whether rotated logs appear in /usr/local/nagios/var/archives.

If not, then it's a bug (weird). Go ahead and compile the latest version.

Wesly
ok, have changed those two parameters, restarted nagios service. will wait and see (expect next post after weekend)
Still no archived logs, so I recompiled. Still no logs, so I deleted the archives directory and recreated it. Still no logs.

I'm at a bit of a loss. I now have the latest version and it's still not working. Maybe I should junk my config file and start again from scratch, in case there is something wrong in it that isn't obvious?

I should mention, the logs are being rotated into the archives dir, but they still only contain a single line at the top of the file with log rotation date.
To try a few more options, I decided to change the rotation interval to hourly and IT WORKS ! So I'm getting hourly log rotations (with data in them). I should try setting it back to daily (or weekly) to see if that will work now, but I'm almost not game.
>  interval to hourly and IT WORKS
Could it be that there is too much information (the log file is too big) and the rotation fails?

Wesly
Don't think so, the hourly rotations are 15k files (MAX). I don't have any other installations of Nagios to compare to, but I wouldn't have thought that 15k is a big log file. Apache easily rotates log files in the 100MB+ range.
The reason for not wanting to keep doing hourly archives is:

1. I'll end up with a bucket load of files
2. When I do a report on historical data it will take longer as it needs to open/close 24 times as many files as it would if they were daily log files.
something else that is strange (that I just noticed) is that it still appears to be failing (deleting all the content) for the midnight rotation.

-rw-rw-r--    1 nagios   nagios       2.6k Nov 29 22:58 nagios-11-29-2004-23.log
-rw-rw-r--    1 nagios   nagios         34 Nov 30 00:00 nagios-11-30-2004-00.log
-rw-rw-r--    1 nagios   nagios       2.5k Nov 30 00:59 nagios-11-30-2004-01.log

You can see that the files on either side (23 & 01) are a few k in size, whereas the midnight (00) one is 34 bytes. It contains the same one-line header that I posted previously:

[1101736800] LOG ROTATION: HOURLY
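
To spot every affected archive rather than eyeballing file sizes, something like the following could work. It is a rough sketch: header_only_archives is a made-up helper, it assumes the nagios-*.log naming shown in the listing above, and the demo file contents are invented. It flags any archive whose only non-blank line is the rotation header.

```python
import re
import tempfile
from pathlib import Path

# A lone header line such as "[1101736800] LOG ROTATION: HOURLY".
HEADER_RE = re.compile(r"^\[\d+\] LOG ROTATION: \w+$")

def header_only_archives(archive_dir):
    """Return sorted names of archive logs that contain nothing but the header."""
    suspects = []
    for path in sorted(Path(archive_dir).glob("nagios-*.log")):
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        if len(lines) == 1 and HEADER_RE.match(lines[0]):
            suspects.append(path.name)
    return suspects

# Demo on a throwaway directory; the file contents below are made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "nagios-11-29-2004-23.log").write_text(
        "[1101733200] LOG ROTATION: HOURLY\n"
        "[1101733300] HOST ALERT: example;DOWN;HARD;1;made-up entry\n"
    )
    (demo_dir / "nagios-11-30-2004-00.log").write_text(
        "[1101736800] LOG ROTATION: HOURLY\n"
    )
    demo_result = header_only_archives(demo_dir)

print(demo_result)
# prints: ['nagios-11-30-2004-00.log']
```

On the real server the call would be header_only_archives("/usr/local/nagios/var/archives"), run as a user that can read that directory.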
ASKER CERTIFIED SOLUTION
wesly_chen

My apologies for the delay in getting back to this Q.

in /var/log/messages

Jan  4 00:00:00 wonderwoman nagios: LOG ROTATION: DAILY
Jan  4 00:00:00 wonderwoman nagios: LOG ROTATION: DAILY

(wonderwoman is the machine's name)

What is strange is that on the hourly rotations it isn't logging anything, yet on this one (daily at midnight) it logs it twice. This could mean it is actually rotating the logs twice (I don't know why it would, but it might), which would explain why there is no info in the logs: the first rotation archives the data, then the second rotates the new (empty) log over the top, and there goes the data.
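
A quick way to check for this kind of doubled entry across a whole syslog file is to count identical rotation lines. This is a sketch only: repeated_rotations is a made-up helper, and the sample lines are the two quoted above. A repeated line shows that the message was logged twice; it does not by itself prove that the rotation ran twice.

```python
from collections import Counter

def repeated_rotations(syslog_lines):
    """Return {line: count} for LOG ROTATION lines that appear more than once."""
    counts = Counter(
        line.strip() for line in syslog_lines if "LOG ROTATION:" in line
    )
    return {line: n for line, n in counts.items() if n > 1}

sample = [
    "Jan  4 00:00:00 wonderwoman nagios: LOG ROTATION: DAILY",
    "Jan  4 00:00:00 wonderwoman nagios: LOG ROTATION: DAILY",
]
dupes = repeated_rotations(sample)
print(dupes)
```

For a real file, pass the open file object instead of the sample list, for example repeated_rotations(open("/var/log/messages")).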

Don't worry about trying to work out why it's doing this, I'm not going to. The hourly logs (minus the hour from 23:00 - 00:00) will suffice for what I'm doing.

Thanks.