I use RRDtool with Nagios/nagiosgraph to collect performance data over time. My problem is that whenever one of the monitored devices reboots, the data shows a huge spike.
I'm using the COUNTER data source type in RRDtool, so it subtracts the previous value from the current value to figure out how many events have occurred since the counter was last queried. The problem comes when the router/computer reboots: the counter restarts at 0, and that value gets compared against the much larger reading from before the reboot. For instance, I have it monitoring users per second on our web server, and when the server reboots for whatever reason it shows a spike of 6M users/sec. This rescales the whole graph around the spike, and it becomes useless.
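To illustrate what I think is going on, here is a rough Python sketch. It is not RRDtool's actual code: the function name, the sample readings, and the 300-second poll interval are made up, and I'm assuming a drop in the raw value is treated as a 32-bit counter wrap.

```python
# Rough model of how a COUNTER-style rate is derived between two polls.
# Assumption: a drop in the raw value is treated as a 32-bit counter wrap.
WRAP_32 = 2 ** 32


def counter_rate(prev, cur, seconds):
    """Return events per second between two raw counter readings."""
    delta = cur - prev
    if delta < 0:
        # The counter went "backwards", so assume it overflowed and wrapped.
        delta += WRAP_32
    return delta / seconds


# Normal poll: 1500 new events over 300 seconds.
normal = counter_rate(1_000_000, 1_001_500, 300)

# Poll right after a reboot: the counter restarted near zero.
after_reboot = counter_rate(1_001_500, 200, 300)

print(normal, after_reboot)
```

The second rate comes out in the millions per second, which is the kind of spike I'm seeing.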
Is there a way to set RRDtool to ignore values that are less than the previous value? A counter should only ever go up, so this seems like an easy way to fix the issue. I think MRTG does this by default when dealing with counters.
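In other words, the behavior I'm after looks roughly like this Python sketch. The function name and readings are hypothetical, and this is only a model of what I want, not an existing RRDtool feature: a reading lower than the previous one is recorded as unknown instead of being turned into a rate.

```python
def rate_ignoring_resets(prev, cur, seconds):
    """Return events per second, or None (unknown) if the counter went down."""
    if cur < prev:
        # A counter should never decrease, so treat this as a reset and skip it.
        return None
    return (cur - prev) / seconds


# Raw counter readings taken 300 seconds apart; the third follows a reboot.
readings = [1_000_000, 1_001_500, 200, 1_700]

rates = [
    rate_ignoring_resets(prev, cur, 300)
    for prev, cur in zip(readings, readings[1:])
]
print(rates)
```

Only the interval that spans the reboot is lost; the intervals before and after it still give normal rates.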
Two things I've read about that I don't want to implement.
Cleaning the values.
I know how to clean the values out with a script, or by doing an rrdtool dump, removing them by hand, and reimporting the result, but as you can imagine that isn't something I want to do after every reboot.
Setting upper limits.
You can have RRDtool disregard values that fall outside a specific range, but the problem is that I'd have to set specific limits on each data source, and to be honest, for some of them I don't know what reasonable limits would be. I'm afraid I'd lose useful data if the real values ever went over my limits.
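Here is a small Python sketch of why the limits approach worries me. The limit of 100 and the sample rates are invented for illustration, and the function is only a model of a range check, not RRDtool code.

```python
def apply_limits(rate, minimum=0.0, maximum=100.0):
    """Return the rate, or None (unknown) when it falls outside the range."""
    if rate < minimum or rate > maximum:
        return None
    return rate


# A normal rate, a reboot spike, and a genuine traffic burst.
samples = [5.0, 14_313_219.9, 250.0]

filtered = [apply_limits(rate) for rate in samples]
print(filtered)
```

The reboot spike is dropped, but so is the genuine burst of 250, because I guessed the maximum too low.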
So does anyone know how to set this up to ignore entries smaller than the last one?