Solved

RRDTool spikes when using counters

Posted on 2008-10-28
Last Modified: 2012-05-05
I use RRDTool with Nagios/nagiosgraph to collect performance data over time. My problem is that whenever one of the monitored devices reboots, the graph shows a spike.

I'm using the COUNTER type in RRDTool, so it subtracts the previous value from the current value to work out how many events have occurred since the counter was last queried. The problem comes when the router or computer reboots: the counter restarts at 0 and that value is compared with the last value read before the reboot. For instance, I monitor users per second on our web server, and whenever the server reboots the graph shows a spike of about 6M users/sec. That throws the scale of the graph off so badly that it becomes useless.
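To make the spike concrete, here is a minimal sketch of the arithmetic described above. It is a simplified model, not RRDTool's actual code: it assumes a 32-bit counter, a 300-second polling interval, and that any decrease is treated as a counter overflow. The function name `counter_rate` and the sample values are made up for illustration.

```python
def counter_rate(prev, curr, step_seconds, bits=32):
    """Per-second rate from two counter readings, assuming that a
    decrease means the counter overflowed (simplified model)."""
    delta = curr - prev
    if delta < 0:
        # The new reading is smaller than the old one: assume the counter
        # wrapped around and add the full counter range to compensate.
        delta += 2 ** bits
    return delta / step_seconds


# Normal case: 3000 new events in 300 seconds.
normal = counter_rate(1_000_000, 1_003_000, 300)

# Reboot case: the counter restarts near zero, which is indistinguishable
# from an overflow in this model, so the computed rate is enormous.
after_reboot = counter_rate(2_500_000_000, 120, 300)

print(normal)               # 10.0
print(round(after_reboot))  # 5983225
```

With these assumed numbers the reboot produces a rate of roughly 6M per second, the same order of magnitude as the spike described in the question.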

Is there a way to set rrdtool to ignore values that are less than the previous value? A counter should only ever go up, so this seems like an easy way to fix the issue. I think that MRTG does this by default when dealing with counters.

There are two approaches I've read about that I don't want to implement.
Cleaning the values.
I know how to clean the values out, either with a script or by doing an rrdtool dump, removing the bad values by hand, and reimporting the file, but as you can imagine this isn't cool.
Setting upper limits.
You can have RRDTool disregard values that fall outside a specific range, but then I'd have to set specific limits on each data source, and to be honest for some of them I don't know what reasonable limits would be. I'm afraid I'd lose useful data if a value actually goes over my limits.
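The trade-off with upper limits can be sketched as follows. This is only an illustration of the idea of discarding out-of-range rates, not RRDTool's implementation; the function `apply_limits`, the limit of 100 users/sec, and the sample rates are all hypothetical.

```python
def apply_limits(rate, minimum=None, maximum=None):
    """Return the rate unchanged if it lies within [minimum, maximum],
    otherwise None (standing in for an 'unknown' sample)."""
    if rate is None:
        return None
    if minimum is not None and rate < minimum:
        return None
    if maximum is not None and rate > maximum:
        return None
    return rate


# Hypothetical limit: anything above 100 users/sec is treated as bogus.
spike = apply_limits(5983224.72, minimum=0, maximum=100)   # reboot spike
typical = apply_limits(42.0, minimum=0, maximum=100)       # ordinary load
burst = apply_limits(150.0, minimum=0, maximum=100)        # real but unusual

print(spike, typical, burst)  # None 42.0 None
```

The last case is the concern raised above: a limit chosen too low discards a genuine burst along with the reboot spike.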

So does anyone know how to set this to ignore entries smaller than the last one?

Question by:pubdyn
LVL 27

Accepted Solution

by:
Nopius earned 500 total points
ID: 22918970
> Is there a way to set rrdtool to ignore values that are less than the previous value?

No. This behavior is correct: many counters legitimately overflow, some as often as once a day.

> I think that MRTG does this by default when dealing with counters.

No, it does not. It has exactly the same problem (or feature). I have dealt with cases where network counters really do wrap through 0, and there this behavior is correct.

> Cleaning the values.
> I know how to clean the values out via script or via a rrdtool dump and then
> taking them out by hand and reimporting it, but as you can imagine this isn't cool

It is quite reasonable. Why should it be cool? :-)


> You can have RRDTool disregard values that are out of a specific range, but the problem is that I'd have to set up specific limits on each and to be honest on some of them I don't know what would be reasonable limits. I'm afraid I'd lose useful data if it actually goes over my limits.

This may also be used. If you have a 64-bit counter that counts a users/sec connection rate with a real maximum of 100 users/sec, then the cutoff value to set up for your counter is:
18446744073709551215
that is, MAX(64bit)-4*100.

If you use a 32-bit counter, the value is 4294966895 = MAX(32bit)-4*100.
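The arithmetic behind these cutoff values can be checked with a short sketch. It only reproduces the formula MAX - 4 * (maximum rate) given above; the function name `cutoff` and the `factor` parameter are illustrative, and how the resulting value is applied in RRDTool is not shown here.

```python
def cutoff(bits, max_rate, factor=4):
    """Largest value of an unsigned counter with the given width,
    minus factor * max_rate."""
    counter_max = 2 ** bits - 1
    return counter_max - factor * max_rate


print(cutoff(32, 100))  # 4294966895
print(cutoff(64, 100))  # 18446744073709551215
```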


