Algorithm for dealing with unreliable event timestamps
Posted on 2013-02-05
I need to develop an algorithm for dealing with a timestamped event stream where the timestamps are created by the event source. The order of the received events is guaranteed to be correct (due to a file append operation), but due to potential problems with the clock at the event source, the timestamps may not be accurate and may not reflect the real order of the events.
Here's an example of an ordered event stream where one timestamp is in the wrong order.
0008 (this timestamp is wrong)
I need to process these events in the order in which they arrive, but somehow deal with the incorrect timestamps. (A timestamp may be incorrect because the clock used to create it was not set correctly, if only temporarily.)
One approach is to reorder the events so that the timestamps define their ordering. However, this is wrong because I know that the order in which the events are received is the correct order.
Another approach is to ignore the incorrect timestamp and use an effective timestamp that is an infinitesimally small interval after the preceding timestamp, so that the effective timestamps are in the same order as the received event stream. However, this is problematic if the erroneous timestamp is a long time in the future, because every following event will be given an effective timestamp immediately after the erroneous one until the real timestamps catch up.
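To make the effective-timestamp idea and its drawback concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: timestamps are plain integers, the "infinitesimally small interval" is approximated by a hypothetical `epsilon` of 1, equal timestamps are left alone, and the function name and sample values are made up for the example.

```python
from typing import Iterable, List, Tuple


def effective_timestamps(
    timestamps: Iterable[int], epsilon: int = 1
) -> List[Tuple[int, int]]:
    """Return (original, effective) pairs in arrival order.

    Any timestamp that goes backwards relative to the previous
    effective timestamp is replaced by previous + epsilon.
    epsilon stands in for the "infinitesimally small interval"
    (an assumption of this sketch).
    """
    result: List[Tuple[int, int]] = []
    previous = None
    for ts in timestamps:
        if previous is not None and ts < previous:
            effective = previous + epsilon  # clamp just after the predecessor
        else:
            effective = ts  # timestamp is consistent with arrival order
        result.append((ts, effective))
        previous = effective
    return result


# Made-up sample data: one timestamp jumps backwards.
backwards = [10, 11, 8, 12, 13]
print([eff for _, eff in effective_timestamps(backwards)])
# prints [10, 11, 12, 12, 13]

# Made-up sample data: one timestamp (500) is far in the future.
future = [10, 500, 11, 12, 501]
print([eff for _, eff in effective_timestamps(future)])
# prints [10, 500, 501, 502, 503]
```

The second run shows the problem: once the bad value 500 is accepted, the correct timestamps 11 and 12 are both pushed to just after it, and this continues until the source clock passes the bad value.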
What would you advise?