JohnSantaFe

asked on

What type of checksum to use

I'm sending packets of various sizes, from 16 bytes to 8 KB, over a short-range (24 inches, fairly reliable) Bluetooth link.  I need to get rid of packets that contain any errors, so I'm planning on a checksum.  To keep processing simple, I'd like to use as simple and small a checksum as possible.

Any suggestions?

Thanks.
HooKooDooKu

The simplest and smallest checksum would be to simply add up the values of all the bytes in an unsigned 8-bit variable and append that byte to the end of the data.

Of course, the smallest checksum would be a single parity bit (even or odd parity).  But it would be more difficult to transmit a single bit.
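A minimal sketch of that additive checksum, assuming C on both ends (the function and variable names are my own):

```c
#include <stddef.h>
#include <stdint.h>

/* Additive checksum: sum every payload byte into an unsigned
 * 8-bit accumulator, letting it wrap modulo 256.  The sender
 * appends the result; the receiver recomputes and compares. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];            /* wraps modulo 256 */
    return sum;
}
```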
JohnSantaFe

ASKER

Also, I was hoping to hear why you would use a given solution.  I like the simple 8-bit approach, but is it adequate to cover 8 KB?  Is there any math that might help choose?
Thanks.
Assuming that errors just randomly scramble the packet (not a good assumption), using 8 bits gives you a 255/256 (99.6%) chance of detecting each corrupted packet. If you have a 10% packet error rate and send 1000 packets, then there's a (255/256)^(1000×0.1) = (255/256)^100 ≈ 67.6% chance you will catch them all. That's not so good. Of course, you'll catch 100% of the single-bit errors.
Another question to ask yourself is whether you want to simply discard corrupted packets, or want to attempt to fix the corruption?
Also, whether false positives are acceptable or not (you can never really avoid them, but you can minimize the probability)?
By false positive, you mean when an error falls in the checksum?
>> By false positive, you mean when an error falls in the checksum?

Yes. That will discard a valid message, and it can also let a corrupt message go through.
You said you wanted something "simple" and "small".  Nothing is going to be simpler or smaller than the simple check-sum, which is exactly what I've described.  But as TommySzalapski points out, it isn't perfect.

Now you could improve your odds by using a 16-bit or 32-bit check-sum.  But the fundamental flaw of the check-sum is that it will fail to catch a pair of offsetting errors.  For example, if bit 7 of the 1st byte in the packet was supposed to be a 1 but came through as a 0, the check-sum will catch that something is wrong... unless there is a second error somewhere else in the packet where bit 7 was supposed to be a 0 but came through as a 1.  So check-sums are simple, but have this basic flaw.
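To make the offsetting-error flaw concrete, here's a tiny demo (the byte values are hypothetical, chosen by me for illustration) where two flipped bits leave the additive checksum unchanged:

```c
/* Two offsetting single-bit errors fool an additive checksum.
 * In this hypothetical 4-byte packet, bit 7 of byte 0 flips
 * 1 -> 0 (subtracting 128 from the sum) and bit 7 of byte 2
 * flips 0 -> 1 (adding 128), so the modulo-256 sum is unchanged. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t good[4] = { 0x81, 0x22, 0x33, 0x44 };
    uint8_t bad[4]  = { 0x01, 0x22, 0xB3, 0x44 };  /* two flipped bits */

    uint8_t s1 = 0, s2 = 0;
    for (int i = 0; i < 4; i++) { s1 += good[i]; s2 += bad[i]; }

    /* Prints the same value twice: the corruption goes undetected. */
    printf("good: 0x%02X  corrupted: 0x%02X\n", s1, s2);
    return 0;
}
```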

Now things like CRC are more complex and are sort of like a hashing function.  The basic idea behind hashing functions is that a small change in the input makes a huge difference in the output.  The result is that a pair of matching errors won't cancel each other out, which makes for a much more reliable form of error detection.

As a metaphorical example, let's say we are generating error detection for the following sentences.  At the end of each sentence is a pair of "check digits": the 1st is the check digit produced by a check-sum, the 2nd is the check digit produced by a CRC (or a hash function).

You want to send:
  The brown fox jumped over the lazy dog.   "G", "G"
  The crown fox jumped over the lazy dog.   "H", "X"
  The brown fox humped over the lazy dog.   "F", "A"
  The brown fox jumped over the lazy god.   "G", "L"

For the check-sum, you can see that a single minor change in the text produces a minor change in the check digit (one letter off in the alphabet).  The result is that a pair of offsetting minor changes produces the same check digit, and you miss the error.

For the CRC (hash) function, you can see that a minor change in the text produces a major change in the check digit.  The result is that a pair of minor changes results in two major changes in the math, making it much less likely that a pair of errors would "offset" each other.


So you have a basic engineering decision to make.  Do you implement a quick and simple check-sum that is likely to miss SOME errors?  Or do you take the time and resources needed to implement a CRC and get a much higher rate of error detection?  The application, and the consequences of missing an occasional error, will dictate whether you can live with a simple check-sum or need the reliability of the CRC.
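For a sense of scale, a bit-by-bit CRC is only a few lines longer than the additive sum.  Here's a minimal sketch; the CRC-8 polynomial 0x07 is my choice for illustration, not something specified in this thread:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07).
 * Unlike the additive sum, every input bit gets mixed through
 * the whole register, so offsetting errors no longer cancel. */
uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0x00;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;
}
```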

The probability of not detecting an error with a checksum is higher than what was stated before.
The math isn't so simple, but you can read it here:

Error Masking Probability of 1's Complement Checksums: citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.79.1950&rep=rep1&type=pdf
The document has some math fonts missing, but you can read it in Google preview by searching for its name.

If the BER could be high, I would use a 32-bit CRC; it's just a table and a couple of additional operations compared to a simple sum, but it's far safer than an 8-bit checksum.
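A sketch of the table-driven approach pmasotta describes, assuming the common reflected CRC-32 polynomial 0xEDB88320 (my assumption; the thread doesn't name one).  Per byte, the work is roughly one table lookup, a shift, and two XORs on top of what a plain sum does:

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32_table[256];

/* Build the lookup table once: one entry per possible byte value. */
void crc32_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int b = 0; b < 8; b++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc32_table[i] = c;
    }
}

/* Per byte: one table lookup, one shift, two XORs. */
uint32_t crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = (crc >> 8) ^ crc32_table[(crc ^ data[i]) & 0xFF];
    return crc ^ 0xFFFFFFFFu;
}
```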
SOLUTION from pmasotta (member-only content)
Given the assumptions I made, the math is fine for an 8-bit checksum. Of course, random scrambling isn't as likely as a few altered bits, but you need to know the distribution of errors in your channel to really predict false positive and false negative rates.

False positive is when an error ruins the checksum, but the payload is unaltered.
False negative is when the error is missed.
>> Given the assumptions I made, the math is fine for an 8-bit checksum

It seems to me you forgot to consider the errors that end up producing the same checksum...

>> but you need to know the distribution of errors in your channel to really predict false positive and false negative rates.

Not really; the error distribution reduces to a flat BER. The worst BER gives the worst-case scenario, and the probabilities are ultimately a function of the BER. See the paper.

SOLUTION (member-only content)
Most environments produce exponential/Poisson error distributions, but you should verify that before making assumptions. It's the most common error we see in the math and statistics world, and the thing I most frequently cite when reviewing papers.
1) You are right, your model included the assumption of a message that gets uniformly randomly scrambled (and you also said that isn't a realistic assumption).
2) I have really never seen an Internet/network device specify more than a BER (when they specify anything at all...)
OK, so if I have an 8 KB packet, an acceptable BER of 10^-6 (not too hard), and an assumption of random errors, is a simple 8-bit checksum adequate?
Anybody have an equation where I can vary the packet size and BER, and get out the number of bits required in the checksum?
Thanks.
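This isn't from the member-only answers, just the usual back-of-envelope under the idealized random-error model discussed above: a packet of n bits is corrupted with probability 1 − (1 − BER)^n, and an ideal k-bit check misses roughly a 2^−k fraction of corrupted packets:

```c
#include <math.h>
#include <stdio.h>

/* P(packet of n bits has >= 1 bit error) = 1 - (1 - BER)^n.
 * An ideal k-bit check lets ~2^-k of corrupted packets slip
 * through, so undetected rate ~= (1 - (1 - BER)^n) / 2^k. */
static double undetected_rate(double ber, double n_bits, int k)
{
    double p_corrupt = 1.0 - pow(1.0 - ber, n_bits);
    return p_corrupt / pow(2.0, k);
}

int main(void)
{
    double n = 8.0 * 1024 * 8;          /* 8 KB packet, in bits */
    for (int k = 8; k <= 32; k += 8)
        printf("%2d-bit check: ~%g undetected\n",
               k, undetected_rate(1e-6, n, k));
    return 0;
}
```

For the figures above (8 KB, BER 10^-6) this gives roughly 2.5×10^-4 undetected with 8 check bits, about 10^-6 with 16, and about 1.5×10^-11 with 32.  A real (non-ideal) checksum does somewhat worse than 2^-k, which is the point of the paper pmasotta linked.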
ASKER CERTIFIED SOLUTION (member-only content)
SOLUTION (member-only content)
The paper cited in the discussion was good, though it didn't entirely answer the question.
Since CRC is pretty easy to implement, I'll probably just do that.
It was interesting to learn that Bluetooth L2CAP has error correction, but that isn't available in all modes.
Thanks all for the discussion.