Solved

What is the most effective way to apply CRC32?

Posted on 2004-08-08
5
592 Views
Last Modified: 2008-01-09
I am implementing a protocol for use in direct modem-to-modem connections.

This is the packet format:

0x7e (start of packet)
  <send sequence number, acknowledged sequence, and data area 0 to 512 bytes>
  <CRC32>
0x7e (end of packet)

Where the meaning of "send sequence number" and "acknowledged sequence number" are exactly as in TCP.
The data is escaped using HDLC conventions (0x7d 0x5d, 0x7d 0x5e) but escaping is done *after* CRC calculation because the CRC itself is subject to escaping.

My question is about the most effective way to apply the CRC32. I have two possible approaches:

1. CRC the header and the data area, EXCLUDING the CRC area, then append the CRC32 to the packet. On the receiving end, do the same (checksum all but the CRC area) and do a direct comparison to the appended CRC.

2. CRC the header, the data area, *AND* the CRC (initialized to zero), ie. checksum the WHOLE packet, including the four zeros where the CRC is located, then overwrite the 4 zeros with the calculated CRC. On the receiving end, checksum the whole packet, *including* the appended CRC and look for a "magic CRC" value that will always be the result if the packet has not been corrupted.

My first thought was approach #1 was simpler and was just as effective as approach #2 might be. Then I got thinking about how CRC detects all possible one bit errors, and all possible two bit errors, etc. If I apply approach #2, I get the full effectiveness of CRC on the CRC itself. If I apply approach #1, the CRC itself is not "protected" by the CRC.

Is approach #2 stronger than approach #1? Any comments about the strongest approach, or links to online references would be great. I have already found many web pages on CRC, but none that address the checksumming of the CRC itself, and the increased effectiveness of doing so.
0
Comment
Question by:adg080898
  • 2
  • 2
5 Comments
 
LVL 11

Accepted Solution

by:
lbertacco earned 500 total points
Comment Utility
Approach 2 doesn't "protect" the crc area anymore than approach 1. In fact, with approach 1 you crc header+data; with approach 2: header+data+4zerobytes; then transmit header+data+crc. So approach2 is the same thing as doing some (payload indipendent) invertible transformation to the crc of approach 1 before transmitting it, which doesn't help error checking.
0
 
LVL 4

Expert Comment

by:alikoank
Comment Utility
I do not think possible approach #2 is possible at all. When you create a checksum with four zeros in place of CRC sum, and then overwrite the four zeros with checksum you modify the original data thus invalidate your checksum.

Additionally you do not have to check your checksum against corruptions. Think the following scenarios:

. If your data is corrupted, its checksum on the receiveing end will not match original checksum added to the end of your data.

. If your data transferred inact but your checksum is corrupted then again the checksum of your data will not match the checksum value transferred over network.

so approach #2 is not possible and not needed. Approach #1 already provides enough checks against data & checksum corruption.
0
 
LVL 11

Expert Comment

by:lbertacco
Comment Utility
Approach 2 is somewhat possible but not exactly as as adq presented it. I mean, it is always possible to find a 32bit integer x such that
crc32(data -concatenated with- x) = some given constant K.
Thus it is possible to have the receiving end just crc the whole packet and check that the result is equal to K (rather than crc-ing the packet minus crc field and comparing the result with the crc field). However it's not enough to calculate crc32(data -concatenated with- 32bit 0) and use that as x.
0
 
LVL 8

Author Comment

by:adg080898
Comment Utility
To alikoank:
I have come across, in at least one CRC reference that I remember, that some implementations of CRC will checksum right through the CRC, and will then expect a zero CRC to be correct. It mentioned that this is a "bonus" for CRC because it makes it much easier to do CRC in hardware because it doesn't have to "skip" the CRC, and it is very easy to test for zero in hardware (no "compare" just a wide NAND gate).

lbertacco:
You highlighted something that could potentially *weaken* approach 2. Since it is checksumming an additional 4 bytes, it might VERY slightly reduce the effectiveness of the CRC. Or maybe not...would it just jump around the CRC lookup table a little bit and have no real effect - performing a transform that is essentially a no-op as far as error detection is concerned?

Everybody:
Should I just forget about all this "one bit error" "two bit error" stuff and just accept that there is a 1 in 2**32 chance that the packet is corrupted AND the checksum matches?

I am worried slightly by something else. If line noise corrupts a data byte into a 0x7e then it will confuse the code that finds packet boundaries. It will then take the four bytes before the corrupted byte and treat THEM as the CRC32. Then, when it encounters the "real" 0x7e, it will again do a bogus checksum test of the remainder of the "real" packet. This is scary because the data being checked (in both cases) is drastically different from the original packet - maximizing the chance of a false positive. In this case, the best checksum on the planet boils down to its width: how likely is a false match.

I do plan to detect bad escapes (a 0x7d that is followed by something other than a 0x5d or 0x5e) and drop the packet if any are encountered, so this might buy me a little bit of extra protection.

I was considering storing the CRC32 followed by the inverted CRC32 to make a false positive incredibly unlikely. It does scream "overkill" though! :)

Am I being overly cautious? Practically every modem currently in use in PCs has error detection algorithms (V.42 / MNP / etc). I believe modems use a CRC16 on their "frames" and their frames are very small. Is it safe to say that it is virtually impossible for an error to get past V.42 AND a 32-bit application-level checksum? Somewhere between 1 in 2**32 and 1 in 2**48 likelyhood?

If anybody has some empirical knowledge about modem reliability to share (the frequency of corruption getting past modem error detection), that would be terrific.
0
 
LVL 8

Author Comment

by:adg080898
Comment Utility
Correction: Actually, the final checksum in that hardware-CRC case may not necessarily be zero, but it is at least constant.
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Suggested Solutions

Purpose To explain how to place a textual stamp on a PDF document.  This is commonly referred to as an annotation, or possibly a watermark, but a watermark is generally different in that it is somewhat translucent.  Watermark’s may be text or graph…
Whether you've completed a degree in computer sciences or you're a self-taught programmer, writing your first lines of code in the real world is always a challenge. Here are some of the most common pitfalls for new programmers.
In this fifth video of the Xpdf series, we discuss and demonstrate the PDFdetach utility, which is able to list and, more importantly, extract attachments that are embedded in PDF files. It does this via a command line interface, making it suitable …
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now