Solved

Quick way to compare large array

Posted on 2011-03-15
16
384 Views
Last Modified: 2012-05-11
We have long serial messages received every second (RS232) which is approx 5k length.

We need to monitor this message and set a flag when it is different to the last rx'd message.

What would be the most efficient/quick way to store/compare the data?

Thanks

0
Comment
Question by:oddszone
  • 5
  • 5
  • 3
  • +2
16 Comments
 
LVL 53

Expert Comment

by:Infinity08
ID: 35137294
For a byte by byte comparison, you can use memcmp :

        http://www.cplusplus.com/reference/clibrary/cstring/memcmp/
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35137306
If all you care about is keeping track of the last message, and whether or not it was different from the previous one, you could check the incoming message against the last message while it's received, and store it in the same buffer on the fly.
0
 
LVL 40

Expert Comment

by:evilrix
ID: 35137307
The simplest solution is at worse case O(N) where N is the length of the messages. First, if the lengths differ you know straight away there are different. Otherwise walk each array side by side and the first time you find a difference you can end the compare. So, compare a1[0] with a2[0] ... a1[N] with a2[N] and abort as soon as a1[x] != a2[x].

Other solutions might be to create a hash digest and compare these but I suspect that the hashing time will be comparable to the linear search.
0
Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
LVL 40

Expert Comment

by:evilrix
ID: 35137309
>>If all you care about is keeping track of the last message

...I thought about that and then considered the array is actually the series of bytes being received. I could be wrong.
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35137345
>> ...I thought about that and then considered the array is actually the series of bytes being received. I could be wrong.

I don't know how the message is being received (the whole message at once ? with a certain block size ? a byte at a time ?), so a bit of clarification on that point would indeed be useful.
0
 
LVL 32

Expert Comment

by:phoffric
ID: 35137351
You can try aligning the messages on word aligned boundaries (typically 2 or 4 bytes), and then on each word, loop over an exclusive or operation to yield 0 if the same, else not 0. This may be faster than memcmp.

Or, in your driver, you could compute a checksum as the stream enters the system, and then just compare the checksums (CRC32 if you can afford it). If equal, then you can do the XOR test to be sure.
0
 

Author Comment

by:oddszone
ID: 35137528
Message would be rx'd at 115200 baud under interrupt control so the serial loop might not get all the message at once but would wait until a CR+LF before processing.

I might need to re-transmit the new message on another port so I definitely need to keep a copy and not just CRC's
0
 
LVL 32

Expert Comment

by:phoffric
ID: 35137736
Now that I think about it, I would expect a memcmp to be fast by breaking up the block into aligned and unaligned sections and using any assembly or microcode optimizations that are available in your CPU.

As you said, you keep the old buffer. As you receive each byte in the new message, if your driver can access the old buffer and compare the incoming byte with the corresponding offset into the old buffer, then when the new buffer input stream completes, you will immediately know whether the old and new buffer are the same. Then you do not have to compute the CRC.
0
 
LVL 32

Expert Comment

by:phoffric
ID: 35137739
Sorry, I don't understand the point about why you are re-transmitting and how that relates the old/new buffer check. (Are you concerned about communication bit errors?)
0
 
LVL 32

Expert Comment

by:phoffric
ID: 35137802
Of course, if using this driver approach, then as soon as you detect a byte difference, then the driver can immediately notify the receiving application of the difference without waiting for the remainder of the message to come through.
0
 

Author Comment

by:oddszone
ID: 35137843
Expmle:

I monitor COM1 serial port.

If I detect a different message on this port then I send it on COM2

So I need whole message to resend.

0
 
LVL 32

Expert Comment

by:phoffric
ID: 35137909
Thanks for clarifying. If you have control over the message structure, then the sender could include the CRC32 in the message (preferably in the header at the front of the message), so that you immediately know whether the CRC's are the same in the old and new message. Then you will be able to immediately Tx the incoming bytes from COM1 to COM2 if the CRC's are different.
0
 
LVL 53

Accepted Solution

by:
Infinity08 earned 500 total points
ID: 35137998
>>  Message would be rx'd at 115200 baud under interrupt control so the serial loop might not get all the message at once but would wait until a CR+LF before processing.

So, in the code, you receive the message byte by byte, right ?

If so, simply compare the current received byte with the corresponding byte in the previous message. If it's the same, do nothing. If it's different, set the flag that it's different, and overwrite the byte in the previous buffer. If the flag is set, continue overwriting the previous buffer until the whole message is received, and then forward the message.

Here's a rough idea in (pseudo) code :
unsigned char buffer[MAX_SIZE] = { 0 };
int is_different = 0;

unsigned char* cur = buffer;
unsigned char b = 0;
while (receive_byte(&b)) {
    if (*cur != b) {
        *cur = b;
        is_different = 1;
    }
    ++cur;
}

if (is_different) {
    send_buffer(buffer, cur - buffer);
}

Open in new window

0
 
LVL 53

Expert Comment

by:Infinity08
ID: 35138011
(or any other variation of the same idea)
0
 

Author Closing Comment

by:oddszone
ID: 35138325
Thanks to all
0
 
LVL 2

Expert Comment

by:preraksheth
ID: 35138332
alternatively to the suggestion of CRC from sender, you could create a CRC check at your end. With every running byte in the stream, keep Xoring with the next byte cumulatively. The end result will tell you if there was a change in the message. You could make it more accurate by taking two bytes (16 bits) for the same purpose
0

Featured Post

Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This is a short and sweet, but (hopefully) to the point article. There seems to be some fundamental misunderstanding about the function prototype for the "main" function in C and C++, more specifically what type this function should return. I see so…
Examines three attack vectors, specifically, the different types of malware used in malicious attacks, web application attacks, and finally, network based attacks.  Concludes by examining the means of securing and protecting critical systems and inf…
The goal of this video is to provide viewers with basic examples to understand and use structures in the C programming language.
The goal of this video is to provide viewers with basic examples to understand opening and reading files in the C programming language.

813 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now