Solved

Compressing a simple Serial message

Posted on 2011-03-07
Last Modified: 2012-05-11
I have an application that needs to send a serial message that represents a colour grid of, say, 50*50 cells.

Each cell in the grid can be 0 (black), 1 (green), etc., up to 7 (red), but most cells are black.

Is there a simple and quick way to compress the string? I thought of possibly detecting a black cell, then sending the number of consecutive black cells after that.

Sample ASCII data:

"3740000000000000000000000000000000000000000000167674001100000000000000000000000000000000000000011275113000000000000000000000000000000000000000120230310000000000000000000000000000000000000000122132210000000000000000000000000000000000000000123321100000000000000000000000000000000000000000003131100000000000000000000000000000000000000000044421000000000000000000000000000000000000000000022411000000000000000000000000000000000000000000224110000000000000000000000000000000000000000001114420000000000000000000000000000000000000000001303300000000000000000000000000000000000000000000313300000000000000000000000000000000000000000002243100000000000000000000000000000000000000000001245100000000000000000000000000000000000000000001363000000000000000000000000000000000000000000001152000000000000000000000000000000000000000000012310000000000000000000000000000000000000000000001200000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"

So
1st cell is colour 3,
2nd cell is colour 7
3rd cell is colour 4
The next 43 cells are black.

I have to do this as the serial comms link is at its limits and the messages are too long to send. Increasing the baud rate is not an option.
Question by:oddszone
12 Comments
 
LVL 25

Expert Comment

by:Luis Pérez
ID: 35057277
Try this, I think it can help you:
http://madskristensen.net/post/Compress-and-decompress-strings-in-C.aspx

Hope that helps.
 

Author Comment

by:oddszone
ID: 35057566
That doesn't seem to decompress correctly, as some of the message is chopped. It might be me though; I'll check.

BUT, I also need to write the same routine in native "C" code for the embedded side of the project, so a library is not the ideal solution, as I don't know the compression algorithm in your example.
 
LVL 25

Expert Comment

by:Luis Pérez
ID: 35057713
I sent the link to you because you posted your question in the C# zone and the code in the link is C#.

The algorithm basically uses a GZipStream object (http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx), a native .NET Framework class that internally uses the DEFLATE algorithm (in the gzip format) to compress streams (in this case, streams containing text strings).
LVL 25

Expert Comment

by:Luis Pérez
ID: 35057790
I've tested the code with your data (in "Sample ASCII Data") with the following statistics:

Original data string (the one in your example): 1927 characters.
Compressed data string: 360 characters (approx. 81% compression).

Once decompressed, the string data is exactly the same as the original.

Hope that helps.
 

Author Comment

by:oddszone
ID: 35057839
Yes I should have mentioned the C thing sorry.

I don't think the code in the link is working 100% either.

+++++++++++++

UPDATE
======

I need a solution that could easily be converted to 'C' for the embedded side

Thanks
 

Author Comment

by:oddszone
ID: 35057901
Just checked, and the C# code DOES work, but my serial link couldn't cope with the data so I thought it was wrong :(
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 35058192
If you only have 8 colors, 0-7, then you really only need 4 bits to represent those potential choices; sending ASCII characters uses 4 more bits per cell than are really needed.  For example, to send the four colors 3, 7, 4, and 0 you can bit-shift 3 left by 4 bits and OR it with 7 (giving 0x37), then do the same thing for 4 and 0 (giving 0x40).  Send the 2-byte string "7@" instead of the 4-byte string "3740".  This will cut your data size in half.
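
A minimal C sketch of the shift-and-OR packing described above, assuming the message is a string of ASCII digits '0' to '7' with an even number of cells. The function names pack_pairs and unpack_pairs are made up for this illustration and do not come from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack ASCII colour digits two per byte: the first cell of each pair goes in
 * the high nibble, the second in the low nibble.
 * Assumption: n is even. Returns the number of bytes written (n / 2). */
size_t pack_pairs(const char *cells, size_t n, uint8_t *out)
{
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        out[i / 2] = (uint8_t)(((cells[i] - '0') << 4) | (cells[i + 1] - '0'));
    }
    return n / 2;
}

/* Expand nbytes packed bytes back into 2 * nbytes ASCII digits.
 * out must have room for 2 * nbytes + 1 chars (NUL terminated). */
void unpack_pairs(const uint8_t *in, size_t nbytes, char *out)
{
    size_t i;
    for (i = 0; i < nbytes; i++) {
        out[2 * i]     = (char)('0' + (in[i] >> 4));
        out[2 * i + 1] = (char)('0' + (in[i] & 0x0F));
    }
    out[2 * nbytes] = '\0';
}
```

For the cells "3740" this produces the two bytes 0x37 and 0x40. Note that two black cells pack to the byte 0x00, so the packed data can no longer be treated as a NUL-terminated string on either side of the link.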
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 35058276
...if you ever need to send an odd number of colors, you'd need to implement some kind of header in the message to indicate how many colors follow (in the case of an odd number of cells, it tells the receiver to use only the most significant 4 bits of the last byte and to discard the least significant 4).
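
One possible shape for such a header, sketched in C. The frame layout here (a 2-byte big-endian cell count followed by the nibble-packed cells) is purely an assumption for illustration, as are the names build_frame and parse_frame; the thread does not specify a format.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a frame: 2-byte big-endian cell count (hypothetical layout), then the
 * cells packed two per byte, high nibble first. With an odd count the low
 * nibble of the last byte is padding. Returns the frame length in bytes.
 * frame must hold 2 + (n + 1) / 2 bytes; n must fit in 16 bits. */
size_t build_frame(const char *cells, size_t n, uint8_t *frame)
{
    size_t i, o = 2;
    frame[0] = (uint8_t)((n >> 8) & 0xFF);
    frame[1] = (uint8_t)(n & 0xFF);
    for (i = 0; i < n; i += 2) {
        uint8_t hi = (uint8_t)(cells[i] - '0');
        uint8_t lo = (i + 1 < n) ? (uint8_t)(cells[i + 1] - '0') : 0;
        frame[o++] = (uint8_t)((hi << 4) | lo);
    }
    return o;
}

/* Decode a frame made by build_frame back into ASCII digits. The count in
 * the header tells us to ignore the padding nibble when it is odd.
 * Returns the cell count; out must hold that many chars plus a NUL. */
size_t parse_frame(const uint8_t *frame, char *out)
{
    size_t n = ((size_t)frame[0] << 8) | frame[1];
    size_t i;
    for (i = 0; i < n; i++) {
        uint8_t b = frame[2 + i / 2];
        out[i] = (char)('0' + ((i % 2 == 0) ? (b >> 4) : (b & 0x0F)));
    }
    out[n] = '\0';
    return n;
}
```

A 16-bit count is enough for the 50*50 grid (2500 cells). A real protocol would probably also want a start marker or checksum, which this sketch leaves out.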
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 35062205
Actually, you only need 3 bits.  You could take the three least significant bits from 7 color values and re-combine them to make three 7-bit bytes.  This would reduce the data size to less than half (3 bytes for every 7 cells), and keep each resulting value under 128 so it can still be represented by ASCII.
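
The 3-bit idea can be sketched in C as follows, assuming ASCII digits '0' to '7' as input and zero padding for a final group of fewer than 7 cells. The names pack_3bit and unpack_3bit are hypothetical, and the bit order (first cell in the most significant bits) is a choice made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack groups of 7 cells (3 bits each, 21 bits total) into 3 bytes that each
 * carry 7 bits, so every output byte is below 128. A short final group is
 * padded with zeros. Returns the number of bytes written. */
size_t pack_3bit(const char *cells, size_t n, uint8_t *out)
{
    size_t i, o = 0;
    for (i = 0; i < n; i += 7) {
        uint32_t bits = 0;
        size_t k;
        for (k = 0; k < 7; k++) {
            uint32_t c = (i + k < n) ? (uint32_t)(cells[i + k] - '0') : 0u;
            bits = (bits << 3) | (c & 7u);      /* first cell ends up in bits 20..18 */
        }
        out[o++] = (uint8_t)((bits >> 14) & 0x7F);
        out[o++] = (uint8_t)((bits >> 7) & 0x7F);
        out[o++] = (uint8_t)(bits & 0x7F);
    }
    return o;
}

/* Recover n cells as ASCII digits; out must hold n + 1 chars. */
void unpack_3bit(const uint8_t *in, size_t n, char *out)
{
    size_t i;
    for (i = 0; i < n; i += 7) {
        const uint8_t *g = in + (i / 7) * 3;
        uint32_t bits = ((uint32_t)g[0] << 14) | ((uint32_t)g[1] << 7) | g[2];
        size_t k;
        for (k = 0; k < 7 && i + k < n; k++) {
            out[i + k] = (char)('0' + ((bits >> (18 - 3 * k)) & 7u));
        }
    }
    out[n] = '\0';
}
```

The receiver still needs to know the cell count n, and although every byte stays under 128, values such as 0x00 and other control characters can occur, so the output is 7-bit clean but not printable text.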
 

Author Comment

by:oddszone
ID: 35066933
I've worked out if I replace

"00000000000000000000" with "9"
and
"000000000" with "8"

I get

37498000167674001190000000001127511390000000001202303198122132219812332119800031311980044421980002241198002241198001114429800130339800003133980002243198000124519800013639800001152980001231980000012980000001999999999999999999999999999999999880000000

Which is 249 characters.
 

Author Comment

by:oddszone
ID: 35066967
Or perhaps
replace consecutive "0"s with "8xx8" where xx is 0 count.

This gives

374843816767400118398112751138398120230318408122132218408123321184383131184284442184382241184282241184281114428428130338448313384382243184381245184381363844811528438123184581284681810178

which is only 187 chars.

Would this pack/unpack code be easy to write in C and C#?
 
LVL 33

Accepted Solution

by:
Todd Gerbert earned 500 total points
ID: 35071499
That's a good idea you've got. It's not terribly difficult in C (I would imagine, anyway; I have only a little C experience, but I think even I could muddle through that), and almost trivial in C# thanks to Regular Expressions.

The only problem with that is if there are 387 consecutive 0's: your compressed string is then "1234838784321". Now you have a 3 surrounded by 8's and a 7 surrounded by 8's, and no way to know that's supposed to be a 387 surrounded by 8's.  However, "1234A387A4321" is distinguishable. And to go a step further, you could use A's to represent runs of 0's, B's for 1's, C's for 2's, etc.

This little C# snippet will find any repeating character between 0 and 7 that is at least 4 characters long and replace it with A<number of repetitions>A through H<number of repetitions>H.

using System;
using System.Text.RegularExpressions;

class Program
{
	static void Main(string[] args)
	{
		string testMessage = @"3740000000000000000111110000000077777000000000167674001100000000";
		string compressedMessage = CompressMessage(testMessage);
		string decompressed = DecompressMessage(compressedMessage);

		Console.WriteLine("Original:\t{0}", testMessage);
		Console.WriteLine("Compressed:\t{0}", compressedMessage);
		Console.WriteLine("Decompressed:\t{0}", decompressed);

		Console.ReadKey();
	}

	// Replaces each run of 4 or more identical digits (0-7) with
	// <letter><run length><letter>, where A = 0, B = 1, ... H = 7.
	static string CompressMessage(string message)
	{
		// '0' + 17 == 'A', so adding 17 maps the digit character to its letter.
		return Regex.Replace(message, @"([0-7])\1{3,}", (Match m) => String.Format("{0}{1}{0}", (char)(m.Groups[1].Value[0] + 17), m.Value.Length));
	}

	// Expands each <letter><count><letter> token back into <count> copies of the digit.
	static string DecompressMessage(string message)
	{
		// Subtracting 17 maps the letter back to its digit character.
		return Regex.Replace(message, @"([A-H])(\d+?)\1", (Match m) => new String((char)(m.Groups[1].Value[0] - 17), Int32.Parse(m.Groups[2].Value)));
	}
}
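
For the embedded side, the same letter-delimited scheme could look roughly like this in C. This is a sketch rather than a tested port: the names compress_message and decompress_message, the -1 error convention, and the caller-supplied buffer size cap are assumptions made for this example. snprintf is used for brevity; on a small target a hand-written integer-to-decimal loop may be preferable.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MIN_RUN 4  /* shortest run worth replacing, matching the C# regex */

/* Runs of MIN_RUN or more identical digits '0'..'7' become
 * <letter><count><letter> with A = 0 ... H = 7; everything else is copied.
 * Returns the compressed length, or -1 if out (cap bytes) is too small. */
int compress_message(const char *in, char *out, size_t cap)
{
    size_t o = 0;
    if (cap == 0) return -1;
    while (*in) {
        size_t run = 1;
        while (in[run] == in[0]) run++;          /* stops at the NUL terminator */
        if (run >= MIN_RUN && in[0] >= '0' && in[0] <= '7') {
            char letter = (char)('A' + (in[0] - '0'));
            int w = snprintf(out + o, cap - o, "%c%lu%c",
                             letter, (unsigned long)run, letter);
            if (w < 0 || (size_t)w >= cap - o) return -1;
            o += (size_t)w;
        } else {
            size_t k;
            if (run >= cap - o) return -1;
            for (k = 0; k < run; k++) out[o++] = in[0];
        }
        in += run;
    }
    out[o] = '\0';
    return (int)o;
}

/* Inverse of compress_message. Returns the decompressed length, or -1 on a
 * malformed token or if out (cap bytes) is too small. */
int decompress_message(const char *in, char *out, size_t cap)
{
    size_t o = 0;
    if (cap == 0) return -1;
    while (*in) {
        size_t count = 1;        /* a plain character stands for itself once */
        char value = *in;
        if (*in >= 'A' && *in <= 'H') {
            char letter = *in++;
            const char *digits = in;
            count = 0;
            while (*in >= '0' && *in <= '9')
                count = count * 10 + (size_t)(*in++ - '0');
            if (in == digits || *in != letter) return -1;  /* need count and closing letter */
            value = (char)('0' + (letter - 'A'));
        }
        in++;                    /* skip the plain character or the closing letter */
        if (count >= cap - o) return -1;
        memset(out + o, value, count);
        o += count;
    }
    out[o] = '\0';
    return (int)o;
}
```

With this sketch, "374" followed by 43 zeros and "167674" compresses to "374A43A167674", and short runs such as "00" are left untouched.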
