Compressing a simple Serial message

I have an application that needs to send a serial message representing a colour grid of, say, 50*50 cells.

Each cell in the grid can be 0 (black), 1 (green), and so on up to 7 (red), but most cells are black.

Is there a simple and quick way to compress the string? I thought of detecting a black cell and then sending the number of consecutive black cells that follow it.

Sample ASCII data:

"3740000000000000000000000000000000000000000000167674001100000000000000000000000000000000000000011275113000000000000000000000000000000000000000120230310000000000000000000000000000000000000000122132210000000000000000000000000000000000000000123321100000000000000000000000000000000000000000003131100000000000000000000000000000000000000000044421000000000000000000000000000000000000000000022411000000000000000000000000000000000000000000224110000000000000000000000000000000000000000001114420000000000000000000000000000000000000000001303300000000000000000000000000000000000000000000313300000000000000000000000000000000000000000002243100000000000000000000000000000000000000000001245100000000000000000000000000000000000000000001363000000000000000000000000000000000000000000001152000000000000000000000000000000000000000000012310000000000000000000000000000000000000000000001200000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"

So
1st cell is colour 3,
2nd cell is colour 7
3rd cell is colour 4
The next 43 cells are black.

I have to do this because the serial comms is at its limit and the messages are too long to send. Increasing the baud rate is not an option.
oddszone Asked:
 
Todd Gerbert (IT Consultant) Commented:
That's a good idea you've got. It shouldn't be terribly difficult in C (I would imagine; I have only a little C experience, but I think even I could muddle through it), and it's almost trivial in C# thanks to regular expressions.

The only problem is that a count delimited by a digit can be ambiguous. If there are 387 consecutive 0's and the run is marked with 8's, then your compressed string is "1234838784321": now you have a 3 surrounded by 8's and a 7 surrounded by 8's, and no way to know that it's supposed to be a 387 surrounded by 8's. However, "1234A387A4321" is distinguishable. To go a step further, you could use A's to represent runs of 0's, B's for 1's, C's for 2's, etc.

This little C# snippet finds any run of at least 4 identical characters between 0 and 7 and replaces it with A<number of repetitions>A through H<number of repetitions>H.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

class Program
{
	static void Main(string[] args)
	{
		string testMessage = @"3740000000000000000111110000000077777000000000167674001100000000";
		string compressedMessage = CompressMessage(testMessage);
		string decompressed = DecompressMessage(compressedMessage);

		Console.WriteLine("Original:\t{0}", testMessage);
		Console.WriteLine("Compressed:\t{0}", compressedMessage);
		Console.WriteLine("Decompressed:\t{0}", decompressed);

		Console.ReadKey();
	}

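	// Replace each run of 4 or more identical digits 0-7 with <letter><count><letter>,
	// where the letter is the digit shifted by 17 ('0' -> 'A' ... '7' -> 'H').
	static string CompressMessage(string message)
	{
		return Regex.Replace(message, @"([0-7])\1{3,}", (Match m) => String.Format("{0}{1}{0}", (char)(m.Groups[1].Value[0] + 17), m.Value.Length));
	}

	// Expand each <letter><count><letter> token back into <count> copies of the digit.
	static string DecompressMessage(string message)
	{
		return Regex.Replace(message, @"([A-H])(\d+?)\1", (Match m) => new String((char)(m.Groups[1].Value[0] - 17), Int32.Parse(m.Groups[2].Value)));
	}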
}
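
Since the embedded side needs native C, here is a rough sketch of the same letter-delimited scheme in C. It is only an illustration under assumptions: the function names rle_compress and rle_decompress, the MIN_RUN constant and the caller-supplied output buffers are made up for this example and are not part of the C# snippet above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MIN_RUN 4  /* shorter runs are copied unchanged, as in the C# regex */

/* Replace each run of MIN_RUN or more identical digits '0'..'7' with
   <letter><count><letter>, where letter = digit + 17 ('0' -> 'A' ... '7' -> 'H').
   Returns the compressed length, or -1 if out is too small. */
int rle_compress(const char *in, char *out, size_t out_size)
{
    size_t o = 0;
    if (out_size == 0) return -1;
    while (*in) {
        size_t run = 1;
        while (in[run] == in[0]) run++;          /* length of the current run */
        if (run >= MIN_RUN && in[0] >= '0' && in[0] <= '7') {
            char letter = (char)(in[0] + 17);
            int n = snprintf(out + o, out_size - o, "%c%u%c",
                             letter, (unsigned)run, letter);
            if (n < 0 || (size_t)n >= out_size - o) return -1;
            o += (size_t)n;
        } else {
            if (run >= out_size - o) return -1;
            memcpy(out + o, in, run);            /* short run: copy as-is */
            o += run;
        }
        in += run;
    }
    out[o] = '\0';
    return (int)o;
}

/* Expand <letter><count><letter> tokens back into runs of digits.
   Returns the decompressed length, or -1 on malformed input or overflow. */
int rle_decompress(const char *in, char *out, size_t out_size)
{
    size_t o = 0;
    if (out_size == 0) return -1;
    while (*in) {
        if (*in >= 'A' && *in <= 'H') {
            char letter = *in++;
            char *end;
            unsigned long run;
            if (*in < '0' || *in > '9') return -1;
            run = strtoul(in, &end, 10);
            if (*end != letter) return -1;       /* closing letter must match */
            if (run >= out_size - o) return -1;
            memset(out + o, letter - 17, run);
            o += run;
            in = end + 1;
        } else {
            if (o + 1 >= out_size) return -1;
            out[o++] = *in++;
        }
    }
    out[o] = '\0';
    return (int)o;
}
```

With these definitions, compressing "374" followed by six 0's and then "12" should give "374A6A12", and decompressing that should return the original 11 characters.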


 
Luis Pérez (Software Architect in .Net) Commented:
Try this, I think it can help you:
http://madskristensen.net/post/Compress-and-decompress-strings-in-C.aspx

Hope that helps.
 
oddszone (Author) Commented:
That doesn't seem to decompress correctly, as some of the message is chopped. It might be me, though; I'll check.

BUT I also need to write the same routine in native C for the embedded side of the project, so a library is not the ideal solution, as I don't know the compression algorithm used in your example.
 
Luis Pérez (Software Architect in .Net) Commented:
I sent the link to you because you posted your question in the C# zone and the code in the link is C#.

The algorithm basically uses a GZipStream object (http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx), a native .NET Framework class that internally uses the Deflate algorithm (in gzip format) to compress streams (in this case streams containing text strings).
 
Luis Pérez (Software Architect in .Net) Commented:
I've tested the code with your data (in "Sample ASCII data"), with the following results:

Original data string (the one in your example): 1927 characters.
Compressed data string: 360 characters (approx. 81% reduction).

Once decompressed, the string is exactly the same as the original.

Hope that helps.
 
oddszone (Author) Commented:
Yes, I should have mentioned the C thing, sorry.

I don't think the code in the link is working 100% either.

+++++++++++++

UPDATE
======

I need a solution that could easily be converted to 'C' for the embedded side

Thanks
 
oddszone (Author) Commented:
Just checked, and the C# code DOES work, but my serial link couldn't cope with the data, so I thought it was wrong :(
 
Todd Gerbert (IT Consultant) Commented:
If you only have 8 colors, 0-7, then you really only need 4 bits to represent those choices; using ASCII characters sends 4 more bits per cell than are really needed. For example, to send the four colors 3, 7, 4, and 0 you can bit-shift 3 left by 4 bits and OR it with 7 (giving 0x37), then do the same thing for 4 and 0 (giving 0x40). Send the 2-byte string "7@" instead of the 4-byte string "3740". That will cut your data size in half.
 
Todd Gerbert (IT Consultant) Commented:
...if you ever need to send an odd number of colors, you'd need to add some kind of header to the message to indicate how many colors follow (with an odd number of cells, the receiver has to know that only the first 4 bits of the last byte should be used and that the least significant 4 should be discarded).
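
As a rough C sketch of the 4-bit packing described in the last two comments: two cells go into each byte, and a 2-byte cell count is sent first so the receiver knows whether the last low nibble is padding. The names pack_cells and unpack_cells and the big-endian count header are assumptions made for this example only; the output buffer is assumed to hold at least 2 + (n_cells + 1) / 2 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack ASCII digits '0'..'7' two per byte (first cell in the high nibble),
   preceded by a 2-byte big-endian cell count. Returns bytes written. */
size_t pack_cells(const char *cells, uint16_t n_cells, uint8_t *out)
{
    size_t o = 0;
    out[o++] = (uint8_t)(n_cells >> 8);
    out[o++] = (uint8_t)(n_cells & 0xFF);
    for (size_t i = 0; i < n_cells; i += 2) {
        uint8_t hi = (uint8_t)(cells[i] - '0');
        /* an odd trailing cell leaves the low nibble as zero padding */
        uint8_t lo = (i + 1 < n_cells) ? (uint8_t)(cells[i + 1] - '0') : 0;
        out[o++] = (uint8_t)((hi << 4) | lo);
    }
    return o;
}

/* Reverse of pack_cells: writes n_cells digits plus a terminating NUL
   into cells and returns the cell count read from the header. */
uint16_t unpack_cells(const uint8_t *in, char *cells)
{
    uint16_t n_cells = (uint16_t)((in[0] << 8) | in[1]);
    for (size_t i = 0; i < n_cells; i++) {
        uint8_t b = in[2 + i / 2];
        cells[i] = (char)('0' + ((i % 2 == 0) ? (b >> 4) : (b & 0x0F)));
    }
    cells[n_cells] = '\0';
    return n_cells;
}
```

For "3740" this should produce the header bytes 0x00 0x04 followed by 0x37 0x40. Note that the packed bytes are binary rather than printable text, so the serial code on both ends would have to be written to handle arbitrary byte values.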
 
Todd Gerbert (IT Consultant) Commented:
Actually, you only need 3 bits. You could take the three least significant bits from 7 color values (21 bits in total) and re-combine them to make 3 7-bit bytes. This would reduce the data to less than half its original size (3 bytes for every 7 cells) and keep each resulting value under 128, so it can still be represented by ASCII.
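
A minimal C sketch of that 3-bit idea, using a bit accumulator that emits a byte every time 7 bits are available. The names pack3 and unpack3 are hypothetical, and the receiver is assumed to know the cell count (for example from a header or a fixed grid size).

```c
#include <stddef.h>
#include <stdint.h>

/* Pack ASCII digits '0'..'7' at 3 bits per cell into bytes that each carry
   7 bits (values 0..127). Returns the number of bytes written. */
size_t pack3(const char *cells, size_t n_cells, uint8_t *out)
{
    uint32_t acc = 0;  /* bit accumulator; old high bits simply shift out */
    int bits = 0;      /* number of bits in acc not yet written */
    size_t o = 0;
    for (size_t i = 0; i < n_cells; i++) {
        acc = (acc << 3) | (uint32_t)((cells[i] - '0') & 7);
        bits += 3;
        if (bits >= 7) {
            bits -= 7;
            out[o++] = (uint8_t)((acc >> bits) & 0x7F);
        }
    }
    if (bits > 0)      /* flush the tail, zero-padded on the right */
        out[o++] = (uint8_t)((acc << (7 - bits)) & 0x7F);
    return o;
}

/* Reverse of pack3: reads as many bytes as needed for n_cells cells and
   writes the digits plus a terminating NUL into cells. */
void unpack3(const uint8_t *in, size_t n_cells, char *cells)
{
    uint32_t acc = 0;
    int bits = 0;
    size_t p = 0;
    for (size_t i = 0; i < n_cells; i++) {
        if (bits < 3) {                /* refill with the next 7-bit byte */
            acc = (acc << 7) | (uint32_t)(in[p++] & 0x7F);
            bits += 7;
        }
        bits -= 3;
        cells[i] = (char)('0' + ((acc >> bits) & 7));
    }
    cells[n_cells] = '\0';
}
```

The seven cells "7654321" should pack into the three bytes 0x7D 0x31 0x51. One caveat: although every byte stays under 128, a byte can be 0 (for example for a stretch of black cells), so code that treats the message as a NUL-terminated string would need adjusting.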
 
oddszone (Author) Commented:
I've worked out if I replace

"00000000000000000000" with "9"
and
"000000000" with "8"

I get

37498000167674001190000000001127511390000000001202303198122132219812332119800031311980044421980002241198002241198001114429800130339800003133980002243198000124519800013639800001152980001231980000012980000001999999999999999999999999999999999880000000

Which is 249 characters.
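
For reference, a fixed substitution like this could be sketched in C roughly as follows. The function names subst_compress and subst_decompress are invented for this example, and the output buffers are assumed to be large enough (the compressed string is never longer than the input).

```c
#include <stddef.h>
#include <string.h>

/* Within each run of '0's: every 20 zeros become '9', then every 9 of the
   remaining zeros become '8', and any leftover zeros are copied unchanged.
   Returns the compressed length. */
size_t subst_compress(const char *in, char *out)
{
    size_t o = 0;
    while (*in) {
        if (*in == '0') {
            size_t run = 0;
            while (in[run] == '0') run++;
            in += run;
            for (; run >= 20; run -= 20) out[o++] = '9';
            for (; run >= 9; run -= 9) out[o++] = '8';
            for (; run > 0; run--) out[o++] = '0';
        } else {
            out[o++] = *in++;
        }
    }
    out[o] = '\0';
    return o;
}

/* Expand '9' to 20 zeros and '8' to 9 zeros; copy everything else.
   Returns the decompressed length. */
size_t subst_decompress(const char *in, char *out)
{
    size_t o = 0;
    for (; *in; in++) {
        size_t n = (*in == '9') ? 20 : (*in == '8') ? 9 : 0;
        if (n) {
            memset(out + o, '0', n);
            o += n;
        } else {
            out[o++] = *in;
        }
    }
    out[o] = '\0';
    return o;
}
```

This only works because the colours stop at 7, which leaves the digits 8 and 9 free to act as tokens.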
 
oddszone (Author) Commented:
Or perhaps
replace consecutive "0"s with "8xx8" where xx is the zero count.

This gives

374843816767400118398112751138398120230318408122132218408123321184383131184284442184382241184282241184281114428428130338448313384382243184381245184381363844811528438123184581284681810178

which is only 187 chars.

Would this pack/unpack code be easy to write in C and C#?
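
A rough C sketch of an "8<count>8" packer and unpacker is below. It deliberately differs from the scheme above in one respect: the count is written in octal, so it can only contain the digits 0-7 and can never be confused with the 8 markers (a decimal count such as 48 or 387 would be). That octal choice, the names zrl_compress and zrl_decompress, and the MIN_RUN threshold are assumptions made for this example, so the output will not match the 187-character string above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MARK '8'
#define MIN_RUN 4  /* "8<count>8" needs at least 3 chars, so copy shorter runs */

/* Replace each run of MIN_RUN or more '0's with 8<octal count>8.
   Returns the compressed length, or -1 if out is too small. */
int zrl_compress(const char *in, char *out, size_t out_size)
{
    size_t o = 0;
    if (out_size == 0) return -1;
    while (*in) {
        size_t run = 0;
        while (in[run] == '0') run++;
        if (run >= MIN_RUN) {
            int n = snprintf(out + o, out_size - o, "%c%lo%c",
                             MARK, (unsigned long)run, MARK);
            if (n < 0 || (size_t)n >= out_size - o) return -1;
            o += (size_t)n;
            in += run;
        } else {
            if (o + 1 >= out_size) return -1;
            out[o++] = *in++;          /* non-zero cell or short run */
        }
    }
    out[o] = '\0';
    return (int)o;
}

/* Expand 8<octal count>8 tokens back into runs of '0'.
   Returns the decompressed length, or -1 on malformed input or overflow. */
int zrl_decompress(const char *in, char *out, size_t out_size)
{
    size_t o = 0;
    if (out_size == 0) return -1;
    while (*in) {
        if (*in == MARK) {
            char *end;
            unsigned long run = strtoul(in + 1, &end, 8);
            if (end == in + 1 || *end != MARK) return -1;
            if (run >= out_size - o) return -1;
            memset(out + o, '0', run);
            o += run;
            in = end + 1;
        } else {
            if (o + 1 >= out_size) return -1;
            out[o++] = *in++;
        }
    }
    out[o] = '\0';
    return (int)o;
}
```

With this variation, "374" followed by 43 zeros and "1676" should compress to "37485381676", because 43 decimal is 53 in octal.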
Question has a verified solution.
