gorexy

asked on

Short Text Compression

Hi,
  I have a short text of at most 29 characters.  The characters are alphanumeric (0-9 and A-Z, 10 + 26 = 36 symbols in total).

  My question: is it possible to compress this data into at most 95 bits?  I have tried a multi-base method, but it cannot compress every input, only a portion.  I thought a Golomb-Rice code or a statistical method such as Huffman or arithmetic coding might help.  Can any experts advise?

Thanks!
ASKER CERTIFIED SOLUTION
1dot44mb
This solution is only available to members.
SOLUTION
d-glitch
This solution is only available to members.
Just to be clear:

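You are limiting yourself to    2^95 = 3.96 x 10^28    compressed representations.

But there are    36^29 = 1.36 x 10^45    possible target strings.

That is a lot of ground to cover. Every string needs its own representation, so no lossless scheme can fit all of them into 95 bits; 29 arbitrary characters need about 150 bits.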
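
The counting argument above can be checked with a few lines of Python. This is only a sketch: the names ALPHABET_SIZE, MAX_LENGTH, BIT_BUDGET and bits_needed are mine, and it assumes every string is exactly 29 characters over the 36-symbol alphabet, with any string possible.

```python
import math

ALPHABET_SIZE = 36   # 0-9 and A-Z, as stated in the question
MAX_LENGTH = 29      # longest input string
BIT_BUDGET = 95      # target size in bits


def bits_needed(count: int) -> int:
    """Smallest number of bits that gives each of `count` items a unique code."""
    return (count - 1).bit_length()


strings = ALPHABET_SIZE ** MAX_LENGTH   # exact integer, no rounding
codes = 2 ** BIT_BUDGET

print(f"distinct strings : {strings:.3e}")
print(f"distinct codes   : {codes:.3e}")
print(f"bits needed      : {bits_needed(strings)}")
print(f"log2 estimate    : {MAX_LENGTH * math.log2(ALPHABET_SIZE):.2f} bits")
print(f"fits in budget   : {strings <= codes}")
```

If shorter strings are allowed as well, the count only grows, so the conclusion does not change. Getting under 95 bits would need extra knowledge about the data (fixed fields, limited ranges, skewed symbol frequencies).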
gorexy

ASKER

Hi,
  I am studying SMAZ, as suggested by 1dot44mb.
My input string has 2 parts. The first part is 1-12 bytes (alphanumeric),
for example: <abcdefrdhsh3>
The second part looks like:

<1971, 364, 1971, 32765, 32765>

Each number here is the maximum value of that integer field: 1971 for the first field, 364 for the second, and so on up to 32765.

For example, the values
1234 285 1023 3221 999
give an input string like
<abcdefrdhsh3><123428510233221999>

The second part contains numbers only.

Thanks
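
With the field limits given above, a multi-base (mixed-radix) packing can be sketched as follows. This is an illustration under assumptions, not a tested solution for the real data: it assumes each integer field runs from 0 up to its stated maximum inclusive, that the text part is 1-12 characters from 0-9 and A-Z (lowercase is folded to uppercase), and all function names are hypothetical.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
MAX_TEXT_LEN = 12
# Assumed inclusive upper bounds of the five integer fields (lower bound 0).
FIELD_MAX = (1971, 364, 1971, 32765, 32765)


def text_count(length):
    """Number of strings with 1..length characters."""
    return sum(len(ALPHABET) ** k for k in range(1, length + 1))


def total_codes():
    """Number of distinct (text, fields) inputs the packing must cover."""
    total = text_count(MAX_TEXT_LEN)
    for vmax in FIELD_MAX:
        total *= vmax + 1
    return total


def pack(text, fields):
    """Map (text, fields) to a single integer in range(total_codes())."""
    text = text.upper()
    if not 1 <= len(text) <= MAX_TEXT_LEN:
        raise ValueError("text must have 1-12 characters")
    value = 0
    for ch in text:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    # Offset by the number of shorter strings so the length is recoverable.
    code = text_count(len(text) - 1) + value
    for v, vmax in zip(fields, FIELD_MAX):
        if not 0 <= v <= vmax:
            raise ValueError("field out of range")
        code = code * (vmax + 1) + v
    return code


def unpack(code):
    """Inverse of pack()."""
    fields = []
    for vmax in reversed(FIELD_MAX):
        code, v = divmod(code, vmax + 1)
        fields.append(v)
    fields.reverse()
    length = 1
    while code >= len(ALPHABET) ** length:
        code -= len(ALPHABET) ** length
        length += 1
    chars = []
    for _ in range(length):
        code, digit = divmod(code, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars)), tuple(fields)


sample = ("ABCDEFRDHSH3", (1234, 285, 1023, 3221, 999))
packed = pack(*sample)
print(packed.bit_length(), "bits used for the sample")
print((total_codes() - 1).bit_length(), "bits needed in the worst case")
print(unpack(packed) == sample)
```

Under these assumptions the worst case works out to 123 bits, so the packing is as tight as a fixed-size code can be but still over the 95-bit target. To go lower, some inputs would have to be ruled out or made much less likely than others.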
SMAZ is optimized for English.

It probably uses Huffman encoding based on the frequency of letters,
combinations of letters (th, sh, ...), and maybe even common words (the, of, ...).

It doesn't claim to handle numbers.
I expect it will not be able to compress your target strings much if at all.

gorexy

ASKER

Oh, sorry, I cannot build the exe successfully. Can anyone help?

I am using Vista and Dev-C++. Can I build the exe with that and try it?

Thanks