gorexy
asked on
Short Text Compression
Hi,
I have a short text of at most 29 characters. The characters are alphanumeric (0-9 and A-Z, so 10+26=36 possible symbols).
My question: is it possible to compress this data into at most 95 bits? I have tried a multi-base method, but it can only compress some of the inputs, not all of them. I think a Golomb-Rice code or a statistical method such as Huffman or arithmetic coding might help. Can any experts advise?
Thanks!
Just to be clear:
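You are limiting yourself to 2^95 = 3.96 x 10^28 compressed representations.
But there are 36^29 = 1.36 x 10^45 possible target strings.
That is a lot of ground to cover: unless most of those strings never occur, no lossless scheme can give each one its own 95-bit code.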
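The counting argument above can be checked with a few lines of Python. This is only a sketch under the assumptions stated in the question (36 symbols, strings of exactly 29 characters, a 95-bit budget); the variable names are made up for illustration.

```python
ALPHABET_SIZE = 36   # 0-9 and A-Z, as stated in the question
MAX_LENGTH = 29      # longest input, in characters
BUDGET_BITS = 95     # target size, in bits

# Number of distinct strings of exactly MAX_LENGTH characters.
total_strings = ALPHABET_SIZE ** MAX_LENGTH

# Smallest number of bits that can give every such string its own code.
bits_needed = (total_strings - 1).bit_length()

# Longest string length for which every string still fits in the budget.
max_chars = 0
while ALPHABET_SIZE ** (max_chars + 1) <= 2 ** BUDGET_BITS:
    max_chars += 1

print(f"bits needed for {MAX_LENGTH} characters: {bits_needed}")
print(f"characters that fit in {BUDGET_BITS} bits: {max_chars}")
```

If every combination of characters is equally possible, this is a hard limit. Huffman, arithmetic, or Golomb-Rice coding can only beat it on average, and only when some strings are much more likely than others.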
ASKER
Hi,
I am studying SMAZ, as suggested by 1dot44mb.
My input string has 2 parts. The first part is 1-12 bytes (alphanumeric),
for example: <abcdefrdhsh3>
The second part is five integers with these maximum values:
<1971, 364, 1971, 32765, 32765>
That is, 1971 is the maximum value of the first integer, 364 of the second, 32765 of the fourth, and so on.
For example:
1234 285 1023 3221 999
so the input string looks like
<abcdefrdhsh3><12342851023 3221999>
where the second part contains only numbers.
Thanks
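Given the structure described in this post, one way to see how many bits the input really needs is to pack the five integers with a mixed-radix (multi-base) code and count the possible states. The sketch below is not from the thread. It assumes each integer lies in 0..max inclusive and that the text part is 1 to 12 characters over a 36-symbol alphabet; `pack`, `unpack`, and `bits_for` are hypothetical helper names.

```python
# Field limits taken from the post; each value is ASSUMED to lie in 0..max.
FIELD_MAXIMA = [1971, 364, 1971, 32765, 32765]


def pack(values, maxima=FIELD_MAXIMA):
    """Combine the integers into one number using mixed-radix coding."""
    packed = 0
    for value, maximum in zip(values, maxima):
        if not 0 <= value <= maximum:
            raise ValueError(f"{value} is outside 0..{maximum}")
        packed = packed * (maximum + 1) + value
    return packed


def unpack(packed, maxima=FIELD_MAXIMA):
    """Recover the original integers from a packed number."""
    values = []
    for maximum in reversed(maxima):
        packed, value = divmod(packed, maximum + 1)
        values.append(value)
    return values[::-1]


def bits_for(count):
    """Bits needed to distinguish `count` different values."""
    return (count - 1).bit_length()


# Number of possible states of the numeric part.
numeric_states = 1
for maximum in FIELD_MAXIMA:
    numeric_states *= maximum + 1

# Text part: ASSUMED 1 to 12 characters from a 36-symbol alphabet.
text_states = sum(36 ** n for n in range(1, 13))

example = [1234, 285, 1023, 3221, 999]
assert unpack(pack(example)) == example  # the packing is lossless

print("numeric part bits:", bits_for(numeric_states))
print("both parts bits:", bits_for(numeric_states * text_states))
```

Under these assumptions the packing wastes less than one bit, so any further saving would have to come from knowing that some values or strings are more common than others.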
SMAZ is optimized for English.
It probably uses Huffman encoding based on the frequency of letters,
combinations of letters (th, sh, ...), and maybe even common words (the, of, ...).
It doesn't claim to handle numbers.
I expect it will not be able to compress your target strings much if at all.
ASKER
Sorry, I cannot build the exe successfully. Can anyone help?
I am using Vista and Dev-C++. Can I build the exe and try it?
Thanks
http://www.maximumcompression.com/data/text.php