zlib unexpected behavior

Hi, please refer to the code below.
I have an array of unsigned chars (dcm->strm_pixels_) that I want to compress using zlib. As you can see, it compresses fine, from 1572864 bytes to 965821 bytes.
However, when I try to uncompress it partially (using only the first 2000 bytes of the compressed array), it reports that 3287509 bytes were generated, which is more than the original size.
But when I uncompress the full array, it shows the correct uncompressed size.

What am I missing?

Thanks

uLongf dlen, slen;
cout << "Size before compression:\t" << dcm->total_bytes_ << endl;
unsigned char *dest = (unsigned char *)calloc(dcm->total_bytes_, sizeof(unsigned char));
compress(dest, &dlen, dcm->strm_pixels_, dcm->total_bytes_);
cout << "Size after compression:\t" << dlen << endl;
uncompress(dcm->strm_pixels_, &slen, dest, 2000);
cout << "Size after partial un-compressing:\t" << slen << endl;
uncompress(dcm->strm_pixels_, &slen, dest, dlen);
cout << "Size after full un-compressing:\t" << slen << endl;
//
//*********************
//Output
//*********************
//
Size before compression:	1572864
Size after compression:	965821
Size after partial un-compressing:	3287509
Size after full un-compressing:	1572864

Asked by wevouch

jkr Commented:
You cannot 'partially uncompress'  such a buffer, since the zip algorithm treats that as an entity with format information embedded in the compressed result (see also http://en.wikipedia.org/wiki/Lempel-Ziv-Markov_chain_algorithm). A compressed buffer needs to be expanded as a whole.

wevouch (Author) Commented:
I do not think that's true.

I have written test code that works as expected. As you can see in the code below, I uncompress only the first 500 bytes of the compressed buffer.
The 11th and 5001st elements (src[10] and src[5000]) are reset to zero after compression, and after the partial uncompress only the 11th value is retrieved.

Also, on Linux the command zcat archive.gz | more does in fact do this.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
#include <iostream>
#include "zlib.h"
 
#define MAX 25000
#define MAX2 25000
using namespace std;
 
int main() {
	unsigned char *src;
	unsigned char *dest;
 
	src = (unsigned char *)calloc(MAX, sizeof(unsigned char));
	dest = (unsigned char *)calloc(MAX2, sizeof(unsigned char));
 
	for (int i=0; i<MAX; i++) {
		src[i] = (double(rand())/RAND_MAX)*60;
		//cout << (int)src[i] << endl;
	}
	uLongf dlen, slen;
	compress2(dest, &dlen, src, MAX, 9);
	cout << "Final bytes: " << dlen << endl;
	cout << (int)dest[10] << endl;
 
	printf("Hello World!!! %d %d\n", src[10], src[5000]);
	for (int i=0; i<MAX; i++) {
		src[i] = 0;
	}
 
	printf("Hello World!!! %d %d\n", src[10], src[5000]);
	uncompress(src, &slen, dest, 500);
	printf("Hello World!!! %d %d\n", src[10], src[5000]);
 
	free(src);
	free(dest);
}
//
//****************
//Output
Final bytes: 18691
205
Hello World!!! 28 19
Hello World!!! 0 0
Hello World!!! 28 0
//****************

DanRollins Commented:
One thing I can recommend: check the return values of the compress, compress2, and uncompress functions.
Also, initialize the values of all input and output variables.
Duncan Roe (Software Developer) Commented:
You should only attempt partial decompression if you checkpointed on compression using Z_SYNC_FLUSH. Otherwise, data isn't even byte-aligned. You must have got lucky when 500 bytes worked.
Just in case you didn't know - documentation is in /usr/include/zlib.h

wevouch (Author) Commented:
Duncan, thanks.
How does "zcat archive.gz | more" work, then?

DanRollins Commented:
As I suspected, the return value from your call to uncompress is
-5 (Z_BUF_ERROR)
which is returned if there is not enough room in the output buffer.  According to the documentation
    http://www.zlib.net/manual.html#uncompress
the output buffer "must be large enough to hold the entire uncompressed data."
As to how zcat does it... that's simple:  It decompresses the entire file and sends the stream down the pipe to more.  It appears to have decompressed just a piece because the more command is showing just a piece.

wevouch (Author) Commented:
Hi Dan,
I'll check the return value. But this is weird: my output buffer is the same one that held the original uncompressed stream, so it should be exactly the right size.

Also, about zcat: it does not uncompress the entire file first. My zip files are 4-5 GB in size, and uncompressing one entirely takes anywhere from 30 to 45 minutes. However, zcat | more starts printing immediately.

Duncan Roe (Software Developer) Commented:
When you use zlib, you get chunks of uncompressed file back continuously. That's how zcat|more works. But... you have no control over how much input the zlib decompressor takes - it just sucks in whatever you give it and now and then outputs a block's worth of uncompressed file. When it does that, it still has an internal state and may not have used all the bytes in the last buffer you gave it (including a fraction of a byte).

DanRollins Commented:
>> my output buffer is the same where the original uncompressed stream was... so output buffer should be just the right size.
Check your code.  
 uLongf dlen, slen;
....
uncompress(src, &slen, dest, 500);
You will note (as I pointed out earlier) that the value of the variable slen has not been initialized.  But the real problem is the value of the fourth parameter.  The function succeeds if it is set to MAX.

wevouch (Author) Commented:
slen is passed by reference to uncompress; that function sets the value of slen.


DanRollins Commented:
The fact that slen ends up as a random value is because you did not check the return code and notice that it was indicating an input error.
It is critical that you check the return code. It is less critical, but still important, that you initialize variables to known values (say, 0) when passing them by reference to an API function.

Duncan Roe (Software Developer) Commented:
Yes, it is, but you have to set it to the buffer capacity prior to entry. From zlib.h:

"Upon entry, destLen is the total size of the destination buffer, which must be large enough to hold the entire uncompressed data. (The size of the uncompressed data must have been saved previously by the compressor and transmitted to the decompressor by some mechanism outside the scope of this compression library.)"

Duncan Roe (Software Developer) Commented:
To perhaps better answer a previous question - zcat | more starts straight away because zcat is using deflate(), not the utility function uncompress().

wevouch (Author) Commented:
So does that mean I can use deflate as well and achieve the same thing?

All I want to do is compress my source data into one compressed buffer, and then be able to uncompress that buffer in parts.

Duncan Roe (Software Developer) Commented:
You use inflate() to decompress and deflate() to compress. Sorry, I mixed that up in my previous post.
Yes, you can use inflate() to decompress only as much as will fit into your buffer (assuming that much data is available).