zambak
asked on
Writing 5 bits at a time to a file
Hi
I have an assignment for a weird compression algorithm that reads a file, converts each character to a 5-bit code, and writes out a compressed version of the original file. For example, if the input file contains the string "AMIR" and the codes are A=0, M=12, I=8, R=17, I am supposed to write out a 3-byte file.
The problem? How do I work out the shifting, given that I can only write out bytes and not bits? I figured I would "create" a byte and write it, but the codes will overlap byte boundaries.
In my example I would have something like this in binary:
00000011 00010001 00010000
where the first 5 bits are 0, the second 5 bits are 12, and so on. The last 4 bits are padding.
Sometimes one created byte is based on the codes of 3 different input characters.
I have a loop in which I read one char at a time from the input file and get an int code for it. Now I have to figure out an algorithm that creates a byte (a char) that I can write to the output file.
// get one char at a time from input
while ((curr_chr = getc(in_file)) != EOF)
{
    // get code for it
    code = getCode(curr_chr);
    // shifting code ......
    // write created byte...
    fwrite(&wbyte, 1, 1, out_file);
}
Any help is much appreciated....
Thanks
zambak
I was referring to exactly that... you are on the right track.
Cheers :o)
hi
You have to set wbyte bit by bit.
Convert the code into a binary array of 5 elements, most significant bit first (an array is preferable).
Use a mask which is initially 0x80. Start from the first element of the array. If it is 1, bitwise OR wbyte with the mask; otherwise leave wbyte alone. Now right shift the mask by 1 bit and check whether the mask is 0. If it is 0, the byte is full: write the created byte, reset the mask to 0x80 and wbyte to 0, and continue.
It would be something like:
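mask = 0x80;
wbyte = 0;
while ((curr_chr = getc(in_file)) != EOF)
{
    // get code for it
    code = getCode(curr_chr);
    // convert to binary array bincode, most significant bit first
    for (i = 0; i < 5; i++)
        bincode[i] = (code >> (4 - i)) & 1;
    for (i = 0; i < 5; i++)
    {
        if (bincode[i] == 1)
            wbyte = wbyte | mask;
        mask = mask >> 1;
        if (mask == 0)
        {
            // write created byte...
            fwrite(&wbyte, 1, 1, out_file);
            mask = 0x80;
            wbyte = 0;
        }
    }
}
// write the last, partially filled byte (unused low bits stay 0)
if (mask != 0x80)
    fwrite(&wbyte, 1, 1, out_file);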
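For comparison, here is a minimal sketch of the same packing done with a small bit accumulator instead of a per-bit mask. It is not the code from the answer above: the function name pack5 and its buffer-based interface are made up for illustration, and it assumes the codes are already available as values 0..31 and should be packed most significant bit first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pack n 5-bit codes (each 0..31) into out,
 * most significant bit first. out must have room for
 * (n * 5 + 7) / 8 bytes. Returns the number of bytes written. */
size_t pack5(const unsigned char *codes, size_t n, unsigned char *out)
{
    unsigned int acc = 0;   /* pending bits sit in the low end of acc */
    int nbits = 0;          /* number of pending bits held in acc */
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        assert(codes[i] < 32);           /* each code must fit in 5 bits */
        acc = (acc << 5) | codes[i];     /* append the new code */
        nbits += 5;
        if (nbits >= 8) {                /* a full byte is available */
            nbits -= 8;
            out[written++] = (unsigned char)(acc >> nbits);
            acc &= (1u << nbits) - 1u;   /* keep only the unwritten bits */
        }
    }
    if (nbits > 0)                       /* flush the last partial byte */
        out[written++] = (unsigned char)(acc << (8 - nbits));
    return written;
}
```

With the codes from the question (0, 12, 8, 17) this should produce the three bytes 0x03 0x11 0x10, the last one padded with four zero bits.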
Sorry I was late. I did not refresh the page for quite some time.
ASKER
Between posting my question and reading your answer I tried the following, which is probably what you are suggesting too. The least common multiple of 5 (the number of bits in a compression code) and 8 (the number of bits in a byte) is 40, so I can write globs of 40 bits (5 bytes) at a time, one for every 8 characters of input.
My function reads 8 characters from the input file into a char buffer. I also have an unsigned char glob buffer which is 5 bytes long. I take the original 8 chars, do the bit-shift manipulation, and then fwrite the glob to the file. If I get EOF before a full set of 8, I just pad the array.
And yes, I am keeping a header with the input file's original size in bytes, which will tell me where to "cut off" when decompressing.
Is this what you were referring to as well?
The code of my function is below.
void compressFile(const char *filename)
{
    struct stat fa;          // file attributes structure
    FILE *in_file;           // pointer to original file
    FILE *out_file;          // pointer to a compressed file
    int curr_chr;            // current character; int so that EOF can be told apart from data
    char input[8];           // 8 input characters make up one glob
    unsigned char glob[5];   // 8 codes * 5 bits = 40 bits = 5 bytes
    unsigned int orig_size;  // original size for the header (assumes a 4-byte unsigned int)
    int i, j, done = 0;

    // open original file for reading
    in_file = openFile(filename, "rb");
    // obtain attributes for in file
    fstat(fileno(in_file), &fa);
    fprintf(stdout, "|>>> Compressing file \"%s\" (filesize = %ld bytes) on %s\n", filename, (long)fa.st_size, getCurrentDate());
    // if file is empty return
    if (fa.st_size == 0)
    {
        fprintf(stdout, "|>>> File is empty! Nothing to compress...\n");
        closeFile(in_file);
        return;
    }
    // open output file for writing
    out_file = openFile("output.cp", "wb");
    // write the header with date and file size out
    fwrite(getCurrentDate(), 8, 1, out_file);
    orig_size = (unsigned int)fa.st_size;
    fwrite(&orig_size, 4, 1, out_file);
    while (!done)
    {
        // clear out input buffer
        bzero(input, 8);
        // fetch 8 characters at a time, used to create a 5 byte glob
        for (i = 0; i < 8; i++)
        {
            curr_chr = getc(in_file);
            if (curr_chr == EOF)
            {
                for (j = i; j < 8; j++)
                    input[j] = 'A'; // this will pad with 00000
                done = 1;
                break;
            }
            input[i] = curr_chr;
        }
        // EOF hit before any character was read: no glob left to write
        if (done && i == 0)
            break;
        // create a glob based on input
        glob[0] = (getCode(input[0]) << 3) + (getCode(input[1]) >> 2);
        glob[1] = ((getCode(input[1]) & 0x03) << 6) + (getCode(input[2]) << 1) + (getCode(input[3]) >> 4);
        // only the low 4 bits of code 3 belong in this byte, hence 0x0F
        glob[2] = ((getCode(input[3]) & 0x0F) << 4) + (getCode(input[4]) >> 1);
        glob[3] = ((getCode(input[4]) & 0x01) << 7) + (getCode(input[5]) << 2) + (getCode(input[6]) >> 3);
        glob[4] = ((getCode(input[6]) & 0x07) << 5) + getCode(input[7]);
        printGLOB(glob);
        // write out the glob to an output file
        fwrite(glob, 5, 1, out_file);
        //printf("Current character = [%c] ASCII=[0x%X] code = 0x%X\n", curr_chr, (int)curr_chr, code);
    } // end while
    // close files
    closeFile(in_file);
    closeFile(out_file);
}
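Since the header stores the original size to know where to "cut off", the reverse direction might look roughly like the sketch below. This is only an illustration, not part of the posted function: unpack5 is a made-up name, it works on a memory buffer rather than a FILE, and it assumes the globs were packed most significant bit first as in compressFile. The count argument would be the original size read back from the header, so the padding codes at the end are never produced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: recover count 5-bit codes from in, which holds
 * codes packed most significant bit first. in must contain at least
 * (count * 5 + 7) / 8 bytes; codes must have room for count entries. */
void unpack5(const unsigned char *in, size_t count, unsigned char *codes)
{
    unsigned int acc = 0;   /* bits read from in but not yet consumed */
    int nbits = 0;          /* number of unconsumed bits in acc */
    size_t pos = 0;         /* next byte of in to read */

    assert(in != NULL && codes != NULL);
    for (size_t i = 0; i < count; i++) {
        if (nbits < 5) {                 /* not enough bits: pull in a byte */
            acc = (acc << 8) | in[pos++];
            nbits += 8;
        }
        nbits -= 5;
        codes[i] = (unsigned char)((acc >> nbits) & 0x1F);
        acc &= (1u << nbits) - 1u;       /* drop the bits just consumed */
    }
}
```

Each recovered code would then be mapped back to its character with the inverse of getCode, which is not shown in this thread.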