Zip File Compression Ratio Prediction in C#

Hello,

I want to predict the zip file compression ratio before creating the zip file in C# (VS 2010).

thanks

Kalpesh
Kalpesh Chhatrala, Software Consultant, Asked:

AndyAinscow, Freelance programmer / Consultant, Commented:
65%
Or just pick any other number you feel like. Anything is a guess - the compression is heavily dependent upon the actual data.
Jacques Bourgeois (James Burger), President, Commented:
Did you ever see a program that does that? If not, then it's probably because you can't. If you know of one, try it and see if it gave you the right information before you compressed. It was probably coded by Andy, who has the best algorithm I can think of for that purpose.

If you could, the class that you use to perform the compression would have a property or a method that would give you that information.

You can always compress in a MemoryStream, which is usually faster than any type of FileStream, and then retrieve its Length. But you need to do the job before you can have that information.
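A minimal sketch of that MemoryStream approach, assuming the framework's DeflateStream is an acceptable stand-in for the zip library you actually use. Zip normally stores entries with Deflate, but a real archive adds headers and may use different settings, so treat the number as an approximation. The class and method names (CompressionProbe, MeasureDeflatedSize, MeasureRatio) are made up for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of any library.
public static class CompressionProbe
{
    // Compresses the data with Deflate into memory and returns the compressed byte count.
    public static long MeasureDeflatedSize(byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            // Third argument (leaveOpen) = true keeps the MemoryStream usable
            // after the DeflateStream has been disposed.
            using (var deflate = new DeflateStream(buffer, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            } // Disposing the DeflateStream flushes the remaining compressed bytes.

            return buffer.Length;
        }
    }

    // Ratio = compressed size / original size, so smaller means better compression.
    // Returns 1.0 for empty input to avoid dividing by zero.
    public static double MeasureRatio(byte[] data)
    {
        if (data.Length == 0) return 1.0;
        return (double)MeasureDeflatedSize(data) / data.Length;
    }
}
```

For a file you would call something like CompressionProbe.MeasureRatio(File.ReadAllBytes(path)). For very large files, compressing only a sample of the file may be cheaper, at the cost of accuracy, since different parts of a file can compress differently.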

It's like trying to determine the time it will take you to complete a programming project before you start. You have to do it first, and then tell your customer or your boss how much time it will take. :-)
frankhelk Commented:
Predicting the compression ratio precisely is, as AndyAinscow said, not possible due to the nature of zip compression itself: the compression factor depends on the data, and even some parts of a file will compress better or worse than others. The term for the influencing property of the data is "entropy" or, somewhat less technically, how chaotic (or uniform) the data is. The best compression would be achieved if each and every byte of the data were the same (i.e. a file full of null bytes); then the compressed file would, in principle, only need to contain the info "xxx bytes 0x00", even if the file contains terabytes of (null) data. The lousiest compression, if any at all, would be achieved for data that is completely random, like white noise. Good algorithms adapt to changes of uniformity within the data stream.
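To make the entropy idea concrete, here is a rough sketch that computes the order-0 Shannon entropy of a byte array, H = -sum(p_i * log2(p_i)) over the 256 possible byte values, and turns it into a ratio guess of H / 8. This is only an assumption-laden indicator: it looks at byte frequencies alone, while zip's Deflate also exploits repeated sequences, so real results can be noticeably better than this figure. EntropyEstimator and its method names are invented for the example.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class EntropyEstimator
{
    // Order-0 Shannon entropy in bits per byte:
    // 0.0 when every byte is the same, up to 8.0 when all 256 values are equally likely.
    public static double BitsPerByte(byte[] data)
    {
        if (data.Length == 0) return 0.0;

        // Count how often each byte value occurs.
        var counts = new long[256];
        foreach (byte b in data) counts[b]++;

        double entropy = 0.0;
        foreach (long count in counts)
        {
            if (count == 0) continue; // a value that never occurs contributes nothing
            double p = (double)count / data.Length;
            entropy -= p * Math.Log(p, 2.0);
        }
        return entropy;
    }

    // Rough guess at compressed size / original size based on byte frequencies only.
    public static double EstimatedRatio(byte[] data)
    {
        return BitsPerByte(data) / 8.0;
    }
}
```

A file full of null bytes gives 0 bits per byte, and white noise comes out close to 8, which matches the two extremes described above.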

The only way to get more or less near to a prediction is to use experience from earlier compression cases, i.e. by file type. TXT and CSV files with only ASCII data would compress well, as database files usually would, too. Programs are usually more chaotic and compress less.

If you know what you are going to compress, just do a switch-case on file type and use predefined compression ratios derived from experimentally compressing a lot of such files. As a default you could use an average.
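The switch-case idea could look roughly like this. Note that every number in it is a made-up placeholder, not a measurement: you would replace them with ratios (compressed size / original size) obtained from compressing your own files. RatioTable and its members are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical lookup table; all ratio values are PLACEHOLDERS, not measured data.
public static class RatioTable
{
    // Fallback when the file type is unknown (placeholder "average").
    public const double DefaultRatio = 0.50;

    // Returns a guessed ratio (compressed size / original size) for a file name.
    public static double GuessRatio(string fileName)
    {
        string ext = (Path.GetExtension(fileName) ?? string.Empty).ToLowerInvariant();
        switch (ext)
        {
            case ".txt":
            case ".csv":
                return 0.25; // placeholder: plain text tends to compress well
            case ".exe":
            case ".dll":
                return 0.60; // placeholder: programs tend to compress less
            default:
                return DefaultRatio;
        }
    }

    // Applies the guessed ratio to a known original size in bytes.
    public static long GuessCompressedSize(string fileName, long originalSize)
    {
        return (long)(originalSize * GuessRatio(fileName));
    }
}
```

With the FileInfo class you could feed it real sizes, for example RatioTable.GuessCompressedSize(info.Name, info.Length).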

If you're into building a very intelligent thing, you might do "learning" by adding each compression result to the experience of your program. With max, min and average compression along with some statistics, you might be able to predict a "best case", "average" and "worst case" compression. But that would be a lot of effort for that ...
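A bare-bones sketch of such a "learning" store, assuming you key the history by file extension and keep it in memory only (persisting it between runs is left out). Ratio again means compressed size / original size, so Min is the best case and Max the worst case. RatioStats and RatioHistory are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical running statistics for one file type.
public sealed class RatioStats
{
    public int Count { get; private set; }
    public double Min { get; private set; }      // best case seen so far
    public double Max { get; private set; }      // worst case seen so far
    public double Average { get; private set; }

    public void Add(double ratio)
    {
        if (Count == 0) { Min = ratio; Max = ratio; }
        else { Min = Math.Min(Min, ratio); Max = Math.Max(Max, ratio); }

        Count++;
        // Incremental mean: no need to store every past result.
        Average += (ratio - Average) / Count;
    }
}

// Hypothetical per-extension history of observed compression results.
public sealed class RatioHistory
{
    private readonly Dictionary<string, RatioStats> byType =
        new Dictionary<string, RatioStats>(StringComparer.OrdinalIgnoreCase);

    // Call this after each real compression with the sizes you observed.
    public void Record(string extension, long originalSize, long compressedSize)
    {
        if (originalSize <= 0) return; // nothing meaningful to learn from

        RatioStats stats;
        if (!byType.TryGetValue(extension, out stats))
        {
            stats = new RatioStats();
            byType[extension] = stats;
        }
        stats.Add((double)compressedSize / originalSize);
    }

    // Returns false when no file of this type has been recorded yet.
    public bool TryPredict(string extension, out RatioStats stats)
    {
        return byType.TryGetValue(extension, out stats);
    }
}
```

When TryPredict returns false you would still need a fallback, such as a fixed default ratio.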

Kalpesh Chhatrala, Software Consultant, Author Commented:
Thanks.
AndyAinscow, Freelance programmer / Consultant, Commented:
Obviously you didn't understand my comment, unlike the other experts.

(It was probably coded by Andy, who has the best algorithm I can think of for that purpose.)