AlphaLolz


ratio between file sizes and strings

I'm very confused by a result one of my developers is saying he's seeing.

This is all using a Java application he wrote running on Windows.

We have a 40 MB file that he's reading into a string. It's an XML file. It has several small tags with English data and then a tag with base64-encoded content.

He's claiming that when he loads this into a string it's nearly 500 MB in size.  That makes utterly no sense to me.

At most (with 16-bit characters), I would think 80 MB.

Is there some sort of ratio that's reasonable to expect of something like this?  This seems to be horribly wrong.
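The 80 MB estimate above can be checked with a small sketch. It assumes the JVM stores each character of a String as one 16-bit UTF-16 code unit, and it ignores object headers and any temporary copies made while reading the file. On Java 9 and later, compact strings can store Latin-1-only text at one byte per character, so the real figure may be lower. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class StringSizeEstimate {
    // Bytes needed to hold the characters as UTF-16 code units (2 bytes each).
    static long utf16Bytes(String s) {
        return 2L * s.length();
    }

    // Bytes the same text occupies when stored as UTF-8, e.g. in a file.
    static long utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Tiny stand-in for the real file: ASCII tags plus base64 content.
        String sample = "<doc><name>example</name><data>QUJDRA==</data></doc>";
        System.out.println("chars:        " + sample.length());
        System.out.println("UTF-8 bytes:  " + utf8Bytes(sample));
        System.out.println("UTF-16 bytes: " + utf16Bytes(sample));
    }
}
```

For pure ASCII content like this, the UTF-16 figure is exactly twice the UTF-8 figure, which is where 40 MB on disk turning into roughly 80 MB of character data comes from.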
Mick Barry

How is he measuring the size?
40 -> 500 does not look good at all; something is definitely wrong here. Apart from the question above, how is the file being loaded? Normally an XML file loaded into memory should take up about the same size.
Can you post the following info:
i) Application process size when the string is loaded into memory.
ii) Print the length of the string when the XML file is loaded into it.
iii) Dump the loaded string into another file and post the file size.
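The three checks above could be gathered with something like the following sketch. It is not the developer's actual loading code, which was not posted. It assumes the file is UTF-8, takes the input and output paths as hypothetical command-line arguments, and uses the JVM heap delta as a rough stand-in for process size (the OS-level process size would have to be read from Task Manager or similar).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StringSizeCheck {
    // Heap currently in use by the JVM, in bytes (approximate; may include uncollected garbage).
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Reads the whole file into a String, assuming UTF-8 (an assumption; use the file's real encoding).
    static String load(Path in) throws IOException {
        return new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
    }

    // Writes the string back out as UTF-8 and returns the resulting file size in bytes.
    static long dump(String s, Path out) throws IOException {
        Files.write(out, s.getBytes(StandardCharsets.UTF_8));
        return Files.size(out);
    }

    public static void main(String[] args) throws IOException {
        Path in = Paths.get(args[0]);   // hypothetical: path to the 40 MB XML file
        Path out = Paths.get(args[1]);  // hypothetical: path for the dumped copy
        long before = usedHeap();
        String xml = load(in);
        long after = usedHeap();
        System.out.println("input file bytes:  " + Files.size(in));
        System.out.println("string length:     " + xml.length());
        System.out.println("heap delta bytes:  " + (after - before));
        System.out.println("dumped file bytes: " + dump(xml, out));
    }
}
```

If the string length and the dumped file size both come out near 40 million while the heap delta is far larger, the extra memory is likely going to intermediate buffers or copies made during loading rather than to the string itself.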
ASKER CERTIFIED SOLUTION
CEHJ

:)
> Yes. unencoded/base64 encoded should be around 3/4

That's rubbish, you'll find it's a lot bigger than that.
>>That's rubbish, you'll find it's a lot bigger than that.

You obviously need to learn how base64 encoding works:

http://en.wikipedia.org/wiki/Base64
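For reference, the 3/4 figure quoted above follows from base64 mapping every 3 raw bytes to 4 ASCII characters. A minimal sketch using the standard java.util.Base64 encoder (Java 8+, no line breaks) shows the ratio. The class name and the helper encodedLength are made up for illustration, and the sketch only covers the base64 step, not whatever else the application does with the data.

```java
import java.util.Base64;

public class Base64Ratio {
    // Length of the padded base64 encoding of n raw bytes: 4 characters per 3-byte group.
    static long encodedLength(long n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        byte[] raw = new byte[30_000];  // arbitrary sample payload (all zero bytes)
        String encoded = Base64.getEncoder().encodeToString(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("base64 chars:  " + encoded.length());
        System.out.println("raw / encoded: " + (double) raw.length / encoded.length());
    }
}
```

So about 40 MB of base64 text decodes to about 30 MB of binary, and the encoded text itself is plain ASCII.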
You'll find there's a bit more involved than just base64.