writing data in binary form

In my large C++ application, I often see that previous developers have written data to files in binary form. When they need to read it back, they convert it to ASCII form, since they know the layout of the file (for example, the first 4 bytes are a uint32_t, the next byte is a char, etc.).

I am trying to understand the benefit of writing in binary form and then, when it comes to reading, reconstructing the original human-readable form.

Is there a benefit such as size reduction from saving in binary form, or faster write processing, or something else?

P.S.: Again, these files are only configuration-related files, like metadata of a file, etc.
perlperl asked:

Dave Baldwin (Fixer of Problems) commented:
ASCII form is just for humans to be able to read the data. All arithmetic is done in binary form, as is all addressing, including things like memory, MAC, and IP addresses. Binary is the original and necessary format, not ASCII.
perlperl (Author) commented:
I am talking about the contents of files stored on the filesystem.
jkr commented:
As Dave wrote, binary is the format that is native to computing and therefore "faster". Also:

>>Is there any benefit like size reduction by saving in binary form, or faster write processing
>>etc or maybe something else?

It's a major size reduction. E.g., to express the number 4294967295 (UINT_MAX) you need 10 bytes in ASCII, whereas the same can be done with just four bytes in binary.
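To make that concrete, here is a minimal sketch of the same uint32_t written both ways (the file names and quick-and-dirty style are illustrative, not from the thread):

#include <cstdint>
#include <fstream>

int main() {
    const std::uint32_t value = 4294967295u;   // UINT_MAX

    // Binary: always sizeof(uint32_t) == 4 bytes on disk, regardless of the value.
    std::ofstream bin("value.bin", std::ios::binary);
    bin.write(reinterpret_cast<const char*>(&value), sizeof value);

    // ASCII: one byte per decimal digit -- 10 bytes for this particular value.
    std::ofstream txt("value.txt");
    txt << value;                              // writes the characters "4294967295"

    return 0;
}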

The basic question probably is: do you need the saved data to be human-readable? If the answer is "yes", then use ASCII storage; if "no", binary is preferable.
Dave Baldwin (Fixer of Problems) commented:
The binary forms are required. Things like numbers for arithmetic, sizes of arrays, and offsets into memory areas are used in binary form. The ASCII form is useless for that and is only used to make the data readable by humans. Since the binary form is required, it is much more efficient to create it that way to begin with. If you store the data in ASCII form, you are adding a translation step before it can be used by the system.
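A rough sketch of that translation step, reusing the hypothetical value.bin / value.txt files from the earlier example: the binary file can be read straight into the variable, while the ASCII file has to be parsed back into binary before the number is usable.

#include <cstdint>
#include <fstream>
#include <string>

int main() {
    std::uint32_t from_binary = 0;
    std::ifstream bin("value.bin", std::ios::binary);
    bin.read(reinterpret_cast<char*>(&from_binary), sizeof from_binary);   // ready to use as-is

    std::uint32_t from_text = 0;
    std::ifstream txt("value.txt");
    std::string digits;
    txt >> digits;                                                // read the characters...
    from_text = static_cast<std::uint32_t>(std::stoul(digits));  // ...then convert them back to binary

    return 0;
}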
perlperl (Author) commented:
Now I get it.
In my case it was mainly for size reduction. The file was mostly storing uint32 values, and the file had a limit of 4 KB. So by saving binary instead of ASCII we can fit more entries in the file (4096 / 4 = 1024 binary entries, versus roughly 372 ASCII entries of up to 11 bytes each, counting a separator).

Thanks a lot.
evilrix (Senior Software Engineer, Avast) commented:
Just adding to the information already provided by the other experts...

It's worth noting that binary data isn't portable, whereas text data (generally) is. You have to consider endianness. If you write binary data on a big-endian platform and then read it back on a little-endian platform (or vice versa), you won't get the original data. This is why, when you send data over a network, you have to convert it to network byte order.
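For illustration, a sketch of one common way to keep a binary uint32_t portable: store it in network (big-endian) byte order with htonl and convert back with ntohl on read. This assumes a POSIX-style <arpa/inet.h>; the function names are the standard ones, but the helper functions below are made up for this example.

#include <arpa/inet.h>   // htonl / ntohl (POSIX)
#include <cstdint>
#include <fstream>

// Write the value in network (big-endian) byte order.
void write_portable(std::ofstream& out, std::uint32_t host_value) {
    std::uint32_t wire = htonl(host_value);
    out.write(reinterpret_cast<const char*>(&wire), sizeof wire);
}

// Read it back and convert to whatever the host's byte order is.
std::uint32_t read_portable(std::ifstream& in) {
    std::uint32_t wire = 0;
    in.read(reinterpret_cast<char*>(&wire), sizeof wire);
    return ntohl(wire);
}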

You'll always get the original data back when it is written as text, because it is serialised as a byte sequence of characters. Note that this is only true when you are writing "narrow" 8-bit ASCII (extended) text, since each character is just a single byte.

The same is *not* true when you are writing wide or multi-byte text. Each character is going to be multiple bytes, and so the endianness matters. This is why UTF (Unicode Transformation Format) uses Byte Order Marks (BOMs), to ensure the text can be reconstructed properly regardless of the endianness of the original platform.

Short answer: if you need the data to be platform-independent and human-readable, and it can be serialised as a byte stream and/or written with a BOM, then use text. If you need the data to be machine-readable, and/or size matters, and/or you know you'll be reading and writing on the same platform, then use binary. These are rules of thumb - YMMV :)

That all said, you are better off using a proper data serialisation library (such as Boost Serialization), as it will take care of all these "low level" problems and allow you to just get on with the "business logic" of your program.
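For reference, a minimal Boost.Serialization sketch; the Config struct and its fields are made up for illustration, and the text_oarchive/text_iarchive pair could be swapped for Boost's binary archives if size mattered more than readability.

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <cstdint>
#include <fstream>
#include <string>

struct Config {
    std::uint32_t version;
    std::string name;

    // Boost.Serialization calls this for both saving and loading.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*file_version*/) {
        ar & version;
        ar & name;
    }
};

int main() {
    {
        const Config c{1, "example"};          // saved as const, as Boost expects
        std::ofstream ofs("config.dat");
        boost::archive::text_oarchive oa(ofs);
        oa << c;                               // serialise
    }
    {
        Config loaded{};
        std::ifstream ifs("config.dat");
        boost::archive::text_iarchive ia(ifs);
        ia >> loaded;                          // deserialise
    }
    return 0;
}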
perlperl (Author) commented:
Thanks for the information.
I did see ntohl and htonl in my application when storing/reading data to/from the file.