rakhras
asked on
Determine if a file is ascii or binary
Hi,
I am writing a VC++ 5 application. I need to determine if a file is a binary file or an ASCII file. Any ideas?
I tried opening the file and using the function isascii on its contents, but this function succeeds for binary files as well.
Thanks, Ralph
Hey rakhras,
In case you forgot, I just wanted to remind you that there is a locked question, 'Debugging Release version', which you haven't evaluated yet :-)
Wilfred
nietod:
Don't forget that there are many languages besides English. And, I am not sure if the ASCII files rakhras mentions are really ASCII text files because there are double-byte text files and Unicode text files.
Hey, I'm just an ignorant American; my country and my language are the center of the universe.
But anyway I'll revise my comment to make it more politically correct, even if I'm not.
I still think looking for a binary zero is insufficient. Finding one pretty much guarantees that the file is binary, but not finding one does not necessarily mean the file is ASCII.
There are a bunch of procedures in the C++ standard library for testing characters. These procedures work with Unicode and multi-byte characters. (I'm not familiar with any of them, because I'm an ignorant American; see the above disclaimer.) Anyway, these procedures can be used to test each byte, or multi-byte sequence, in the file to determine if it is a character. If anything that is not a character is found, the file is binary. If everything is a character (or CR or LF), the file is probably ASCII.
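For what it's worth, the byte-by-byte test could be sketched roughly like this. It is only a sketch under some assumptions: it uses C++11 features (newer than VC++ 5), the helper names looksLikeAsciiText and fileLooksLikeAsciiText are made up for illustration, and it only accepts 7-bit ASCII, so double-byte or Unicode text files would be reported as binary. It also treats an empty file as text and an unreadable file as "not text", which may or may not be what you want.

```cpp
#include <cctype>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns true if every byte in the buffer looks like
// plain 7-bit ASCII text (printable characters plus whitespace such as
// tab, CR and LF). An empty buffer counts as text.
bool looksLikeAsciiText(const std::string& data) {
    for (unsigned char c : data) {
        if (c == 0) return false;    // a binary zero pretty much means binary
        if (c > 127) return false;   // outside the 7-bit ASCII range
        // Reject control characters that are not ordinary whitespace.
        if (!std::isprint(c) && !std::isspace(c)) return false;
    }
    return true;
}

// Hypothetical helper: reads the whole file in binary mode (so no CR/LF
// translation happens) and applies the test above.
// Returns false if the file cannot be opened.
bool fileLooksLikeAsciiText(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    return looksLikeAsciiText(data);
}
```

For large files you would probably only want to test the first few kilobytes rather than read everything into memory.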
ASKER
Thanks, guys, for your responses.
I've implemented a mix of your suggestions and it works fine.
So what is the protocol here for grading the answer? I've used suggestions from both chensu and nietod. If we should split the points, how do I do that? Thanks,
Ralph.
Unfortunately, you can't split points. They are considering it.
But your thanks is enough for me. (Although you could always send money <g>.)
An additional test might be to look to see if the last character is a CR (or possibly a CR/LF). Many ASCII files will end with a CR (or CR/LF). This depends on the program that produces the files, though.
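A minimal sketch of that last-character check, assuming the file contents have already been read into a std::string in binary mode. The name endsWithLineTerminator is made up, and std::string::back() needs C++11. Note that for a CR/LF ending the final byte is the LF, so the sketch accepts either CR or LF.

```cpp
#include <string>

// Hypothetical helper: returns true if the buffer ends with CR or LF.
// This is only a hint that the data is text; plenty of text files
// do not end with a line terminator.
bool endsWithLineTerminator(const std::string& data) {
    if (data.empty()) return false;   // nothing to look at
    const char last = data.back();
    return last == '\r' || last == '\n';
}
```

Since it depends on the program that wrote the file, this is better used to raise or lower confidence than to decide on its own.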