Find if a text file is UNICODE or ASCII

I have to scan a text file, but I don't
know in advance if the file is UNICODE or ASCII.
Is there a way of knowing it (from
I mean, maybe UNICODE files have a
header or something similar...
If Notepad for Win NT can find it, there
must be a way...

Load the file into memory and use the Win32 API 'IsTextUnicode()' (from the docs):

DWORD IsTextUnicode(
  CONST LPVOID lpBuffer,
  // pointer to an input buffer to be examined
  int cb,
  // the size in bytes of the input buffer
  LPINT lpi
  // pointer to flags that condition text examination and receive results
);
The IsTextUnicode function determines whether a buffer probably contains a form of Unicode text. The function uses various statistical and deterministic methods to make its determination, under the control of flags passed via lpi. When the function returns, the results of such tests are reported via lpi. If all specified tests are passed, the function returns TRUE; otherwise, it returns FALSE.

If you don't want to load the whole file, pass a reasonably sized chunk from the start of it; the number of bytes must be divisible by 2.
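
As a rough sketch of that advice: the helper below reads the start of a file and rounds the byte count down to an even number, and the Windows-only part hands that prefix to IsTextUnicode. The names read_even_prefix and file_looks_unicode and the 4096-byte buffer are assumptions made for illustration, not part of the API. Passing NULL as the third argument asks the function to run all of its tests; pass a pointer to an int holding IS_TEXT_UNICODE_* flags to select specific ones.

```c
#include <stdio.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>   /* IsTextUnicode; link with Advapi32.lib */
#endif

/* Hypothetical helper: reads up to `cap` bytes from the start of `path`
   into `buf` and returns the byte count rounded down to an even number.
   Returns 0 if the file cannot be opened. */
size_t read_even_prefix(const char *path, unsigned char *buf, size_t cap)
{
    FILE *f = fopen(path, "rb");
    size_t n;

    if (f == NULL)
        return 0;
    n = fread(buf, 1, cap, f);
    fclose(f);
    return n & ~(size_t)1;   /* drop a trailing odd byte, if any */
}

#ifdef _WIN32
/* Hypothetical wrapper: nonzero if IsTextUnicode thinks the prefix of the
   file is probably Unicode. The 4096-byte sample size is an arbitrary choice. */
int file_looks_unicode(const char *path)
{
    unsigned char buf[4096];
    size_t n = read_even_prefix(path, buf, sizeof buf);

    if (n == 0)
        return 0;
    /* NULL for the flags pointer: let the function use all available tests. */
    return IsTextUnicode(buf, (int)n, NULL) != 0;
}
#endif
```

The result is only a guess about the data, as the documentation text above says ("probably contains"), so very short files in particular may be misjudged.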

BTW, just as an addition: UNICODE files don't have special headers, they're just 2 bytes per character...
gpbaldazzi (Author) commented:
The IsTextUnicode API seems to be what I need.
I saved a text file as Unicode (with Notepad for NT) and the first two bytes of the file are FF and FE. Maybe all Unicode files have this sort of header? Or is it just a Notepad feature? If you know something about this, please tell me!
Anyway, thanks for your answer.
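
For reference, the FF FE pair is the Unicode byte order mark (BOM) for little-endian UTF-16. It is optional, so a file without one may still be Unicode. A minimal sketch of checking for it in portable C follows; the enum and the function name detect_bom are hypothetical, and the sketch does not try to tell UTF-32 apart from UTF-16.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum bom_kind { BOM_NONE, BOM_UTF16_LE, BOM_UTF16_BE, BOM_UTF8 };

/* Looks at the first bytes of a buffer for a byte order mark.
   BOM_NONE means "no mark found", not "the file is ASCII". */
enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;   /* the FF FE pair seen in the Notepad file */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;
}
```

Since the mark may be missing, a check like this works best as a quick first test, with a content-based test as the fallback.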
gpbaldazzi (Author) commented:
Bad news: IsTextUnicode works only under WinNT or Win 2000 (does this really exist?); it doesn't work on Win 95/98...
Sorry, I assumed you were talking about NT. Using UNICODE on Win9x doesn't make much sense either, as most of the APIs aren't supported...
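
Where IsTextUnicode is not available, a crude portable check is possible. The sketch below is not the algorithm IsTextUnicode uses; it is a simple stand-in heuristic that assumes mostly Latin text, where little-endian UTF-16 stores a zero in every second byte while plain ASCII text contains no zero bytes at all. The function name looks_like_utf16le and the "more than half" threshold are arbitrary choices for illustration.

```c
#include <stddef.h>

/* Hypothetical heuristic: returns 1 if more than half of the 2-byte pairs
   in the buffer have a nonzero low byte followed by a zero high byte,
   which suggests little-endian UTF-16 holding mostly Latin text.
   Returns 0 otherwise, including for buffers shorter than 2 bytes. */
int looks_like_utf16le(const unsigned char *buf, size_t len)
{
    size_t pairs = len / 2;
    size_t hits = 0;
    size_t i;

    if (pairs == 0)
        return 0;
    for (i = 0; i < pairs; i++) {
        if (buf[2 * i] != 0 && buf[2 * i + 1] == 0)
            hits++;
    }
    return hits * 2 > pairs;
}
```

Text in scripts whose code points have nonzero high bytes would defeat this test, so it is only a rough guess for the Win 95/98 case.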