• Status: Solved

Find out if a text file is UNICODE or ASCII

I have to scan a text file, but I don't know in advance whether the file is UNICODE or ASCII.
Is there a way to find out (from code)?
I mean, maybe UNICODE files have a header or something similar...
If Notepad for Win NT can detect it, there must be a way...
Thanks
Asked:
gpbaldazzi
 
jkr Commented:
Load the file into memory and use the Win32 API 'IsTextUnicode()' (from the docs):

DWORD IsTextUnicode(
    CONST LPVOID lpBuffer,  // pointer to an input buffer to be examined
    int cb,                 // the size in bytes of the input buffer
    LPINT lpi               // pointer to flags that condition text examination and receive results
);
 
The IsTextUnicode function determines whether a buffer probably contains a form of Unicode text. The function uses various statistical and deterministic methods to make its determination, under the control of flags passed via lpi. When the function returns, the results of such tests are reported via lpi. If all specified tests are passed, the function returns TRUE; otherwise, it returns FALSE.

If you don't want to load the whole file, pass a reasonable number of bytes from the start of it; the count must be divisible by 2.
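
As a rough illustration of the call described above, the sketch below hands the first bytes of a file (already read into memory) to IsTextUnicode. The wrapper name buffer_looks_unicode and its 1/0/-1 return convention are made up for this example. It passes NULL for the flags pointer, which the documentation describes as running all available tests. The function lives in Advapi32, so link against that library; the #else branch is only a stub so the sketch still compiles on other platforms.

```c
#include <stddef.h>
#include <limits.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper: returns 1 if the buffer is probably Unicode,
   0 if it is probably not, and -1 if IsTextUnicode is not available
   on the platform this was compiled for. */
int buffer_looks_unicode(const void *buf, size_t len)
{
    int cb;

    /* Clamp to an even byte count that fits in an int (INT_MAX is odd). */
    if (len > (size_t)(INT_MAX - 1))
        len = (size_t)(INT_MAX - 1);
    cb = (int)(len & ~(size_t)1);
    if (cb == 0)
        return 0;               /* nothing to examine */

#ifdef _WIN32
    /* NULL for the third argument: let the API apply all of its tests. */
    return IsTextUnicode(buf, cb, NULL) ? 1 : 0;
#else
    (void)buf;                  /* stub: API does not exist here */
    return -1;
#endif
}
```

A caller would typically fread() a few kilobytes from the start of the file and pass that buffer to the wrapper. The answer is a statistical guess, not a guarantee.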
 
jkr Commented:
BTW, just as an addition: UNICODE files don't have special headers, they're just 2 bytes per character...
 
gpbaldazzi (Author) Commented:
The IsTextUnicode API seems to be what I need.
I saved a text file as Unicode (with Notepad for NT) and the first two bytes of the file are FF and FE: maybe all Unicode files have this sort of header? Or is it just a Notepad feature? If you know something about this, please tell me!
Anyway, thanks for your answer.
bye
GP
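
The FF FE pair mentioned above is the UTF-16 little-endian byte order mark (BOM); big-endian files use FE FF instead. The mark is optional, so its absence proves nothing, but its presence is a strong hint. A minimal, portable sketch of such a check follows. The enum and function names are invented for illustration, and it does not try to tell UTF-16 apart from UTF-32 little-endian, which also begins with FF FE.

```c
#include <stddef.h>

/* Possible results of the signature check (names are illustrative). */
enum bom_kind { BOM_NONE = 0, BOM_UTF16_LE, BOM_UTF16_BE };

/* Inspect the first two bytes of a buffer for a UTF-16 byte order mark.
   FF FE is what the Notepad-saved file in the question starts with. */
enum bom_kind detect_utf16_bom(const unsigned char *buf, size_t len)
{
    if (len >= 2) {
        if (buf[0] == 0xFF && buf[1] == 0xFE)
            return BOM_UTF16_LE;    /* little-endian signature */
        if (buf[0] == 0xFE && buf[1] == 0xFF)
            return BOM_UTF16_BE;    /* big-endian signature */
    }
    return BOM_NONE;                /* too short, or no signature */
}
```

Because a file without the mark can still be Unicode, BOM_NONE should be read as "unknown" rather than "ASCII".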
 
gpbaldazzi (Author) Commented:
Bad news: IsTextUnicode works only under WinNT or Win 2000 (does this really exist?); it doesn't work on Win 95/98...
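
Where IsTextUnicode is not available, one crude stand-in is to look at the byte pattern directly: in little-endian UTF-16 text made up mostly of ASCII-range characters, nearly every second byte is zero. The sketch below is only that heuristic, not what IsTextUnicode does internally. The helper name and the 90% threshold are assumptions chosen for illustration, and text in non-Latin scripts would not be recognised by it.

```c
#include <stddef.h>

/* Hypothetical helper: guess whether a buffer holds little-endian UTF-16
   text consisting mostly of ASCII-range characters. It counts 2-byte units
   whose low byte is non-zero and whose high byte is zero. */
int looks_like_utf16le(const unsigned char *buf, size_t len)
{
    size_t pairs = len / 2;     /* ignore a trailing odd byte */
    size_t hits = 0;
    size_t i;

    if (pairs == 0)
        return 0;

    for (i = 0; i < pairs; i++) {
        if (buf[2 * i] != 0x00 && buf[2 * i + 1] == 0x00)
            hits++;
    }

    /* Assumed threshold: at least 90% of the units match the pattern. */
    return hits * 10 >= pairs * 9;
}
```

Plain ASCII text almost never contains zero bytes, so it fails this test, while a file saved as "Unicode" from Notepad passes it once enough characters have been sampled.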
 
jkr Commented:
Sorry, I assumed you were talking about NT. Using UNICODE on Win9x doesn't make much sense either, as most of the APIs aren't supported...