How to read Unicode files in a Visual C++ multibyte application
Posted on 2010-11-12
An MFC application built with the Multibyte Character Set (MBCS) cannot read Chinese (PRC) Unicode files created by C# .NET.
It can read the legacy files, which are MBCS and have no BOM. The Unicode file begins with the BOM FF FE. "Male" is stored as 37 75 in the Unicode file but loads as 0xe7 'ç' 0x94 '”' 0xb7 '·'.
The multibyte file stores "Male" as 0xc4 'Ä' 0xd0 'Ð'.
What's the best way to read the Unicode files if the application is MFC Visual C++ using MBCS?
1. Convert the Unicode string to MBCS when writing the file in C#?
2. Modify the C++ app to correctly read the Unicode files?
3. Create another C++ app in Unicode to read and convert these files?
In Visual C++ I have tried _ismbblead, setlocale, CFile, fopen, and _open, and in C# I have tried FileStream. No matter what I try, I never get the hex bytes as they are stored in the file; I always get them re-encoded. If the file format doesn't match the app format, I'm stuck. This is my current code in the multibyte C++ app:
CString pathName = fileDlg.GetPathName();
//char *pLocale = setlocale(LC_CTYPE, "zh-hk"); //has no effect on encoding
//_setmbcp(_MB_CP_LOCALE); //has no effect on encoding
FILE *fh = fopen(pathName, "rb"); //binary mode: no newline translation
const int MAX_COUNT = 100;
char buffer[MAX_COUNT];
memset(buffer, 0, MAX_COUNT);
if (fh != NULL)
{
    fgets(buffer, MAX_COUNT, fh); //Male
    fclose(fh);
}
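Before converting anything, it may help to confirm which encoding is actually on disk. The bytes E7 94 B7 are the UTF-8 encoding of U+7537, so the file may not be the UTF-16 file it appears to be. The following is a minimal sketch, not a definitive implementation: the helper names readAllBytes, detectBom, and hexDump are hypothetical, and it assumes the file is small enough to hold in memory. UTF-32 BOMs are not checked.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Read the whole file as raw bytes. "rb" keeps the C runtime from
// translating newlines; no character-set conversion happens here.
std::vector<unsigned char> readAllBytes(const char* path)
{
    std::vector<unsigned char> bytes;
    FILE* fh = std::fopen(path, "rb");
    if (!fh) return bytes;                       // empty result signals failure
    unsigned char chunk[4096];
    std::size_t n;
    while ((n = std::fread(chunk, 1, sizeof chunk, fh)) > 0)
        bytes.insert(bytes.end(), chunk, chunk + n);
    std::fclose(fh);
    return bytes;
}

// Name the encoding implied by a leading byte-order mark, if there is one.
std::string detectBom(const std::vector<unsigned char>& b)
{
    if (b.size() >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) return "UTF-8";
    if (b.size() >= 2 && b[0] == 0xFF && b[1] == 0xFE) return "UTF-16LE";
    if (b.size() >= 2 && b[0] == 0xFE && b[1] == 0xFF) return "UTF-16BE";
    return "none";                               // e.g. the legacy MBCS files
}

// Format up to maxBytes bytes as upper-case hex, each followed by a space.
std::string hexDump(const std::vector<unsigned char>& b, std::size_t maxBytes)
{
    std::string out;
    char tmp[4];
    for (std::size_t i = 0; i < b.size() && i < maxBytes; ++i) {
        std::snprintf(tmp, sizeof tmp, "%02X ", static_cast<unsigned>(b[i]));
        out += tmp;
    }
    return out;
}
```

Calling hexDump(readAllBytes(pathName), 16) and comparing the result with a hex editor's view of the same file should show whether the bytes really arrive unchanged.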
And this is the code in the C# .NET test app, which reads Unicode but not MBCS:
using (StreamReader sr = new StreamReader(vpdName))
{
    int lineIndex = 0;
    while (sr.Peek() >= 0)
    {
        string str = sr.ReadLine();
    }
}
This is tough! I've worked on it for 3 days and spent many hours searching this forum and others for help. My goal is to read the Unicode file and convert the Chinese strings so that they display properly in a multibyte app. I think this means I need to convert Unicode 0x7537 to MBCS 0xC4 0xD0. Can this be done? But first I need to get that Unicode string! The multibyte app always reads and re-encodes the Unicode file, so the strings are garbage: they don't display properly and cannot be converted.
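The conversion described above (UTF-16 0x7537 to MBCS C4 D0) could be sketched roughly as follows. This is an illustration under assumptions, not a tested fix for the app: decodeUtf16Le and toCodePage are hypothetical helper names, the input is assumed to be the raw file bytes in UTF-16LE, and code page 936 (Simplified Chinese GBK) is assumed to be the MBCS code page of the legacy files. The Win32 part uses WideCharToMultiByte and is compiled only on Windows.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Decode raw UTF-16LE bytes into 16-bit code units, skipping a leading
// FF FE byte-order mark if present. A trailing odd byte is ignored.
std::vector<unsigned short> decodeUtf16Le(const std::vector<unsigned char>& bytes)
{
    std::size_t i = 0;
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
        i = 2;                                   // skip the BOM
    std::vector<unsigned short> units;
    for (; i + 1 < bytes.size(); i += 2)         // low byte first, then high byte
        units.push_back(static_cast<unsigned short>(bytes[i] | (bytes[i + 1] << 8)));
    return units;
}

#ifdef _WIN32
// Convert UTF-16 code units to a multibyte string in the given code page
// (936 is an assumption here). Returns an empty string on failure.
std::string toCodePage(const std::vector<unsigned short>& units, unsigned codePage)
{
    if (units.empty()) return std::string();
    std::wstring wide(units.begin(), units.end());   // wchar_t is 16-bit on Windows
    int wideLen = static_cast<int>(wide.size());
    // First call asks for the required buffer size in bytes.
    int len = WideCharToMultiByte(codePage, 0, wide.c_str(), wideLen,
                                  NULL, 0, NULL, NULL);
    if (len <= 0) return std::string();
    std::string out(static_cast<std::size_t>(len), '\0');
    // Second call performs the conversion into the buffer.
    WideCharToMultiByte(codePage, 0, wide.c_str(), wideLen,
                        &out[0], len, NULL, NULL);
    return out;
}
#endif
```

With the file bytes FF FE 37 75 as input, decodeUtf16Le yields the single code unit 0x7537; passing that to toCodePage with 936 is the step that would be expected to produce the C4 D0 bytes found in the legacy files. The resulting std::string can then be assigned to a CString in the MBCS build.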