Solved

How do I define Unicode support for just a small segment of code?

Posted on 2004-08-12
Last Modified: 2010-04-15
How do I define Unicode support just for a small segment of code?

I have an array of type char.  When the array is filled with single-byte characters and I later output its contents to the console, everything appears fine (human readable).

But what if the array is filled with Unicode (double-byte) characters? The output is garbage, which obviously makes sense.  So is there a way to output the correct Unicode characters to the screen?

What I'm trying to do is avoid changing the whole application to support Unicode just for that tiny segment.

Thanks
Question by:zarrona
 
LVL 7

Expert Comment

by:jimwasson
ID: 11803118
You have a couple of options. If you are using TCHAR for your strings, you can define Unicode inline:

some code
...
#define _UNICODE

code that requires Unicode characters

#undef _UNICODE

more code

You can also call the Unicode version of a function explicitly. For instance, many string functions come in a narrow and a wide version:
  sprintf() is the ASCII version while swprintf() is the wide (Unicode) version.
  strlen() is the ASCII version while wcslen() is the wide (Unicode) version.
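
As a minimal sketch of the second option, the narrow and wide functions can sit side by side in the same file, whatever _UNICODE is set to. The names Lengths and measure_both are made up for this illustration. The sketch assumes the standard form of swprintf(), which takes a buffer size in wchar_t units; some older Microsoft toolchains shipped a variant without that parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <cwchar>

// Hypothetical helper type: holds the lengths reported by each function family.
struct Lengths {
    std::size_t narrow;  // result of strlen()
    std::size_t wide;    // result of wcslen()
};

// Formats the same text into a char buffer and a wchar_t buffer,
// then measures each with the matching length function.
Lengths measure_both() {
    char narrow[32];
    wchar_t wide[32];

    // Narrow (char) version.
    std::snprintf(narrow, sizeof narrow, "%s", "Hello there");

    // Wide (wchar_t) version: the size is given in wchar_t units,
    // and %ls formats a wide string.
    std::swprintf(wide, sizeof wide / sizeof wide[0], L"%ls", L"Hello there");

    return { std::strlen(narrow), std::wcslen(wide) };
}
```

Both lengths count characters, not bytes, so they match even though the wide buffer uses more memory per character.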

 

Author Comment

by:zarrona
ID: 11815447
Hi,

Doesn't seem to work.....

All TCHARs defined between the _UNICODE definitions seem to be 1 byte rather than 2 bytes.

Is there something else I'm missing?

thanks
 
LVL 7

Accepted Solution

by:
jimwasson earned 250 total points
ID: 11815712
Hmmm. Looks like I led you astray. Sorry.

You can use WCHAR in place of TCHAR where you want Unicode characters and strings, e.g.,

TCHAR tch[] = _T("Hello there"); // tch is a char array in a non-Unicode build.
WCHAR tch1[] = L"Hello there";   // tch1 is a Unicode (wide char) array in any build.
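
A likely reason the inline #define had no effect: a typedef such as TCHAR is resolved once, at the point where its header is compiled, so defining the macro later in the file does not change it. The sketch below imitates that mechanism with made-up names (DEMO_UNICODE, DEMO_TCHAR) so it builds without any Windows headers. It is an illustration of the idea, not a copy of what <tchar.h> actually contains.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a header that picks a character type
// based on a macro. The choice is made here, once.
#ifdef DEMO_UNICODE
typedef wchar_t DEMO_TCHAR;
#else
typedef char DEMO_TCHAR;
#endif

// Defining the macro after the typedef exists changes nothing.
#define DEMO_UNICODE
DEMO_TCHAR still_narrow[] = "Hello there";  // still an array of char
#undef DEMO_UNICODE

// A wide type named explicitly is wide regardless of any macro.
wchar_t always_wide[] = L"Hello there";

// Size in bytes of one element of each array.
std::size_t narrow_unit_size() { return sizeof still_narrow[0]; }
std::size_t wide_unit_size() { return sizeof always_wide[0]; }
```

This is why naming the wide type directly (WCHAR or wchar_t) works for a small section of code while toggling the macro mid-file does not.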
 

Author Comment

by:zarrona
ID: 11816772
ok...

Here is the situation....
I have a file named    Ωωθε.rtf     (note that Greek symbols are used)
//Unicode translation 0x03A9 0x03C9 0x03B8 0x03B5 0x002E 0x0072 0x0074 0x0066

    char nameBuffer[256];
//I then read the file name into a char buffer
    int bytesReadFromFile = read (FilesetContainerFile, &nameBuffer, htonl(oneFile.fNameLength));
//bytesReadFromFile = 16   ....sounds good: 8 characters multiplied by 2 bytes each = 16
//oddly, while debugging, nameBuffer appears in memory as          .©.É.¸.µ...r.t.f  
//Unicode translation 0x00A9 0x00C9 0x00B8 0x00B5 0x002E 0x0072 0x0074 0x0066
//oddly, it seems that every other byte is skipped and translated

How do I now create a CString that translates the buffer into its Unicode equivalent?
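
The memory dump suggests the buffer holds 16-bit code units stored high byte first (big-endian), with the debugger showing each byte as a separate character. Under that assumption, one portable way to rebuild the characters is to combine each pair of bytes into one 16-bit unit. The function names decode_utf16_be and decode_sample are invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Combines pairs of bytes (high byte first) into 16-bit code units.
// byteCount should be the number of bytes actually read, for example
// the value returned by read(). A trailing odd byte is ignored.
std::u16string decode_utf16_be(const unsigned char* bytes, std::size_t byteCount) {
    std::u16string out;
    out.reserve(byteCount / 2);
    for (std::size_t i = 0; i + 1 < byteCount; i += 2) {
        char16_t unit = static_cast<char16_t>((bytes[i] << 8) | bytes[i + 1]);
        out.push_back(unit);
    }
    return out;
}

// Decodes the 16 bytes described in the post above.
std::u16string decode_sample() {
    const unsigned char raw[16] = {
        0x03, 0xA9, 0x03, 0xC9, 0x03, 0xB8, 0x03, 0xB5,
        0x00, 0x2E, 0x00, 0x72, 0x00, 0x74, 0x00, 0x66
    };
    return decode_utf16_be(raw, sizeof raw);
}
```

On Windows, where wchar_t is 16 bits wide, the same units could be stored in a wchar_t buffer and handed to a wide string class. Whether a given CString can hold them depends on how the project is built.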
           
thanks
 
LVL 7

Expert Comment

by:jimwasson
ID: 11820718
CString? What exactly are you using for CStrings?
 

Author Comment

by:zarrona
ID: 11822808
I need to pass a CString with the file name (in unicode) to an outside module.

I had set my project settings to "Use MFC in a Shared DLL"; that way I can use CStrings.
I have found a way to build a CString, but it's not quite correct.

CString myName;
myName.Empty();
for (unsigned int j=0; j<strlen(nameBuffer); j +2)
      myName += char(nameBuffer[j+1]);

Anyhow, I need to get _UNICODE defined in code for this area (covering all characters between the _UNICODE definitions), but I'm not able to get it working.  That way I could use TCHAR or WCHAR.
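
Two things in the loop above are worth checking on this kind of data, apart from the loop counter, which `j +2` never advances. First, strlen() stops at the first zero byte, and the buffer contains one as soon as a character below 0x0100 (such as the ".") appears. Second, copying only every second byte discards the high byte of each character. The sketch below reproduces both effects on the 16 bytes from the earlier post. The names kNameBytes, length_seen_by_strlen and low_bytes_only are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The 16 bytes from the earlier post, followed by a terminating zero.
static const char kNameBytes[17] = {
    '\x03', '\xA9', '\x03', '\xC9', '\x03', '\xB8', '\x03', '\xB5',
    '\x00', '\x2E', '\x00', '\x72', '\x00', '\x74', '\x00', '\x66',
    '\x00'
};

// strlen() counts bytes up to the first zero byte, which here is the
// high byte of the '.' character, so it reports less than 16.
std::size_t length_seen_by_strlen() {
    return std::strlen(kNameBytes);
}

// Keeps only the second byte of each pair, as the loop in the post does.
// byteCount is passed in explicitly instead of relying on strlen().
std::string low_bytes_only(const char* bytes, std::size_t byteCount) {
    std::string out;
    for (std::size_t j = 0; j + 1 < byteCount; j += 2) {
        out += bytes[j + 1];
    }
    return out;
}
```

With the real byte count the low bytes come out as A9 C9 B8 B5 followed by ".rtf", which is the "©É¸µ.rtf" pattern: the 0x03 high bytes that make the characters Greek are gone.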

thanks
 
LVL 7

Expert Comment

by:jimwasson
ID: 11823154
This looks complicated.

To set the CString from your buffer you can just do this:
CString myName = nameBuffer;

It seems that to mix Unicode and MBCS you have to handle the exceptions on a case-by-case basis.
 

Author Comment

by:zarrona
ID: 11826989
here is a code snippet....


#define _UNICODE

TCHAR unameBuffer[256];
bytesReadFromFile = read (FilesetContainerFile, &unameBuffer, htonl(oneFile.fNameLength));  //bytesReadFromFile = 16...good

CString myName, hold;
TCHAR holdName;
      
myName.Empty();          
for (unsigned int j=0; j<bytesReadFromFile/2; j ++)
{
      holdName = htons((TCHAR)(unameBuffer[j]));  //holdName= 0x03A9  (Omega character)...good so far
      CString hold(holdName);                                 //hold = "©|" = 0xA903
      myName += hold;                                          //myName = "©É¸µ.rtf" at end of for-loop
                                                      //NOT 0x03A9 (Omega) 0x03C9 (Omega) 0x03B8 (Theta) 0x03B5 (Epsilon).rtf
}

#undef _UNICODE

Close but not quite....it seems that only the first byte of a double-byte TCHAR is read when creating a CString.
What do you think?

thanks