  • Status: Solved

How do I declare a literal Unicode string on the Mac?

How do I declare a literal Unicode string? For example, on Windows I would simply use:

WCHAR text[]      = L"This is a unicode string";

For the Mac, I would like to do something like

UniChar text[] = L"This is a unicode string";

Now I know about CFSTR and CFStringRef, and that is not what I am looking for.  Is there a way to declare non-encapsulated literal unicode strings?
Asked by JohnGaby
2 Solutions
 
evilrix, Senior Software Engineer (Avast), commented:
The C++ way (and therefore the cross-platform way) to declare a wide character string is:

wchar_t text[] = L"This is a unicode string";


Please note: don't confuse Unicode (a character encoding standard) with the ability to store Unicode. C++ knows nothing of Unicode; it just knows about narrow and wide chars (it is the OS that provides the ability to encode and decode Unicode). Declaring a wide string doesn't mean the contents are Unicode; it just means you have a storage vessel capable of holding Unicode in the appropriate Unicode Transformation Format.

You can store Unicode in 8, 16 and 32 bit storage types. Put another way, you can store Unicode in a simple char type. There are three standard ways of encoding Unicode: UTF8, UTF16 and UTF32. UTF8 is a variable-width encoding built from 8 bit units, UTF16 is a variable-width encoding built from 16 bit units, and UTF32 is a fixed-width 32 bit encoding.

The one you choose depends upon the size of your storage vessel. For example, on Linux wchar_t is a 32 bit type whereas on Windows it is a 16 bit type. Therefore on Linux the native wide encoding is normally UTF32, whereas on Windows it is UTF16. It is important to know this if you are using Unicode; otherwise things may become a little confusing.
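To make the difference between the three transformation formats concrete, here is a minimal sketch that counts how many code units one character outside the Basic Multilingual Plane occupies in each. It assumes a C++11 or later compiler (for the u8, u and U literal prefixes); the helper names code_units, wchar_bits and clef_in_* are made up for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Counts the code units before the terminating zero of a character string.
template <typename CharT>
std::size_t code_units(const CharT* s) {
    assert(s != nullptr);
    return std::char_traits<CharT>::length(s);
}

// Width of wchar_t in bits on the current platform (commonly 16 or 32).
inline std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// U+1D11E (musical symbol G clef) lies outside the Basic Multilingual Plane,
// so each encoding form needs a different number of code units for it.
inline std::size_t clef_in_utf8()  { return code_units(u8"\U0001D11E"); }  // 8-bit units
inline std::size_t clef_in_utf16() { return code_units(u"\U0001D11E"); }   // 16-bit units
inline std::size_t clef_in_utf32() { return code_units(U"\U0001D11E"); }   // 32-bit units
```

The three clef_in_* functions return 4, 2 and 1 respectively: four bytes in UTF8, a surrogate pair in UTF16, and a single unit in UTF32. The value of wchar_bits() depends on the platform, which is the point made above about wchar_t.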

There are some good pages on Wikipedia that explain all about Unicode, UTF8/16/32.

JohnGaby (Author) commented:
I am quite aware of the different flavors of unicode, and know that which one is used is OS dependent. That is why I am asking questions about the Mac. On the Mac, the basic string type is CFString, correct? Now I can create one from a UTF8 string or a UTF16 string (UniChar), but I don't see any way to create one from a UTF32 string.

I am a Windows programmer, so I am used to having my whole program use UTF16 strings everywhere. I am also used to being able to index these strings, which I cannot do with CFString(?). What I was wondering is whether there is a way to declare literal UTF16 (UniChar) strings for the Mac, or am I forced to use CFString everywhere?

OS X is Unix, so of course the type wchar_t is 32 bit, and therefore does not seem to be very useful in this context.

evilrix, Senior Software Engineer (Avast), commented:
>> I am quite aware of the different flavors of unicode
Good, just wanted to be sure we're all on the same page -- it seems we are :)

>> OS X is Unix, so of course the type wchar_t is 32 bit, and therefore does not seem to be very useful in this context.
We handle all strings internally as UTF8 as it's the simplest way to provide cross-platform portability. We convert, if necessary, at the interface boundary.
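
As an illustration of converting at the interface boundary, here is a minimal sketch of a UTF8 to UTF16 conversion. It is not the code the poster's team uses; the function name utf8_to_utf16 is hypothetical, and the sketch assumes a C++11 or later compiler. It rejects truncated sequences, bad continuation bytes and out-of-range code points, but does not check for overlong encodings, which production code should also reject.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes and re-encodes them as UTF-16.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // code point being assembled
        std::size_t extra = 0;  // continuation bytes that follow the lead byte
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;
        assert(i <= in.size());  // invariant: never advance past the input

        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With this approach the program keeps std::string everywhere internally, and only the thin layer that talks to a UTF16 API calls the converter.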

Sorry, I don't know Macs so I can't advise specifically.

JohnGaby (Author) commented:
Well, for anyone else looking for an answer to this question, it looks like the short answer is "You can't". On Mac OS X, the way to handle Unicode strings is through the CFString functions.
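
For readers with a C++11 or later compiler, one option the thread does not cover is the u"..." literal prefix, which produces an indexable array of char16_t holding UTF16 code units. The sketch below is an illustration under stated assumptions, not something confirmed by the posters: UniCharLike is a stand-in that assumes Apple's UniChar is an unsigned 16 bit integer, and to_unichar_buffer is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for Apple's UniChar, ASSUMED here to be an unsigned 16-bit integer.
// On a Mac you would use the real UniChar from the system headers instead.
using UniCharLike = std::uint16_t;

static_assert(sizeof(char16_t) == sizeof(UniCharLike),
              "this sketch assumes char16_t is exactly 16 bits wide");

// A literal, indexable array of UTF-16 code units (requires C++11 or later).
const char16_t kText[] = u"This is a unicode string";

// Number of code units, excluding the terminating zero.
const std::size_t kTextLength = sizeof(kText) / sizeof(kText[0]) - 1;

// char16_t is a distinct type in C++, so the code units are copied one by one
// into a buffer of the 16-bit integer type rather than cast through a pointer.
std::vector<UniCharLike> to_unichar_buffer(const char16_t* s, std::size_t n) {
    assert(s != nullptr || n == 0);
    std::vector<UniCharLike> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<UniCharLike>(s[i]);
    }
    return out;
}
```

kText can be indexed directly, for example kText[0] is the code unit for 'T'. If a CFString is needed later, a buffer like the one returned by to_unichar_buffer is the kind of input CFStringCreateWithCharacters expects, assuming UniChar really is a 16 bit unsigned type on the target system.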