Unicode, UTFs, wchar_t and C programming
Posted on 2006-06-09
I have a few questions about Unicode and the UTFs that I wonder if any of you could answer.
I think I understand the differences between Unicode and its various encodings, but I haven't really grasped how everything is related.
For example, how do I make my program encode wide-character strings as UTF-16 instead of UTF-8, which it seems to use by default (gcc 3.4.6)? The same program running on Windows, compiled with Visual Studio .NET 2002, seems to encode its wide strings as UTF-16 or UCS-2, so how do I make it use UTF-8 or one of the other encodings?
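One way to check what a compiler actually stores in a wide string is to look at the width of wchar_t and at the individual code units. The following is only a minimal sketch: the helper names wchar_bits and dump_code_units are made up for illustration and are not part of any library. The width of wchar_t is implementation-defined; it is typically 4 bytes with gcc on Linux and 2 bytes with Visual Studio on Windows.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: width of wchar_t in bits on this platform. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: writes each wchar_t element of ws into out as
   space-separated hexadecimal (at least four digits per element).
   Returns the number of elements written; stops early if out is full. */
size_t dump_code_units(const wchar_t *ws, char *out, size_t out_size)
{
    size_t n = 0;     /* elements written so far */
    size_t used = 0;  /* characters used in out  */

    assert(ws != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    for (; ws[n] != L'\0'; n++) {
        int w = snprintf(out + used, out_size - used, "%s%04lX",
                         n ? " " : "", (unsigned long)ws[n]);
        if (w < 0 || (size_t)w >= out_size - used) {
            out[used] = '\0';  /* drop the partially written element */
            break;
        }
        used += (size_t)w;
    }
    return n;
}
```

With a char buf[64], calling dump_code_units(L"Az", buf, sizeof buf) fills buf with "0041 007A". A character outside the Basic Multilingual Plane would show up as one element where wchar_t is 32 bits wide and as two elements (a surrogate pair) where it is 16 bits wide.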
When do I need to use the functions declared in wchar.h in place of the normal string.h functions (e.g., wcslen() vs. strlen())? I've seen code examples where regular old printf() and strcpy() were used with wide strings. When do I need to use their wide-string equivalents?
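To make the distinction concrete, here is a small sketch (the function names are hypothetical and exist only for this illustration). strlen() counts char elements, which for UTF-8 text means bytes, while wcslen() counts wchar_t elements. A wchar_t array should go through the wcs* functions, because the str* functions treat their input as bytes and stop at the first zero byte.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Length of a narrow string in bytes, excluding the terminator. */
size_t narrow_length(const char *s)
{
    return strlen(s);
}

/* Length of a wide string in wchar_t elements, excluding the terminator. */
size_t wide_length(const wchar_t *ws)
{
    return wcslen(ws);
}

/* Copies src into dst using wcscpy(), the wide counterpart of strcpy().
   dst_len is the capacity of dst in wchar_t elements, not in bytes.
   Returns 0 on success, or -1 if src (plus terminator) does not fit. */
int copy_wide(wchar_t *dst, size_t dst_len, const wchar_t *src)
{
    if (wcslen(src) + 1 > dst_len) {
        return -1;
    }
    wcscpy(dst, src);
    return 0;
}
```

For the letter é (U+00E9), narrow_length("\xC3\xA9") returns 2 because its UTF-8 form takes two bytes, whereas wide_length(L"\xE9") returns 1 because the wide string holds a single element.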
I don't think I fully understand the relationship between locales and UTF-8, either. Can someone explain it to me?
I'm sorry for the laundry list of questions. I'm very new to Unicode, and I can tell that it's a fairly complex subject that every programmer should be comfortable with. I'd really appreciate any help.