Multi-Byte (MBCS) vs. Wide Char
Posted on 2001-06-18
At the moment, when I suspect (tested or not) that the input text is Unicode (wide char), I convert it using the routine:
This way I seem to end up with a 1-byte-per-character string (for Western-language text, e.g. Dutch, English, ...).
- What if the input is e.g. Chinese or Hebrew? What will I end up with? Or will the function simply fail?
- Suppose the function succeeds. Do I end up with a string that can contain both 1-byte and 2-byte characters?
- Suppose I end up with a string that contains 1-byte or 2-byte characters (I don't care which, as long as it is valid, widely system-supported data). Can this multi-byte string contain embedded NULL (zero) bytes?
- In other words, can I use standard string functions such as strlen(), which rely on the NULL terminator? And for a multi-byte string, is ONE NULL char enough as a terminator, or should there be two NULL chars, as in wide char?
- Finally, for those who are familiar with the VCL and Borland C++'s AnsiString implementation: I think I understood from the documentation that the VCL's AnsiString stores strings as MBCS. Is this correct?
A few questions in one, but they are all so strongly related that I felt I had to post them as one question. Please try to 'touch' on all sub-questions. But maybe I'm completely 'missing the ball' and the sub-questions are irrelevant?
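
To make the strlen() and terminator sub-questions concrete, here is a minimal sketch in plain C. It does not call any conversion routine. Instead it uses hard-coded bytes, assuming a GBK-style double-byte code page with lead bytes in the range 0x81..0xFE. The names is_lead_byte, mbcs_char_count, mixed_mbcs and wide_bytes are made up for this illustration. On Windows, IsDBCSLeadByteEx() would be the usual way to test for a lead byte in a given code page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lead-byte test for a GBK-style double-byte code page.
   ASSUMPTION: lead bytes are 0x81..0xFE; other code pages use other ranges. */
int is_lead_byte(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

/* Count characters (not bytes) in a NUL-terminated double-byte string. */
size_t mbcs_char_count(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    while (*s != '\0') {
        /* A lead byte followed by a non-zero trail byte is one 2-byte character. */
        if (is_lead_byte((unsigned char)*s) && s[1] != '\0')
            s += 2;
        else
            s += 1;
        ++count;
    }
    return count;
}

/* "A", then the Chinese character U+4E2D as GBK bytes D6 D0, then "B".
   The compiler appends a single zero byte as terminator. */
const char mixed_mbcs[] = "A\xD6\xD0" "B";

/* "AB" as UTF-16LE bytes: every ASCII character carries a zero high byte,
   and the terminator is two zero bytes. */
const unsigned char wide_bytes[] = { 0x41, 0x00, 0x42, 0x00, 0x00, 0x00 };
```

With these definitions, strlen(mixed_mbcs) is 4 (bytes) while mbcs_char_count(mixed_mbcs) is 3 (characters), so strlen() works on such a string but reports bytes, and one zero byte is enough to terminate it. By contrast, strlen((const char *)wide_bytes) is 1, because the wide-character data contains zero bytes in the middle of the text.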