C++
--
Questions
--
Followers
Top Experts
I want to convert a string like "ûüâäç" etc to UTF8 encoding...
How can I do this with an easy C(++) function?
Do you want to convert ASCII to UTF8?
Or do you want to convert UNICODE to UTF8?
mbstowcs is more portable than MultiByteToWideChar, and should work with any standards-compliant C/C++ compiler.
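For reference, here is a minimal sketch of the mbstowcs route. It assumes the active C locale (set with setlocale) matches the encoding of the input, and which locale names exist is platform-dependent. The helper name widen is made up for illustration. Note that this only gets you as far as wchar_t; a separate wide-to-UTF-8 step is still needed afterwards.

```cpp
#include <clocale>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: convert a multibyte string to a wide string using
// the current C locale. Returns an empty string on an invalid sequence.
std::wstring widen(const char* src)
{
    // Assumption: the decoded text never has more wide characters than the
    // input has bytes, which holds for the usual encodings.
    std::vector<wchar_t> buf(std::strlen(src) + 1);
    std::size_t n = std::mbstowcs(buf.data(), src, buf.size());
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();   // invalid multibyte sequence for this locale
    return std::wstring(buf.data(), n);
}
```

Call std::setlocale(LC_CTYPE, "") once at startup to pick up the user's locale before using it on non-ASCII input; in the default "C" locale only plain ASCII is guaranteed to convert.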
http://www.axter.com/faq/topic.asp?TOPIC_ID=63&FORUM_ID=4&CAT_ID=9






Use MultiByteToWideChar to convert ASCII to UNICODE, and then use WideCharToMultiByte to convert from UNICODE to UTF8.
so it means convert the 'ü' character (char -4) to utf8 (-62 -81 if I remember correctly)?
[btw, is it normal that I saw emails coming in with replies from you, but didn't see the posts themselves?]
That works ... But :S it's so slow (I mean there SHOULD be something like a 3-line function or so)

You have to click on the link to Experts-Exchange, to see the reply.
>>so it means convert the 'ü' character (char -4) to utf8 (-62 -81 if I remember correctly)?
Did you try the functions I posted?
FYI:
'ü' is not an ASCII character.
Where are you getting this character from? How is it introduced into your code?
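To make the ASCII point concrete: ASCII only covers byte values 0 to 127. In Latin-1 or Windows-1252, 'ü' is the single byte 0xFC (which prints as -4 when char is signed), and its UTF-8 encoding is the two bytes 0xC3 0xBC (-61 and -68 as signed chars). A small sketch for checking whether a line contains any non-ASCII bytes at all; the function name is_ascii is just an illustration:

```cpp
#include <string>

// True if every byte is in the 7-bit ASCII range (0x00-0x7F).
// Pure-ASCII text is already valid UTF-8 and needs no conversion.
bool is_ascii(const std::string& s)
{
    for (unsigned char c : s)   // unsigned, so bytes like 0xFC compare as 252, not -4
        if (c > 0x7F)
            return false;
    return true;
}
```

Fields for which is_ascii returns true can be passed to a UTF-8 consumer unchanged.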
What do you mean it's slow?
How do you know it's slow?
Did you do a benchmark test?
Can you post your code?
The characters come to me via an ascii file...
I read it line by line, parse it, and then convert, for example, the names of the people in it to UTF8 encoding (actually all the non-numeric fields are being converted).
And then I need to submit it to SQLite, which is compiled in UTF8 mode.






wchar_t * lijn2 = new wchar_t[strlen(lijn)+1]
MultiByteToWideChar(CP_ACP
delete [] lijn;
lijn = new char[wcslen(lijn2)*3+1] // ugly yes :p
WideCharToMultiByte(CP_UTF
--> was something like that ... already ditched it
(currently going via wxWindows methods)
wxString test( lijn, wxConvLibc );
test.mb_str( wxConvUTF8 );
works OK for me ... But this is also way too slow :(
Again, how do you know it's slow?
Did you run any type of valid test to see if it is slow?
If so, please explain.
This method should not noticeably slow down your code, since the real bottleneck will be in reading the file.
Do a test with the function calls, and compare it to running your code without the function calls. I would be very surprised if you could measure a significant difference.
Please post your method for testing speed.
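One way to run that comparison is sketched below, assuming a C++11 compiler. The helper name elapsed_ms is made up for illustration; it times whatever callable you hand it with a monotonic clock.

```cpp
#include <chrono>

// Hypothetical helper: run `work` once and return the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename F>
double elapsed_ms(F work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Time the read loop on the same file twice, once with the conversion call and once without it, and repeat each run a few times so disk caching does not skew the first measurement.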

wxMessageBox( wxString::Format( "Time elapsed: %ldms", sw.Time() ) );
This stopwatch starts before the file being read in, and stops after the file is read in...
It takes about 5.6s to read in the file via wxString; via the other calls it takes 7.2s ...
Not a huge difference, but I think the real bottleneck is when assigning the memory for the second string ...
wchar_t * lijn2 = new wchar_t[MAX_BUFFER_LENGTH]
MultiByteToWideChar(CP_ACP
WideCharToMultiByte(CP_UTF
==> 1200ms <-> 1300ms
*lijn2++ = (char)(192 + (((unsigned char)lijn[current_number])
*lijn2++ = (char)(128 + (((unsigned char)lijn[current_number])
==> 1046ms <-> 1000ms
wxString test( abuffer, wxConvLibc );
strcpy(abuffer, test.mb_str( wxConvUTF8 ) );
==> 1360ms <-> 2703ms
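The 192 + ... and 128 + ... lines above are cut off, so the following is not the poster's code. It is a sketch of the usual hand-rolled conversion, under the assumption that the input is ISO-8859-1 (Latin-1), where every byte value equals its Unicode code point. The function name latin1_to_utf8 is an illustration only.

```cpp
#include <string>

// Sketch: convert ISO-8859-1 (Latin-1) bytes to UTF-8.
// Assumption: the input really is Latin-1. Windows-1252 differs in the
// 0x80-0x9F range and would need a lookup table for those bytes.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);   // worst case: every byte becomes two
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));                  // ASCII: copied as-is
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // lead byte: 110000xx
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // continuation: 10xxxxxx
        }
    }
    return out;
}
```

Reserving the worst-case size up front means the output buffer is allocated once per line, which addresses the allocation cost mentioned earlier in the thread. With this, the byte 0xFC ('ü') becomes 0xC3 0xBC.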





