Solved

How to convert a Japanese MBCS (multibyte character set) string to a Unicode string

Posted on 2002-04-24
Last Modified: 2007-12-19
I've tried to use wcstombs and the Windows-specific WideCharToMultiByte to translate a Unicode string (and mbstowcs and MultiByteToWideChar to do the reverse) with no luck.  Any suggestions on how to convert from one to the other?  I'm loading a custom Unicode resource file and must work with Windows 98 (MBCS only).

-Dan
Question by:dwinkler
12 Comments
 
LVL 86

Expert Comment

by:jkr
ID: 6965754
You should use the UTF-8 codepage, e.g.

wchar_t* pwszJapanese = ...;

WideCharToMultiByte ( CP_UTF8, 0, pwszJapanese, wcslen ( pwszJapanese ), <char_buffer>, <length_of_char_buffer>, NULL, NULL);
 

Author Comment

by:dwinkler
ID: 6965812
Looking for a cross platform way to do it...  Any suggestions?
 
LVL 86

Expert Comment

by:jkr
ID: 6965827
What do you mean by "cross platform"?
 

Author Comment

by:dwinkler
ID: 6965874
A function that would work under Unix, Windows, and Mac (i.e. cross-platform).
 
LVL 86

Expert Comment

by:jkr
ID: 6965900
You can specify the locale like

_wsetlocale( LC_ALL, L"Japanese.UTF8" );

before calling 'wcstombs()'.
 

Author Comment

by:dwinkler
ID: 6972577
Is there any way to get the current system locale so that it does not have to be changed for multiple versions of the software?  Why oh why can't the documentation be better than it is...
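One portable way to approach this question is sketched below. Standard C lets a program query the locale in effect by passing a null pointer to setlocale, and adopt the locale configured in the user's environment by passing an empty string, so no locale name has to be hard-coded per language version. The helper names CurrentCtypeLocale and AdoptEnvironmentLocale are made up for illustration, and the code assumes a C++11 compiler.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns the name of the locale currently used for
// character conversion, without changing it.
std::string CurrentCtypeLocale()
{
    // A null pointer queries the locale instead of setting it.
    const char* name = std::setlocale(LC_CTYPE, nullptr);
    return name ? name : "";
}

// Hypothetical helper: switches to the locale configured in the user's
// environment and returns its name, or an empty string if the runtime
// does not recognize that locale.
std::string AdoptEnvironmentLocale()
{
    const char* name = std::setlocale(LC_ALL, "");
    return name ? name : "";
}
```

A program always starts in the "C" locale, so AdoptEnvironmentLocale() would be called once at startup before any wcstombs or mbstowcs call.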
 

Author Comment

by:dwinkler
ID: 6978834
Ok I am having no luck on a Japanese machine using the following code:

size_t nSize = wcstombs( NULL, pszWide, 0 );

if( nSize != (size_t)-1 )
{
   m_pszShort = new char[nSize+1];

   m_pszShort[nSize] = '\0';

   size_t nCopied = wcstombs( m_pszShort, pszWide, nSize );
}

When I try to convert a legitimate Unicode character under the OS locale, it always returns -1.  I am absolutely positive that the single Unicode Japanese character in the pszWide string is valid (if I save the text file as HTML, it replaces the character with its decimal equivalent, which I verified is the value in pszWide).

Is the wcstombs function broken?  Or is there something I am missing?
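A likely cause of the -1 is that the program is still in the default "C" locale, where wcstombs cannot represent Japanese characters. The two-pass conversion above can be sketched with explicit error checks as follows. WideToMultiByte is a hypothetical helper name, and the length query with a null destination relies on POSIX and Microsoft CRT behavior rather than on ISO C alone.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts a wide string to the multibyte encoding of
// the current LC_CTYPE locale. Returns false if a character cannot be
// represented in that encoding.
bool WideToMultiByte(const wchar_t* wide, std::string& out)
{
    // First pass: a null destination only asks for the required byte
    // count, excluding the terminator.
    const std::size_t needed = std::wcstombs(nullptr, wide, 0);
    if (needed == static_cast<std::size_t>(-1))
        return false;  // unrepresentable character in the current locale

    // Second pass: convert into a buffer with room for the terminator.
    std::vector<char> buffer(needed + 1, '\0');
    const std::size_t written =
        std::wcstombs(buffer.data(), wide, buffer.size());
    if (written == static_cast<std::size_t>(-1))
        return false;

    out.assign(buffer.data(), written);
    return true;
}
```

For Japanese text the caller would first run setlocale( LC_ALL, "" ) (or name a Japanese locale explicitly) in the same module that performs the conversion.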

 

Author Comment

by:dwinkler
ID: 6981738
Figured it out (kind of).  Points go to the person who can answer the new questions.  You must call setlocale(...) in each exe and dll.  Does anyone know why?  Shouldn't the dll and exe be in the same address space with the same global variables?  So you can place the setlocale call either in a function of the dll that is called or in a callback from the exe:

i.e. if you want to call sprintf( mbcsbuf, "%ls", widebuf ); inside of the dll and get the correct result.

A - application
B - dll

A->setlocale( LC_ALL, "" ); (does not work)

A->B->setlocale( LC_ALL, "" ); (works)

A->B->A->setlocale( LC_ALL, "" ); (works)

How does it switch context between the dll's globals and the exe's?
 
LVL 86

Accepted Solution

by:
jkr earned 500 total points
ID: 6981805
>>Does anyone know why?

This applies only to DLLs that don't use the CRT as a DLL (IOW: are not using msvcrt.dll). Each such module links its own static copy of the CRT, so the CRT data structures (including locales) are different for each DLL. To avoid that, go to the project settings, "C++", "Code Generation", and select "Use Runtime Library: Multithreaded DLL" for the exe and each DLL.
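If switching every module to the DLL runtime is not an option, the workaround from the earlier comment (calling setlocale inside each module) could be packaged as a small exported function. This is only a sketch: InitModuleLocale and the MODULE_API macro are hypothetical names, and the application would need to call the function once in every statically linked DLL it loads.

```cpp
#include <clocale>

#if defined(_WIN32)
#  define MODULE_API extern "C" __declspec(dllexport)
#else
#  define MODULE_API extern "C"
#endif

// Hypothetical entry point exported by each DLL that links the CRT
// statically. It sets the locale in this module's own copy of the CRT
// and returns the name of the locale now in effect.
MODULE_API const char* InitModuleLocale()
{
    const char* name = std::setlocale(LC_ALL, "");
    // If the environment names a locale the runtime does not know,
    // report the unchanged current locale instead of a null pointer.
    return name ? name : std::setlocale(LC_ALL, nullptr);
}
```

With the shared CRT DLL selected for all modules, a single setlocale call in the exe would be enough and this function would be unnecessary.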
 
LVL 6

Expert Comment

by:Mindphaser
ID: 7036254
Please update and finalize this old, open question. Please:

1) Award points ... if you need Moderator assistance to split points, comment here with details please or advise us in Community Support with a zero point question and this question link.
2) Ask us to delete it if it has no value to you or others
3) Ask for a refund so that we can move it to our PAQ at zero points if it did not help you but may help others.

EXPERT INPUT WITH CLOSING RECOMMENDATIONS IS APPRECIATED IF ASKER DOES NOT RESPOND.

Thanks,

** Mindphaser - Community Support Moderator **

P.S.  Click your Member Profile, choose View Question History to go through all your open and locked questions to update them.
 
LVL 49

Expert Comment

by:DanRollins
ID: 7043895
I recommend that points go to jkr for answering the follow-up question correctly.
-- Dan
 
LVL 6

Expert Comment

by:Mindphaser
ID: 7123101
Force accepted

** Mindphaser - Community Support Moderator **