Solved

How to convert a Japanese MBCS (multi-byte character set) string to a Unicode string

Posted on 2002-04-24
Last Modified: 2007-12-19
I've tried to use wcstombs and the Windows-specific WideCharToMultiByte to translate a Unicode string (and mbstowcs and MultiByteToWideChar to do the reverse) with no luck. Any suggestions on how to convert from one to the other? I'm loading a custom Unicode resource file and must work with Windows 98 (MBCS only).

-Dan
Question by:dwinkler
 
LVL 86

Expert Comment

by:jkr
You should use the UTF-8 code page, e.g.

wchar_t* pwszJapanese = ...;

WideCharToMultiByte ( CP_UTF8, 0, pwszJapanese, wcslen ( pwszJapanese ), <char_buffer>, <length_of_char_buffer>, NULL, NULL );
 

Author Comment

by:dwinkler
I'm looking for a cross-platform way to do it. Any suggestions?
 
LVL 86

Expert Comment

by:jkr
What do you mean by "cross-platform"?
 

Author Comment

by:dwinkler
A function that would work under Unix, Windows, and Mac (i.e. cross-platform).
 
LVL 86

Expert Comment

by:jkr
You can specify the locale, e.g.

_wsetlocale( LC_ALL, L"Japanese.UTF8" );

before calling 'wcstombs()'.
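
Locale names are not portable between C runtimes, so a cross-platform build may have to try several spellings. A minimal sketch, assuming a hypothetical helper named SelectFirstLocale; it relies only on the fact that setlocale() returns a null pointer when it rejects a name:

```cpp
#include <clocale>
#include <cstddef>

// Hypothetical helper (not part of any library): tries each candidate locale
// name in order and returns the first one the C runtime accepts, or nullptr
// if none is accepted. setlocale() returns a null pointer on failure and
// leaves the current locale unchanged in that case.
const char* SelectFirstLocale(const char* const* names, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        if (std::setlocale(LC_ALL, names[i]) != nullptr)
            return names[i];
    }
    return nullptr;
}
```

A caller might pass candidates such as "Japanese_Japan.932" (the style of name the Microsoft CRT uses) and "ja_JP.UTF-8" (the style common on Unix systems). Both names are only illustrative assumptions, and neither is guaranteed to be installed on a given machine, which is why the return value has to be checked before calling wcstombs().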
 

Author Comment

by:dwinkler
Is there any way to get the current system locale so that it does not have to be changed for multiple versions of the software?  Why oh why can't the documentation be better than it is...
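
The standard C library appears to cover this case: passing an empty string to setlocale() asks the runtime to adopt the user's default locale, and passing a null pointer queries the current locale without changing it. A minimal sketch, with hypothetical helper names:

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: switches the C runtime to the user's default locale
// (taken from the OS or environment settings). Returns false if the runtime
// rejects that locale, in which case the current locale is left unchanged.
bool AdoptUserDefaultLocale()
{
    return std::setlocale(LC_ALL, "") != nullptr;
}

// Hypothetical helper: returns the name of the locale currently in effect.
// A null second argument makes setlocale() a pure query.
std::string CurrentLocaleName()
{
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}
```

With this approach the same binary can run on differently localized systems without a hard-coded locale name. The exact string returned by the query is implementation-specific, so it is best treated as diagnostic output only.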

 

Author Comment

by:dwinkler
OK, I am having no luck on a Japanese machine using the following code:

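// First pass: with a NULL destination, wcstombs returns the number of bytes
// required, not counting the terminating '\0'.
size_t nSize = wcstombs( NULL, pszWide, 0 );

if( nSize != (size_t)-1 )
{
   m_pszShort = new char[nSize+1];

   m_pszShort[nSize] = '\0';

   // Second pass: convert into the newly allocated buffer.
   size_t nCopied = wcstombs( m_pszShort, pszWide, nSize );
}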

When I try to convert a legitimate Unicode character under the OS locale, it always returns -1. I am absolutely positive that the single Unicode Japanese character in the pszWide string is valid (if I save the text file as HTML, the character is replaced with its decimal equivalent, which I verified matches the value in pszWide).

Is the wcstombs function broken, or is there something I am missing?
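
One likely explanation is that a program starts in the "C" locale, in which wcstombs() can typically convert only ASCII-range characters. The sketch below is a self-contained version of the two-pass conversion, assuming a hypothetical helper named WideToMultiByteString and assuming setlocale() has already been called in the same module. It also relies on wcstombs() returning the required length when given a null destination, which both POSIX and the Microsoft CRT document:

```cpp
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts a wide string to the multibyte encoding of
// the current C locale. Returns false if some character cannot be
// represented in that encoding (wcstombs reports this as (size_t)-1).
bool WideToMultiByteString(const wchar_t* wide, std::string& out)
{
    // First pass: a null destination asks only for the required byte count,
    // not counting the terminating '\0'.
    const std::size_t needed = std::wcstombs(nullptr, wide, 0);
    if (needed == static_cast<std::size_t>(-1))
        return false;

    // Second pass: convert into a buffer with room for the terminator.
    std::string buffer(needed + 1, '\0');
    const std::size_t written = std::wcstombs(&buffer[0], wide, needed + 1);
    if (written == static_cast<std::size_t>(-1))
        return false;

    buffer.resize(written);
    out.swap(buffer);
    return true;
}
```

Called after setlocale( LC_ALL, "" ) on a machine whose default locale is Japanese, this should produce the string in the system's Japanese multibyte encoding. Without that call, the function is expected to return false for Japanese input, which would match the -1 described above.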

 

Author Comment

by:dwinkler
Figured it out (kind of). Points go to the person who can answer the new questions. You must call setlocale(...) in each exe and DLL. Does anyone know why? Shouldn't the DLL and exe be in the same address space with the same global variables? So you can place the setlocale call either in a function of the DLL that gets called or in a callback from the exe,

i.e. if you want to call sprintf( mbcsbuf, "%ls", widebuf ); inside of the DLL and get the correct result.

A - application
B - dll

A->setlocale( LC_ALL, "" ); (does not work)

A->B->setlocale( LC_ALL, "" ); (works)

A->B->A->setlocale( LC_ALL, "" ); (works)

How does it switch context between the DLL's globals and the exe's?
 
LVL 86

Accepted Solution

by:
jkr earned 500 total points
>>Does anyone know why?

This applies only to DLLs that don't use the CRT as a DLL (in other words, that are not linked against msvcrt.dll). In that case each module gets its own copy of the CRT data structures (including the locale), so they are different for each DLL. To avoid that, go to the project settings, "C++", "Code Generation", and select "Use Runtime Library: Multithreaded DLL" for each DLL (and for the exe), so that all modules share a single copy of the CRT state.
 
LVL 6

Expert Comment

by:Mindphaser
Please update and finalize this old, open question. Please:

1) Award points ... if you need Moderator assistance to split points, comment here with details please or advise us in Community Support with a zero point question and this question link.
2) Ask us to delete it if it has no value to you or others
3) Ask for a refund so that we can move it to our PAQ at zero points if it did not help you but may help others.

EXPERT INPUT WITH CLOSING RECOMMENDATIONS IS APPRECIATED IF ASKER DOES NOT RESPOND.

Thanks,

** Mindphaser - Community Support Moderator **

P.S.  Click your Member Profile, choose View Question History to go through all your open and locked questions to update them.
 
LVL 49

Expert Comment

by:DanRollins
I recommend that points go to jkr for answering the follow-up question correctly.
-- Dan
 
LVL 6

Expert Comment

by:Mindphaser
Force accepted

** Mindphaser - Community Support Moderator **