asked on

Recognize Chinese Multibyte Character

Hi

I'd like to ask any of you the method of recognizing Chinese character (a multibyte character) in a passage containing both Chinese and some single byte characters, such as English and numbers.

When I use a pointer, it only points the passage byte by byte and it is not able to detect whether it is a multibyte character or not.

Is there a way to:
1. Extract these Chinese characters from the passage OR
2. Intelligently pointing character by character (not matter the character is multibyte or single byte) OR
3. Convert all of them to multibyte characters?

Your suggestions will be much appreciated! Thanks!

pb_india

You can use, depending on your need :
1. wcsrtombcs(wchar_t*, char*, int); //wide to Multibyte

2. _mbbtombc //Convert 1-byte multibyte character to corresponding 2-byte multibyte character

happy_emily

ASKER

Can you show me some example programs demonstrating the use of these functions? (I am a newbie in C++ program)
Say for example, the passage is "abcdefXXXX23" where XXXX are the Chinese characters.

Thanks!

pb_india

Sure.

What exaclty you are trying to do. Just read these characters from a file or something and output it?
Or you just want to separate Chinese characters from English?

hellohelloworld

In fact, what I am trying to do is to count the number of occurrence of every character (Chinese character must be counted) appeared in the passage, which consists of different types of characters (ie. English + Chinese + Numbers).

What I can think of is using pointers to do so. However, I have encountered the problem mentioned...... So, I am pondering whether I should convert all the characters in the passage to be double-byte first and then increment the pointer by 2 everytime reading a character, or I should separate the multibyte characters (Chinese) from the singlebyte ones (English + Numbers) and then count them respectively.

Do you have any idea?

happy_emily

ASKER

PS whoops! hellohelloworld is my second account

ASKER CERTIFIED SOLUTION

pb_india

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial