Solved

Recognize Chinese Multibyte Character

Posted on 2004-11-01
Last Modified: 2013-12-03
Hi

I'd like to ask how to recognize Chinese characters (multibyte characters) in a passage that contains both Chinese and single-byte characters, such as English letters and numbers.

When I use a pointer, it steps through the passage byte by byte and cannot tell whether a byte belongs to a multibyte character or not.

Is there a way to:
1. Extract these Chinese characters from the passage OR
2. Intelligently step through the passage character by character (no matter whether the character is multibyte or single-byte) OR
3. Convert all of them to multibyte characters?

Your suggestions will be much appreciated! Thanks!
Question by:happy_emily
LVL 2

Expert Comment

by:pb_india
ID: 12463933
You can use, depending on your needs:
1. wcstombs(char* dest, const wchar_t* src, size_t n); // wide to multibyte

2. _mbbtombc // (Microsoft-specific) converts a 1-byte multibyte character to the corresponding 2-byte multibyte character
 

Author Comment

by:happy_emily
ID: 12465255
Can you show me some example programs demonstrating the use of these functions? (I am a newbie at C++ programming.)
Say, for example, the passage is "abcdefXXXX23", where XXXX are the Chinese characters.

Thanks!
 
LVL 2

Expert Comment

by:pb_india
ID: 12465735
Sure.

What exactly are you trying to do? Just read these characters from a file or something and output them?
Or do you just want to separate the Chinese characters from the English ones?

Expert Comment

by:hellohelloworld
ID: 12466125
In fact, what I am trying to do is count the number of occurrences of every character (Chinese characters must be counted) that appears in the passage, which consists of different types of characters (i.e. English + Chinese + numbers).

What I can think of is using pointers to do so. However, I have run into the problem mentioned above. So I am wondering whether I should convert all the characters in the passage to double-byte first and then increment the pointer by 2 each time I read a character, or separate the multibyte characters (Chinese) from the single-byte ones (English + numbers) and then count them separately.

Do you have any idea?
 

Author Comment

by:happy_emily
ID: 12466227
PS: Whoops! hellohelloworld is my second account.
 
LVL 2

Accepted Solution

by:
pb_india earned 500 total points
ID: 12466233
Hi,

I think what you can do is:
Convert all the characters from narrow (multibyte) to wide, then compare the wide characters one by one to count them.

Use:
mbstate_t ps; // zero it (e.g. with memset) before the first call
mbsrtowcs(wchar_t* wide, const char** narrow, size_t len, mbstate_t* ps);

*narrow will be your string from the passage

and then use code with logic like the following... (You will need to modify it for your own use)
[I can develop the program for you, but 125 is too little for that much work.]


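#include <fstream>
#include <iostream>

int main() {
  // Open the input file named "test" in the current directory.
  std::ifstream f1("test");
  if (!f1) {
    std::cerr << "Cannot open file \"test\"" << std::endl;
    return 1;
  }

  char c;
  int numchars = 0;  // bytes read, excluding newlines (a 2-byte character counts twice)
  int numlines = 0;

  f1.get(c);
  while (f1) {
    // Consume one line, counting every byte before the newline.
    while (f1 && c != '\n') {
      numchars = numchars + 1;
      f1.get(c);
    }
    numlines = numlines + 1;
    f1.get(c);
  }
  std::cout << "The file has " << numlines << " lines and "
    << numchars << " bytes" << std::endl;
  return 0;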
}

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction This article is the first in a series of articles about the C/C++ Visual Studio Express debugger.  It provides a quick start guide in using the debugger. Part 2 focuses on additional topics in breakpoints.  Lastly, Part 3 focuses on th…
Introduction This article is a continuation of the C/C++ Visual Studio Express debugger series. Part 1 provided a quick start guide in using the debugger. Part 2 focused on additional topics in breakpoints. As your assignments become a little more …
The goal of the video will be to teach the user the concept of local variables and scope. An example of a locally defined variable will be given as well as an explanation of what scope is in C++. The local variable and concept of scope will be relat…
The viewer will be introduced to the technique of using vectors in C++. The video will cover how to define a vector, store values in the vector and retrieve data from the values stored in the vector.

636 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question