Solved

Multibyte String byte difference

Posted on 2006-11-15
Last Modified: 2010-04-15
Hi,
I am using UTF-8 encoding for my application.

int is_valid_name (char *name){
    char *cp;

    if (name == NULL || *name == '\0')
     return 0;

    for (cp = name; *cp != '\0'; cp++) {
          printf("%i", *cp);
    }
    return 1;
}

In the function above, name holds a single Chinese character that is encoded as 3 bytes,
so strlen(name) returns 3.

On Windows, printing *cp gives 237, 138, 184 respectively.
On Unix, printing *cp gives -19, -118, -72 respectively for the same string.

I know that each negative value is the corresponding positive value minus 256 (for example, 237 - 256 = -19).
Could you explain the byte difference?
Why are the values different on Unix and Windows?
Do negative byte values only occur for multibyte characters?

Thanks
Jamie




Question by:jamie_lynn
 
LVL 16

Expert Comment

by:PaulCaswell
ID: 17951309
Hi jamie_lynn,

This can be down to compiler settings. With some compilers plain char is unsigned by default; with others it's signed.
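
A minimal sketch of how this shows up in code, assuming 8-bit bytes and the usual two's-complement representation. The helper names (plain_char_is_signed, byte_value, as_signed_byte) are made up for illustration and are not from the original code.

```c
#include <limits.h>

/* Hypothetical helper: 1 if plain char is signed with the current
   compiler and settings, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: read one byte as 0..255, whatever the
   signedness of plain char happens to be. */
int byte_value(const char *p)
{
    return (unsigned char)*p;
}

/* Hypothetical helper: the value an 8-bit signed char would show for
   a byte in 0..255 (assumes two's complement), e.g. 237 -> -19. */
int as_signed_byte(int v)
{
    return v >= 128 ? v - 256 : v;
}
```

With the bytes from the question, byte_value gives 237, 138, 184 on any platform, while as_signed_byte maps them to -19, -118, -72, which matches the Unix output.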

Paul
 

Author Comment

by:jamie_lynn
ID: 17951556
Hi Paul,

What is a better way to validate the string?
Using unsigned char for the parameter or checking for negative?

Thanks
Jamie

i.e.
int is_valid_name (unsigned char *name){
...
}

or

int is_valid_name (char *name){
    char *cp;

    if (name == NULL || *name == '\0')
     return 0;

    for (cp = name; *cp != '\0'; cp++) {
          if (*cp < 0)  
                 continue;
          printf("%i", *cp);
    }
    return 1;
}

 
LVL 16

Accepted Solution

by:
PaulCaswell earned 500 total points
ID: 17951574
I'd leave the parameter as char so the caller doesn't have to cast.

int is_valid_name (char *name){
    unsigned char *cp;

...

    for (cp = (unsigned char *) name; *cp != '\0'; cp++) {
 
That way compiler settings won't change how your code works.
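
A self-contained sketch of the same idea, for anyone who wants something to compile. The function name collect_byte_values and its out/max parameters are assumptions added for illustration; only the unsigned char pointer and the cast come from the snippet above.

```c
#include <stddef.h>

/* Hypothetical helper: walks name through an unsigned char pointer and
   stores each byte in out[] as a value in 0..255. Stops at the
   terminating '\0' or after max bytes. Returns the number stored. */
size_t collect_byte_values(const char *name, int *out, size_t max)
{
    const unsigned char *cp;
    size_t n = 0;

    if (name == NULL || out == NULL)
        return 0;

    /* The cast changes how the bytes are read, not the bytes themselves. */
    for (cp = (const unsigned char *)name; *cp != '\0' && n < max; cp++)
        out[n++] = *cp;

    return n;
}
```

Since every byte of a UTF-8 multibyte sequence has its high bit set (value 128 or more) and ASCII bytes are 0 to 127, a signed plain char only goes negative on the bytes of multibyte characters.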

Paul

 

Author Comment

by:jamie_lynn
ID: 17951627
Thanks Paul!
Jamie
 

Author Comment

by:jamie_lynn
ID: 17951781
Paul,

I cast name to unsigned char * but I still get negative byte values on Unix.
What should I try next?

for (cp = (unsigned char *) name; *cp != '\0'; cp++) {
...
}

Thanks
Jamie
 

Author Comment

by:jamie_lynn
ID: 17951802
Oops, my bad.
This works. I forgot to declare cp as unsigned char *.
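
For anyone hitting the same thing, here is a rough sketch of the difference, assuming 8-bit bytes. The function names read_via_char_ptr and read_via_uchar_ptr are invented for this example.

```c
#include <limits.h>

/* Hypothetical: the cast is present, but the pointer the byte is read
   through is still plain char *, so the result follows the compiler's
   char signedness. */
int read_via_char_ptr(const char *name)
{
    const char *cp = (const char *)(const unsigned char *)name;
    return *cp;
}

/* Hypothetical: the pointer itself is unsigned char *, so the byte is
   always read as a value in 0..255. */
int read_via_uchar_ptr(const char *name)
{
    const unsigned char *cp = (const unsigned char *)name;
    return *cp;
}
```

What matters is the type of the pointer being dereferenced, not the cast on the right-hand side of the assignment.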
Thanks!
Jamie