Solved

Multibyte String byte difference

Posted on 2006-11-15
Last Modified: 2010-04-15
Hi,
I am using utf-8 encoding for my application.

int is_valid_name(char *name) {
    char *cp;

    if (name == NULL || *name == '\0')
        return 0;

    /* Print each byte of the string as a decimal integer. */
    for (cp = name; *cp != '\0'; cp++) {
        printf("%i ", *cp);
    }
    return 1;
}

In the function above, name holds a single Chinese character that takes 3 bytes in UTF-8,
so strlen(name) returns 3.

On Windows, printing *cp gives 237, 138, 184 respectively.
On Unix, printing *cp gives -19, -118, -72 respectively for the same string.

I know that each negative value is the corresponding positive value minus 256.
Could you explain the byte difference?
Why are the values different on Unix and Windows?
Do negative byte values only occur for multibyte characters?

Thanks
Jamie




Question by:jamie_lynn

Expert Comment

by:PaulCaswell
ID: 17951309
Hi jamie_lynn,

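This comes down to compiler settings. Whether plain char is signed or unsigned is implementation-defined: with some compilers the default is unsigned, with others it's signed. The bytes in memory are the same on both systems; only the way they are interpreted differs.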
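
To make the two interpretations concrete, here is a minimal sketch. The helper names plain_char_is_signed, as_signed_byte and as_unsigned_byte are made up for illustration, and as_signed_byte assumes an 8-bit two's-complement char. It also bears on the multibyte question: in UTF-8 every byte of a multibyte sequence is 128 or greater, so those are exactly the bytes that come out negative through a signed char, while ASCII bytes (0 to 127) never do.

```c
#include <limits.h>

/* Hypothetical helper: 1 if plain char is signed with this compiler
   and these settings, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: the value a byte shows when read through a signed
   8-bit two's-complement char. Bytes of 128 and above come out as
   (value - 256); bytes below 128 are unchanged. */
int as_signed_byte(unsigned char b)
{
    return b >= 128 ? (int)b - 256 : (int)b;
}

/* Hypothetical helper: the value of the same byte when read as
   unsigned char. Always in the range 0..255. */
int as_unsigned_byte(char c)
{
    return (unsigned char)c;
}
```

With the bytes from the question, as_signed_byte(237) gives -19, as_signed_byte(138) gives -118 and as_signed_byte(184) gives -72, which matches the Unix output.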

Paul
 

Author Comment

by:jamie_lynn
ID: 17951556
Hi Paul,

What is a better way to validate the string?
Using unsigned char for the parameter or checking for negative?

Thanks
Jamie

i.e.
int is_valid_name(unsigned char *name) {
...
}

or

int is_valid_name(char *name) {
    char *cp;

    if (name == NULL || *name == '\0')
        return 0;

    for (cp = name; *cp != '\0'; cp++) {
        /* Skip bytes that read as negative through a signed char. */
        if (*cp < 0)
            continue;
        printf("%i ", *cp);
    }
    return 1;
}


Accepted Solution

by:
PaulCaswell earned 2000 total points
ID: 17951574
I'd leave the parameter as char * so the caller doesn't have to cast.

int is_valid_name(char *name) {
    unsigned char *cp;

...

    for (cp = (unsigned char *)name; *cp != '\0'; cp++) {
 
That way compiler settings won't change how your code works.
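
Put together, the whole function might look like the sketch below. The elided part is assumed to be the NULL/empty check from the original question, and the %u format with a trailing space is a choice made here so the output is readable; neither detail is confirmed in the thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Sketch: prints each byte of name as a value in 0..255, whatever the
   compiler's default signedness for plain char is.
   Returns 0 for a NULL or empty name, 1 otherwise. */
int is_valid_name(char *name)
{
    unsigned char *cp;

    /* Assumed to match the check in the original question. */
    if (name == NULL || *name == '\0')
        return 0;

    /* Reading through unsigned char * means *cp is never negative. */
    for (cp = (unsigned char *)name; *cp != '\0'; cp++)
        printf("%u ", (unsigned)*cp);
    printf("\n");

    return 1;
}
```

The caller still passes an ordinary char *, so existing call sites need no cast.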

Paul

Author Comment

by:jamie_lynn
ID: 17951627
Thanks Paul!
Jamie
 

Author Comment

by:jamie_lynn
ID: 17951781
Paul,

I cast name to unsigned char * but I still get negative byte values on Unix.
What should I try next?

for (cp = (unsigned char *) name; *cp != '\0'; cp++) {
...
}

Thanks
Jamie
 

Author Comment

by:jamie_lynn
ID: 17951802
Oops, my bad.
This works. I had forgotten to declare cp as unsigned char *.
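
For anyone who hits the same thing: the cast on its own changes nothing if cp is still declared as char *, because the value you get from *cp depends on the declared type of the pointer you read through. A small sketch of the difference follows; both function names are made up for illustration.

```c
/* Hypothetical example: the cast to unsigned char * is undone as soon as
   the result is stored in a plain char pointer, so *cp can still be
   negative where plain char is signed. */
int read_through_plain_char(const char *name)
{
    const char *cp = (const char *)(const unsigned char *)name;
    return *cp;
}

/* Hypothetical example: the pointer itself is declared unsigned char *,
   so *cp is always in the range 0..255. */
int read_through_unsigned_char(const char *name)
{
    const unsigned char *cp = (const unsigned char *)name;
    return *cp;
}
```

For the first byte of the string in the question, the second function returns 237 everywhere, while the first returns -19 wherever plain char is signed.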
Thanks!
Jamie
