Solved

How to represent a Russian string (Unicode) in a C string

Posted on 2011-04-21
Last Modified: 2012-05-11
I need to store Russian month names in an array.

I am working on an older embedded project and need to represent Russian month names in C source code. On this processor the characters are 16 bits wide, so I can put Unicode characters in them.
But all the string functions are 8-bit. If I enter the Unicode characters in the IDE, I get ???? instead.

static const char*sEnglishNames[13] ={"","January","February","March","April","May",
"June","July","August","September","October","November", "December"};

Russian months

¿¿¿¿¿¿   -   January
¿¿¿¿¿¿¿   -   February
¿¿¿¿   -   March
¿¿¿¿¿¿   -   April
¿¿¿   -   May
¿¿¿¿   -   June
¿¿¿¿   -   July
¿¿¿¿¿¿   -   August
¿¿¿¿¿¿¿¿   -   September
¿¿¿¿¿¿¿   -   October
¿¿¿¿¿¿   -   November
¿¿¿¿¿¿¿   -   December
Question by:TremorBlue
14 Comments
 

Author Comment

by:TremorBlue
ID: 35439646
They were converted to ??? as well..  try here.

http://www.russianlessons.net/vocabulary/nouns_days.php
 
LVL 53

Expert Comment

by:Infinity08
ID: 35439793
For Unicode support, it's best to use an external library, like ICU :

        http://site.icu-project.org/
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 35439828
Copying and pasting the Russian names works for me. Note, however, that your Unicode strings are going to be wchar_t* rather than char*, that wide string literals are prefixed with "L", and that you need to make sure your editor saves the source file as Unicode.

wchar_t *sRussianMonths[3] = { L"¿¿¿¿¿¿", L"¿¿¿¿¿¿¿", L"¿¿¿¿"};

 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 35439832
Well, I guess EE doesn't support Unicode strings. ;)

Maybe if I don't put it in a code block...

wchar_t *sRussianMonths[3] = { L"¿¿¿¿¿¿", L"¿¿¿¿¿¿¿", L"¿¿¿¿"};
 
LVL 53

Expert Comment

by:Infinity08
ID: 35439893
wchar_t is very platform dependent, and is not guaranteed to be able to store the Unicode character you try to put in it. So, it is not recommended to use it if you need Unicode support in your application - especially not if you'd like that Unicode support to be portable.

Instead, use an external library like the one I suggested earlier.
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 35439915
Agreed, comment 35439793 (using an external library such as ICU) is the better approach - I had just left the window sitting open for a few minutes and so didn't see your comment before I hit submit. ;)
 
LVL 53

Expert Comment

by:Infinity08
ID: 35439924
An open window is good !! Brings in fresh air :)
 

Author Comment

by:TremorBlue
ID: 35440055
Yes, but forget about Experts Exchange; I don't mind that it doesn't work there.

I want to know how to do Unicode strings with an old C compiler.

I was thinking about just using the Unicode numbers...?
 
LVL 53

Expert Comment

by:Infinity08
ID: 35440115
Please refer to my first reply ;)
 
LVL 35

Expert Comment

by:sarabande
ID: 35440570
how old is the c compiler?

can you check whether the below compiles

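#include <stdio.h>
#include <wchar.h>   /* wchar_t, wprintf */

int main(void)
{
    const wchar_t *pws = L"January";
    int szwc = (int)sizeof(wchar_t);
    /* %ls is the standard conversion for a wide string in wprintf */
    wprintf(L"%ls %d\n", pws, szwc);
    return 0;
}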


and if yes tell whether it prints correctly?

Sara
 

Author Comment

by:TremorBlue
ID: 35446508
The compiler does not support wchar_t.

I don't need to have a library like ICU.

As my question says, I just need to store the Unicode characters in an array.

It looks like I have to store the characters in an array of shorts.
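
A minimal sketch of the "array of shorts" idea, under stated assumptions: each month name is spelled out as 16-bit code units and terminated with a zero. The month names were garbled in the question, so the values below are an assumption about what was intended; they are the standard Unicode code points for the Cyrillic letters of the usual Russian month names. The `k...` array names and the `u16_len` helper are hypothetical, and only plain C89 features are used so that an old compiler without `wchar_t` has a chance of accepting it.

```c
/* Russian month names as zero-terminated arrays of 16-bit code units.
   Assumption: the names and the helper below are illustrative only. */
static const unsigned short kNone[] = { 0 };
static const unsigned short kJan[] = { 0x042F, 0x043D, 0x0432, 0x0430, 0x0440, 0x044C, 0 };                 /* January   */
static const unsigned short kFeb[] = { 0x0424, 0x0435, 0x0432, 0x0440, 0x0430, 0x043B, 0x044C, 0 };         /* February  */
static const unsigned short kMar[] = { 0x041C, 0x0430, 0x0440, 0x0442, 0 };                                 /* March     */
static const unsigned short kApr[] = { 0x0410, 0x043F, 0x0440, 0x0435, 0x043B, 0x044C, 0 };                 /* April     */
static const unsigned short kMay[] = { 0x041C, 0x0430, 0x0439, 0 };                                         /* May       */
static const unsigned short kJun[] = { 0x0418, 0x044E, 0x043D, 0x044C, 0 };                                 /* June      */
static const unsigned short kJul[] = { 0x0418, 0x044E, 0x043B, 0x044C, 0 };                                 /* July      */
static const unsigned short kAug[] = { 0x0410, 0x0432, 0x0433, 0x0443, 0x0441, 0x0442, 0 };                 /* August    */
static const unsigned short kSep[] = { 0x0421, 0x0435, 0x043D, 0x0442, 0x044F, 0x0431, 0x0440, 0x044C, 0 }; /* September */
static const unsigned short kOct[] = { 0x041E, 0x043A, 0x0442, 0x044F, 0x0431, 0x0440, 0x044C, 0 };         /* October   */
static const unsigned short kNov[] = { 0x041D, 0x043E, 0x044F, 0x0431, 0x0440, 0x044C, 0 };                 /* November  */
static const unsigned short kDec[] = { 0x0414, 0x0435, 0x043A, 0x0430, 0x0431, 0x0440, 0x044C, 0 };         /* December  */

/* Index 0 is an empty name, matching the layout of sEnglishNames. */
static const unsigned short *const sRussianNames[13] = {
    kNone, kJan, kFeb, kMar, kApr, kMay, kJun,
    kJul, kAug, kSep, kOct, kNov, kDec
};

/* Counts code units up to (not including) the terminating zero. */
static unsigned int u16_len(const unsigned short *s)
{
    unsigned int n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

Because the 8-bit string functions cannot be used on these arrays, small helpers such as `u16_len` would have to stand in for `strlen` and friends.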
 
LVL 53

Expert Comment

by:Infinity08
ID: 35446554
>> As my question says I just need to store the unicode characters in an array.

So you don't ever want to display them?
Can you be sure that all Unicode characters you'll ever use will be (exactly) 16 bits wide?
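
For the Russian alphabet specifically, the basic Cyrillic letters sit at U+0410 to U+044F, which is inside the 16-bit range. If other characters might be added later, a small check like the sketch below could make the 16-bit assumption explicit. The function name is hypothetical and not from the thread.

```c
/* Returns 1 if the code point can be stored in a single 16-bit unit, else 0. */
static int fits_in_one_unit(unsigned long cp)
{
    if (cp > 0xFFFFUL) {
        return 0;   /* above U+FFFF: UTF-16 would need a surrogate pair */
    }
    if (cp >= 0xD800UL && cp <= 0xDFFFUL) {
        return 0;   /* surrogate range: not a character on its own */
    }
    return 1;
}
```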
 
LVL 7

Accepted Solution

by:
JimBeveridge earned 2000 total points
ID: 35461621
Characters are just numbers - even Unicode characters. Write the Unicode text file in your favorite editor. Put a space between each month. For example, in English:
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

Write a short application that opens the file and spits out the raw binary as C bytes. For example, the output for the English above would be:


unsigned char months[] = {
0x4A, 0x00, 0x61, 0x00, 0x6E, 0x00, 0x20, 0x00, 0x46, 0x00, 0x65, 0x00, 0x62, 0x00, 0x20, 0x00
0x4D, 0x00, 0x61, 0x00, 0x72, 0x00, 0x20, 0x00, 0x41, 0x00, 0x70, 0x00, 0x72, 0x00, 0x20, 0x00
0x4D, 0x00, 0x61, 0x00, 0x79, 0x00, 0x20, 0x00, 0x4A, 0x00, 0x75, 0x00, 0x6E, 0x00, 0x20, 0x00
0x4A, 0x00, 0x75, 0x00, 0x6C, 0x00, 0x20, 0x00, 0x41, 0x00, 0x75, 0x00, 0x67, 0x00, 0x20, 0x00
0x53, 0x00, 0x65, 0x00, 0x70, 0x00, 0x20, 0x00, 0x4F, 0x00, 0x63, 0x00, 0x74, 0x00, 0x20, 0x00
4E, 0x00, 0x6F, 0x00, 0x76, 0x00, 0x20, 0x00, 0x44, 0x00, 0x65, 0x00, 0x63, 0x00
};
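
The "short application" described above might look roughly like the sketch below. It is one possible reading, not the answerer's own tool: `dump_as_c_array` is a hypothetical name, and the input is assumed to be a file the editor saved as little-endian UTF-16, opened in binary mode.

```c
#include <stdio.h>

/* Copies every byte of `in` to `out` as a C array initializer named `name`.
   Returns the number of bytes placed in the array. */
static unsigned long dump_as_c_array(FILE *in, FILE *out, const char *name)
{
    unsigned long count = 0;
    int c;

    fprintf(out, "unsigned char %s[] = {\n", name);
    while ((c = fgetc(in)) != EOF) {
        /* Comma before every byte except the first; 16 bytes per line. */
        if (count > 0) {
            fputs((count % 16 == 0) ? ",\n" : ", ", out);
        }
        fprintf(out, "0x%02X", (unsigned int)c);
        count++;
    }
    fprintf(out, "\n};\n");
    return count;
}
```

A caller would open the text file with `fopen(path, "rb")` and pass `stdout` as `out`. If the editor writes a byte-order mark (FF FE for little-endian UTF-16), it shows up as the first two bytes of the array and may need to be skipped.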

 
LVL 7

Expert Comment

by:JimBeveridge
ID: 35461659
Forgot the final instructions - create an array of pointers that index into the array above. This can be done automatically at runtime by searching for the spaces, which you'll then want to turn into zeros so that the strings are null-terminated. (Also, my sample array needs ,0x00,0x00 at the end to terminate it properly.)

One thing you have to be careful of is that you have to read the file and write the result as bytes, not as shorts; otherwise the byte order can come out wrong when the machine that generates the array and the target do not share the same endianness.
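
The runtime step described above could be sketched as follows, assuming the buffer holds little-endian UTF-16 bytes and already ends in 0x00, 0x00. The names `unit_at` and `split_at_spaces` are hypothetical. Code units are assembled from two bytes by hand, so the result does not depend on the byte order of the processor running it.

```c
#include <stddef.h>

/* Reads one little-endian 16-bit code unit from two consecutive bytes. */
static unsigned int unit_at(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

/* Splits a UTF-16LE byte buffer at spaces (0x20 0x00). Each space is
   overwritten with a zero unit so every word becomes zero-terminated, and
   the start of each word is stored in `out` (at most `max` entries).
   `len` is the buffer length in bytes, including the final 0x00 0x00.
   Returns the number of pointers stored. */
static size_t split_at_spaces(unsigned char *buf, size_t len,
                              unsigned char **out, size_t max)
{
    size_t i;
    size_t count = 0;
    int at_start = 1;   /* the next non-space unit begins a new word */

    for (i = 0; i + 1 < len; i += 2) {
        unsigned int u = unit_at(buf + i);
        if (u == 0x0020 || u == 0x0000) {
            buf[i] = 0;
            buf[i + 1] = 0;
            at_start = 1;
        } else if (at_start) {
            if (count < max) {
                out[count++] = buf + i;
            }
            at_start = 0;
        }
    }
    return count;
}
```

Called once at startup on the `months` array with a 12-entry pointer table, this would leave `out[0]` pointing at the first month, `out[1]` at the second, and so on. The array must not be declared `const`, since the spaces are overwritten in place.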