Behaviour of the WideCharToMultiByte API for different code pages

When I run the code below, why do I get '?' in the first case? As far as I know, code page 932 also supports line-drawing characters. How does this API deal with code pages? AFAIK it looks up the character in the code page and then returns the character's position in that code page.

    #include <windows.h>
    #include <stdio.h>

    typedef struct dbcs {
        unsigned char HighByte;
        unsigned char LowByte;
    } DBCS;

    /* U+255D BOX DRAWINGS DOUBLE UP AND LEFT */
    static DBCS set[5] = { {0x25, 0x5D} };

    /* Convert one UTF-16 code unit to the given code page and print the result. */
    static void convert(UINT codepage, WCHAR wc)
    {
        char out[3] = {0};        /* at most 2 bytes plus the terminating NUL */
        BOOL usedDefault = FALSE; /* set to TRUE if wc has no mapping in the code page */
        int  str_size;
        int  i;

        str_size = WideCharToMultiByte(codepage, 0, &wc, 1, out, 2, NULL, &usedDefault);
        printf("code page %u: number of bytes %d, bytes are", codepage, str_size);
        for (i = 0; i < str_size; i++)
            printf(" 0x%02X", (unsigned char)out[i]);
        printf(", character is %s%s\n", out, usedDefault ? " (default character used)" : "");
    }

    int main(void)
    {
        /* Build the wide character from its two bytes instead of aliasing a byte array. */
        WCHAR wc = (WCHAR)((set[0].HighByte << 8) | set[0].LowByte);

        convert(932, wc);   /* Japanese (Shift-JIS) */
        convert(437, wc);   /* OEM United States */
        return 0;
    }

Kirshna Chada Asked:
 
sarabande Commented:
UTF-32 stores every Unicode code point in a single 32-bit unit, and Unicode is a superset of practically all other standardized character sets and code pages.
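To make the relationship between the encodings concrete, here is a minimal sketch of how one 32-bit code point turns into UTF-16 code units, which is the form WideCharToMultiByte takes as input. The function name utf16_encode is my own and is not part of any Windows API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point (a UTF-32 value) as UTF-16.
 * Writes 1 or 2 code units to out and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                                /* out of range or a lone surrogate */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                   /* BMP: one unit with the same value */
        return 1;
    }
    cp -= 0x10000;                               /* 20 significant bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate: upper 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate: lower 10 bits */
    return 2;
}
```

The character in the question, U+255D, lies in the Basic Multilingual Plane, so it needs only one 16-bit unit.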

UTF-16 (which MS rather loosely calls UNICODE) stores the characters of the first 16-bit layer of Unicode in one two-byte unit and all other characters as a pair of such units. The WideCharToMultiByte function takes these two-byte units (wide characters) as input. The output is a multibyte character set or code page (MBCS). Multibyte means that some characters have a two-byte or longer representation, i.e. one string may be a sequence of single bytes where each byte represents exactly one character, while another may contain sequences of up to 4 bytes which together represent a single character. Those characters can be parsed out of a string because they start with a so-called lead byte, whose code is not a valid single-byte character in the MBCS.

The most capable MBCS is UTF-8, which uses up to 4 bytes per character. UTF-8 and UTF-16 both cover every Unicode character, so conversion between the two loses nothing in either direction. Conversion to an older code page is different: characters that have no equivalent in the target code page cannot be converted and are replaced by a substitute character such as '?'.
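As an illustration of the lead-byte idea, here is a small sketch for UTF-8, where the lead byte announces how many bytes belong to the character. The names utf8_length and utf8_encode are my own, not library functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sequence length judged from the bit pattern of the
 * lead byte alone; returns 0 for a continuation byte (10xxxxxx) or 11111xxx. */
int utf8_length(unsigned char lead)
{
    if (lead < 0x80)           return 1;   /* 0xxxxxxx: single byte (ASCII) */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 0;
}

/* Hypothetical helper: encode one code point as UTF-8.
 * Returns the number of bytes written (1 to 4), or 0 for an invalid code point. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For U+255D this produces the three bytes 0xE2 0x95 0x9D, and the lead byte 0xE2 alone tells a parser that two more bytes follow.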

Code page 932 is the Japanese Shift-JIS code page, and it is one of the code pages the WideCharToMultiByte function supports (see https://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx). It is an MBCS with a maximum of two bytes per character. Hence it has far fewer characters than Unicode, and a conversion loses data whenever the source character has no counterpart in it. That is what happens here: the character you are converting is U+255D, a double-line box-drawing corner, and as far as I can tell CP932 contains only the single-line (light and heavy) box-drawing characters, so the function substitutes the default character '?'. For code page 437 the conversion succeeds because that code page does contain the character. The result there is the single byte 0xBC, not the byte pair 0x25 0x5D.
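The lookup a conversion to a single-byte code page performs can be pictured roughly like this. This is only a sketch: the table holds a handful of CP437 double-line box-drawing entries, the names cp437_box and to_codepage are made up for the example, and the real function uses complete system tables.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a (hypothetical, heavily truncated) Unicode -> CP437 table. */
struct cp_entry {
    uint16_t      unicode;
    unsigned char byte;
};

static const struct cp_entry cp437_box[] = {
    { 0x2550, 0xCD },  /* BOX DRAWINGS DOUBLE HORIZONTAL */
    { 0x2551, 0xBA },  /* BOX DRAWINGS DOUBLE VERTICAL */
    { 0x2554, 0xC9 },  /* BOX DRAWINGS DOUBLE DOWN AND RIGHT */
    { 0x2557, 0xBB },  /* BOX DRAWINGS DOUBLE DOWN AND LEFT */
    { 0x255A, 0xC8 },  /* BOX DRAWINGS DOUBLE UP AND RIGHT */
    { 0x255D, 0xBC },  /* BOX DRAWINGS DOUBLE UP AND LEFT */
};

/* Return the code page byte for wc. If wc is not in the table, return the
 * default character '?' and set *used_default to 1. */
unsigned char to_codepage(uint16_t wc, int *used_default)
{
    size_t i;
    for (i = 0; i < sizeof cp437_box / sizeof cp437_box[0]; i++) {
        if (cp437_box[i].unicode == wc) {
            *used_default = 0;
            return cp437_box[i].byte;
        }
    }
    *used_default = 1;   /* no mapping: fall back to the default character */
    return '?';
}
```

A character that is missing from the table comes back as '?', which is the same symptom as in the question. With the real API, passing a BOOL pointer as the last argument of WideCharToMultiByte reports whether this fallback was taken.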

Sara
 
CEHJ Commented:
You need to be certain that cmd.exe has a font active that contains those glyphs. Are you?
 
sarabande Commented:
It was difficult to give an answer that not only explains why WideCharToMultiByte returned '?' but also shows a way to proceed. The character U+255D has no equivalent in code page 932, so the WINAPI function substitutes its default character. There is no way around this other than not using that code page for this character and changing to one that contains it, for example a Unicode-based code page such as UTF-8 (CP_UTF8).

Sara
Question has a verified solution.
