Solved

Printing international characters in a console window with cout

Posted on 2004-04-28
Last Modified: 2013-12-03
Hello,

I’m using Visual Studio 6.0 and XP. The layout of the keyboard is Azerty (Belgian-Dutch). The problem is this:

#include "stdafx.h"
#include <iostream.h>
void main()
{
    cout << "My name is Hélène\n";
}
//  output: My name is HÚlÞne

When I run this program, I get the wrong output in my console window, as you can see. Is it possible to use cout and get the correct output? It’s important, because here in Belgium we use a lot of accents.

This question was asked on a Dutch-speaking forum, but nobody there seems to know the answer. The strange thing is that it works as expected with a compiler like gcc (Linux).

It’s possible to print all the extended ASCII characters with a program like this:

// Print extended ASCII codes and characters
#include "stdafx.h"
#include <stdio.h>

int main()
{
    // 128..255 covers the whole extended range, including 255
    for (int i = 128; i <= 255; i++)
        printf("ASCII character #%i = %c\n", i, i);
    return 0;
}

But this only works around the problem; it doesn’t solve it. Of course, we have searched MSDN, but with no success so far. There are only 2 fonts available in the console (Lucida Console and Mono), and both show the wrong characters.

So, how can we use cout and cin, and get the correct characters in a console window?

Thx
Question by:pongping
LVL 3

Expert Comment

by:akalmani
ID: 10940136
A quick look at the DOS commands suggests that this is not an error in the program but a matter of the command prompt's console properties.

With the command prompt open, change its properties as follows:
1. Open the window's properties and change the font from "Raster fonts" to "Lucida Console".
2. Use chcp to change the code page to 1252, or whatever code page is applicable for Belgium.

I tried this quickly with the Spanish text "configuración", and it displayed correctly.

Here is a link for codepage details
http://www.uwm.edu/cgi-bin/IMT/wwwman?topic=code_page(5)&msection=

Author Comment

by:pongping
ID: 10940745
When I open a DOS box (via Run, cmd) and do the things you said, it makes no difference to my program. When I type “echo Hélène” in the DOS window, it displays the correct characters. However, that is not the case in the program's console window.

The link you mention is very interesting, but I don’t really understand the information in it. I’ve tried this:

#pragma setlocale( "belgium")

#include "stdafx.h"
#include <iostream.h>
#include <locale.h>

void main()
{
      cout << "My name is Hélène\n";
      cout << "configuración\n";
      cin.get();
}
/*
      output:
            My name is HÚlÞne
            configuraci¾n
*/

As you can see, it doesn’t change the output. I think that the dos-box and the console-window are two different things (at least in XP).

Author Comment

by:pongping
ID: 10946173
In the meantime, we have discovered a solution that works:

#include <iostream>
#include <windows.h>
void main()
{
  char str[19];
  CharToOem("My name is Hélène\n", str);
  std::cout << str;
}  

But even this is a workaround. I still think there must be a setting that corrects the output for all the following cout-statements automatically. Maybe I need to use a different character set for the console window, but how do you do that?
LVL 3

Expert Comment

by:akalmani
ID: 10946734
Well, I have a US English XP OS with the Chinese and Japanese language packs installed. This is what I did.

I created a Win32 console application that just outputs "configuración" and opened the DOS prompt. I checked that the default code page was 437, changed it with the DOS command "chcp 1252", and then ran test.exe.
It displayed the text correctly.

Anyway, if you have solved the problem, that's good to hear.

Author Comment

by:pongping
ID: 10947130
Thx again, akalmani!

But… it doesn’t work here. I must be an idiot.

No matter which code page I enter at the DOS prompt, my program’s output doesn’t change: “configuración” still becomes “configuraci¾n”. I’ve tried the debug version and the release version, with the DOS box still on the screen and without it, with Visual Studio running and without it. I’ve tried every possible combination, but it keeps displaying the wrong output.

I have the Dutch version of XP Prof, and Visual Studio 6.0 with SP6. But even the C++ version from .Net has the same problem (there are other people on our forum who are testing).

Of course I will let you know if the problem is solved. I love to give my points away!

LVL 3

Expert Comment

by:akalmani
ID: 10947334
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_42ib.asp

The keyword in the above link is "the MS-DOS FAT file system uses the OEM character set."

Author Comment

by:pongping
ID: 10947681
My partitions are NTFS. I checked it on a FAT system, and it made no difference. The code page for my country is 850. This is set everywhere on my system, as far as I know.
I found this in MSDN:

------------------------------------------------------------------------
Universal Character Names
The C++ standard defines 96 characters that are guaranteed available for all C++ source files:

26 lower-case letters: a b c d e f g h i j k l m n o p q r s t u v w x y z
26 upper-case letters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
10 digits: 0 1 2 3 4 5 6 7 8 9
29 other graphical characters: _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
1 space character
4 other non-graphical control characters: horizontal tab, vertical tab, form feed, new line.
These characters collectively form the "basic source character set." Each basic character has some implementation-defined 8-bit encoding; on ASCII implementations, the encodings lie between 0 and hexadecimal 7F or decimal 127.

C++ source may also (non-portably) contain "extended characters" beyond the basic set. These characters may not have useful graphic representations or glyphs on your system, or the characters and their glyphs may not be portable across systems. To work around this limitation, the standard allows "universal character names:"

\uXXXX names the extended character whose encoding is hexadecimal XXXX.
\UXXXXXXXX names the extended character whose encoding is hexadecimal XXXXXXXX.
All universal names are based on ISO/IEC 10646, which currently specifies over 70,000 characters and their encodings.

Note Technically speaking, the C++ standard makes no mention of Unicode. Practically speaking, both ISO/IEC 10646 and Unicode define the same characters and reserve the same character encodings.
------------------------------------------------------------------------

Of course, this doesn’t explain why it works on your system and not on ours. Maybe the American programmers don’t love the Belgians, huh? :-)
 
As I said, this program works:

#include <iostream>
#include <windows.h>
void main()
{
  char str[19];
  CharToOem("My name is Hélène\n", str);
  std::cout << str;
}  

Is it possible to write a macro that performs this conversion when you simply write:
cout << "any text with accents"; ?
Because I can’t write one myself, my question is really: can somebody write such a macro for me? That would be a perfect solution!
LVL 3

Accepted Solution

by:
akalmani earned 250 total points
ID: 10948485
Overload the << operator of ostream

ostream & operator<<(ostream & os, LPCSTR & szText)
{
  TCHAR szDst[260] = _T("");
   CharToOem(szText, szDst);
   os << szDst;
   return os;
}

Hope this helps..

Author Comment

by:pongping
ID: 10948833
That looks great. You really are an expert! Can you tell me which libraries I need? I tried it like this:

#include <afx.h>
#include <iostream.h>
#include <windows.h>

//Override << operator of ostream
ostream & operator<<(ostream & os, LPCSTR & szText)
{
  TCHAR szDst[260] = _T("");
   CharToOem(szText, szDst);
   os << szDst;
   return os;
};

void main()
{
      cout << "My name is Hélène\n" << endl;
}


But I get the errors:
nafxcwd.lib(thrdcore.obj) : error LNK2001: unresolved external symbol __endthreadex
nafxcwd.lib(thrdcore.obj) : error LNK2001: unresolved external symbol __beginthreadex
MSDN isn’t really helping with these errors; they could mean just about anything.
I'm sorry, akalmani, but I'm still a beginner, he...
LVL 3

Expert Comment

by:akalmani
ID: 10949578

Author Comment

by:pongping
ID: 10951143
With a little adjustment, it works perfectly!

This is the final version; I didn’t even have to bother with the recommendations of MS:

#include <iostream>
#include <windows.h>
#include <iostream.h>

//Override << operator of ostream
ostream& operator<<(ostream & os, char* szTekst)
{
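   char szBron[260] = "";
   CharToOem(szTekst, szBron);   // convert ANSI text to the console's OEM code page
   printf("%s", szBron);         // "%s" keeps a '%' in the text from being read as a format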
   return os;
}
void main()
{
       cout << "hélène\n" << 4 << endl << "éèàçë" << endl;
}
/*
      hélène
      4
      éèàçë
*/


Thx, akalmani!

Author Comment

by:pongping
ID: 10951815
As we say in Dutch: de puntjes op de i (dotting the i's)!
This is a better version:

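#include <windows.h>
#include <iostream.h>
#include <stdio.h>
#include <string.h>

//Overload << operator of ostream for narrow strings
ostream& operator<<(ostream & os, char* szTekst)
{
   char szBron[260] = "";                     // zero-filled, so the result stays terminated
   size_t n = strlen(szTekst);
   if (n > sizeof(szBron) - 1)
      n = sizeof(szBron) - 1;                 // truncate instead of overflowing the buffer
   CharToOemBuff(szTekst, szBron, (DWORD)n);  // ANSI -> OEM code page
   printf("%s", szBron);                      // "%s" so a '%' in the text is not a format
   return os;
}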

int main()
{
       cout << "hélène\n" << 4 << endl << "éèàçë" << endl;
       return 0;

}
/*      Output:
            hélène
            4
            éèàçë
*/