tphipps
asked on
TCHAR implementation for UNIX?
I need to get a Unix-compatible implementation of TCHAR. Microsoft Visual C++ provides a full TCHAR generic character implementation. TCHAR datatypes are conditionally typedef'ed to wchar_t or char, depending on the setting of the -DUNICODE compiler directive. It deals with the following:
TCHAR -> wchar_t or char
LPTSTR -> LPWSTR or LPSTR
_tcslen -> wcslen or strlen
_T("abc") -> Unicode string L"abc" or ANSI string "abc"
_tprintf -> wprintf or printf
Is there a UNIX implementation of TCHAR? I thought it was part of the ANSI standard, and not just a Microsoft idea.
For example, I'd like to make this very simple program 100% portable between NT and UNIX, but retain its ability to be compiled with Unicode strings or non-Unicode strings based on the #defines at the top. This needs to be generic across multiple UNIX implementations (let's say Solaris and HP-UX). Is there a standard UNIX header that does this for me? Is there a 3rd party library that does this?
#define UNICODE
#define _UNICODE
#include <stdio.h>
#include <string.h>
#include <windows.h>
#include <tchar.h>
int main(void)
{
    TCHAR test[20];
    LPTSTR ptr;
    int n;
    /* _T() already yields a pointer to the literal; no & needed */
    _tcscpy(test, _T("ABCDEFG"));
    ptr = test;
    n = (int)_tcslen(test);
    _ftprintf(stdout, _T("Text is: %s and length is %d\n"), ptr, n);
    return 0;
}
Are you asking about
#include <wchar.h>
?
ASKER
Thanks ozo, but <wchar.h> isn't what I'm looking for. It's the wide character support header for Unix, but I'm looking for a generic tchar implementation. This means that if I code something along the lines of this:
TCHAR abc[10]=_T("abcdefg");
int x;
x=_tcslen(abc);
and then #define UNICODE, it should come out as
wchar_t abc[10]=L"abcdefg";
int x;
x=wcslen(abc);
but if I don't #define UNICODE then it should come out as
char abc[10]="abcdefg";
int x;
x=strlen(abc);
It's a pretty simple conversion, but a pain in the ass to write #define headers for given the sheer number of string functions, enclosed text etc. This is exactly what happens with Visual C++ 2.0 or greater under Win32. I know that the wchar_t/wcsxxx functions are standard, but how about tchar/tcsxxx?
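For what it's worth, the conversion described above can be sketched in a few lines of C. This is only a minimal sketch under some assumptions, not Microsoft's TCHAR.H: it covers just TCHAR, _T and _tcslen, the names T_PASTE and sample_length are made up for illustration, and strictly speaking identifiers that start with an underscore are reserved for the implementation, so a real header might want different spellings.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for a tchar.h on Unix (illustration only). */
#ifdef _UNICODE
typedef wchar_t TCHAR;
#define T_PASTE(x) L##x          /* helper name is an assumption */
#define _T(x)      T_PASTE(x)    /* two levels so macro arguments expand first */
#define _tcslen    wcslen
#else
typedef char TCHAR;
#define _T(x)      x
#define _tcslen    strlen
#endif

/* The same source line compiles as char or wchar_t code. */
int sample_length(void)
{
    TCHAR abc[10] = _T("abcdefg");
    return (int)_tcslen(abc);
}
```

Compiling with -D_UNICODE switches the typedef and both macros at once; every further string routine would need its own pair of #define lines in the same style.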
Look up "Using Generic-Text Mappings" in the MSDev Help; you'll find that a lot of these are Microsoft extensions (I'll try to copy the topic, but it probably won't format well).
This is not to say another compiler won't either (a) have its own extensions, or (b) even have something the same as the MS extensions. But you can't rely on all standard compilers having this functionality.
(begin quote)
To simplify code development for various international markets, the Microsoft run-time library provides Microsoft-specific "generic-text" mappings for many data types, routines, and other objects. These mappings are defined in TCHAR.H. You can use these name mappings to write generic code that can be compiled for any of the three kinds of character sets: ASCII (SBCS), MBCS, or Unicode, depending on a manifest constant you define using a #define statement. Generic-text mappings are Microsoft extensions that are not ANSI compatible.
Preprocessor Directives for Generic-Text Mappings
#define | Compiled Version | Example
_UNICODE | Unicode (wide-character) | _tcsrev maps to _wcsrev
_MBCS | Multibyte-character | _tcsrev maps to _mbsrev
None (the default: neither _UNICODE nor _MBCS defined) | SBCS (ASCII) | _tcsrev maps to strrev
For example, the generic-text function _tcsrev, defined in TCHAR.H, maps to _mbsrev if _MBCS has been defined in your program, or to _wcsrev if _UNICODE has been defined. Otherwise _tcsrev maps to strrev.
The generic-text data type _TCHAR, also defined in TCHAR.H, maps to type char if _MBCS is defined, to type wchar_t if _UNICODE is defined, and to type char if neither constant is defined. Other data type mappings are provided in TCHAR.H for programming convenience, but _TCHAR is the type that is most useful.
Generic-Text Data Type Mappings
Generic-Text Data Type Name | SBCS (_UNICODE, _MBCS Not Defined) | _MBCS Defined | _UNICODE Defined
_TCHAR | char | char | wchar_t
_TINT | int | int | wint_t
_TSCHAR | signed char | signed char | wchar_t
_TUCHAR | unsigned char | unsigned char | wchar_t
_TXCHAR | char | unsigned char | wchar_t
_T or _TEXT | No effect (removed by preprocessor) | No effect (removed by preprocessor) | L (converts following character or string to its Unicode counterpart)
For a complete list of generic-text mappings of routines, variables, and other objects, see Appendix B, Generic-Text Mappings.
The following code fragments illustrate the use of _TCHAR and _tcsrev for mapping to the MBCS, Unicode, and SBCS models.
_TCHAR *RetVal, *szString;
RetVal = _tcsrev(szString);
If _MBCS has been defined, the preprocessor maps the preceding fragment to the following code:
char *RetVal, *szString;
RetVal = _mbsrev(szString);
If _UNICODE has been defined, the preprocessor maps the same fragment to the following code:
wchar_t *RetVal, *szString;
RetVal = _wcsrev(szString);
If neither _MBCS nor _UNICODE has been defined, the preprocessor maps the fragment to single-byte ASCII code, as follows:
char *RetVal, *szString;
RetVal = strrev(szString);
Thus you can write, maintain, and compile a single source code file to run with routines that are specific to any of the three kinds of character sets.
See Also Generic-text mappings, Data type mappings, Constants and global variable mappings, Routine mappings, A sample generic-text program
END Microsoft Specific
(end quote)
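One thing to watch for on Unix: strrev and _wcsrev, used as the mapping targets in the quoted text, are not part of standard C, so a home-grown header can't simply point _tcsrev at them. A rough sketch of a replacement follows; the names tchar_len and tchar_rev are invented here, and the element-by-element swap is only right for single-byte strings and for wide strings where one wchar_t is one character. It is not an equivalent of _mbsrev.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

#ifdef _UNICODE
typedef wchar_t TCHAR;
#define tchar_len wcslen   /* hypothetical name */
#else
typedef char TCHAR;
#define tchar_len strlen
#endif

/* Hypothetical stand-in for _tcsrev: reverses s in place and returns s. */
TCHAR *tchar_rev(TCHAR *s)
{
    size_t i;
    size_t n = tchar_len(s);

    for (i = 0; i < n / 2; ++i) {
        TCHAR tmp = s[i];
        s[i] = s[n - 1 - i];
        s[n - 1 - i] = tmp;
    }
    return s;
}
```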
ASKER
Answers2000, I think you have it. The key phrase was "Microsoft-specific". Looks like I'm going to need to write my own TCHAR.H equivalent for my Unix platforms. Re-sub this as an answer and you have the points. Thanks!
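A detail such a header would probably have to cover is the printf family. In the Microsoft run-time, %s inside a wide format string means a wide string, but in standard C the wide printf functions treat %s as a char string and need %ls for wchar_t. The sketch below is one possible way around that, under the assumption that snprintf and swprintf are available on the target platforms; tchar_snprintf, TS_FMT, T_PASTE and format_text are made-up names.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

#ifdef _UNICODE
typedef wchar_t TCHAR;
#define T_PASTE(x)     L##x
#define _T(x)          T_PASTE(x)
#define tchar_snprintf swprintf    /* hypothetical name */
#define TS_FMT         _T("%ls")   /* standard C: %ls takes wchar_t* */
#else
typedef char TCHAR;
#define _T(x)          x
#define tchar_snprintf snprintf
#define TS_FMT         _T("%s")
#endif

/* Writes "Text is: <s>" into out (cap counts TCHAR elements) and
   returns whatever the underlying formatting call returns. */
int format_text(TCHAR *out, size_t cap, const TCHAR *s)
{
    /* adjacent literals are concatenated, so TS_FMT slots into the format */
    return tchar_snprintf(out, cap, _T("Text is: ") TS_FMT, s);
}
```

Keeping the string conversion in one macro means the format strings in the program itself stay identical across both builds.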