amitd

Changing the Character symbol mapping.

Can characters with an ANSI value above 127 be mapped to different character symbols? If yes, how does one do it?
Is there such a thing as a character table, and does the Windows OS / hardware define more than one? If yes, how does one switch to a different character table?
CJ_S

You call it a character table, while it is in fact a code page.

A system can have several code pages installed. Whether you can set them depends on whether the language supports it. What programming language are you using?

regards,
CJ
Neutron

Can you give more details on what you're trying to do?

Greetings,
    Ntr:)
amitd

ASKER

CJ,

I am using VC++.

How will I set the code page?

Regards,

amit
I have an MSDN article here; if you want, I can send it to you. It's too much to show here, but it contains exactly the info you would like. One thing in advance: the code page / character set is for the whole system. The only thing you can actually do is map the characters entered to another code page.

What is your email?

(I couldn't find the article at MSDN online...so...)

regards,
CJ
amitd

ASKER

My email Id is shuklas@mahindrabt.com.

Regards,

amit
Email sent
Note that "code page" is a Microsoft term, and its interpretation varies (for example, by OS).

For a character set, whether extended or expanded, the implementation really depends on the receiving platform. For example, usage differs between modems, monitors, and printers.

For VC, the MSDN article will likely suffice, but I have no details on it, although I did try to access it through the latest VS as well.
amitd

ASKER

Below I have given details of what I am doing; I hope this makes my requirement clear.

I have taken the HP DeskJet printer driver sample from the Win 98 DDK. This sample has a file, Minidrv.c, in which the function ExtTextOut (Export Ordinal 14) has various parameters. One of them is "lpStr" (the fifth parameter), which gives me the character string to be printed. Each character is represented by a two-byte code. The "Graphic Device Interface Reference" document, provided along with the 98 DDK, says that the two-byte code is a "Glyph index" into the character-offset table.

Interestingly, the difference between the Glyph index and the ANSI code for characters from 32 to 126 inclusive is 29.
Example: For "SPACE", the Glyph index is 3 whereas in ANSI, it is 32. This is not true for characters greater than 126. There is no consistency for characters greater than 126.
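
The offset described above can be written down as a small helper. This is only a sketch: it assumes the offset of 29 holds for this one font and only for codes 32 to 126, as observed. The names `glyphToAnsiByOffset` and `kObservedOffset` are made up for illustration and are not part of the DDK sample.

```cpp
#include <cassert>
#include <cstdint>

// Offset observed for this one font: glyph index 3 corresponds to SPACE (ANSI 32).
// Assumption: other fonts may use a different offset or none at all.
const int kObservedOffset = 29;

// Hypothetical helper: returns the ANSI code for a glyph index, or -1 when the
// result falls outside 32..126, where no consistent offset was observed.
int glyphToAnsiByOffset(std::uint16_t glyph) {
    const int code = static_cast<int>(glyph) + kObservedOffset;
    if (code < 32 || code > 126) {
        return -1;  // outside the range where the fixed offset was seen
    }
    return code;
}
```

Anything that comes back as -1 still needs a real lookup table.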

I want to get ANSI / UNICODE from the glyph indexes. Please help me out with this. It's really urgent.

Among the numerous parameters you mentioned, is one of them the name of the font being used?

Some comments:
- one font can contain several character tables
- for each character in a table, the font selects a Glyph which represents it
- some of these character tables can (and usually do) contain characters which look just the same.
- characters from one table or from different tables can share the same Glyph.
- a font can contain Glyphs that are not used in any character table.

This means that the font designer can do whatever he wishes: he can put Glyphs in a different order, and he can have duplicates, unused Glyphs or even missing Glyphs.

Later, in the definition of one of the character tables, he can select Glyphs in the correct order, forming a table in which the n-th position holds the Glyph index for the n-th character.

For the TrueType font format, those tables are located inside the font file.

If you know the font file name, or at least the font name (that's why I was asking), you can examine the TTF header and locate the US ASCII standard character table, which contains the Glyph index for each character.
Once you have the table, you can look for the Glyph index in it. The position in the table where you find it is the character code you're looking for.
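
A minimal sketch of that lookup, assuming the character table has already been read into an array where position n holds the Glyph index for character code n. The function name `findCharForGlyph` and the vector layout are assumptions made for illustration; they are not taken from the TTF specification or the DDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: charToGlyph[code] is the glyph index the font assigns
// to character code 'code'. Returns the first character code whose entry
// equals 'glyph', or -1 if the glyph is not referenced by this table.
int findCharForGlyph(const std::vector<std::uint16_t>& charToGlyph,
                     std::uint16_t glyph) {
    for (std::size_t code = 0; code < charToGlyph.size(); ++code) {
        if (charToGlyph[code] == glyph) {
            return static_cast<int>(code);
        }
    }
    return -1;  // glyph exists in the font but no character maps to it
}
```

Because several characters can share one Glyph, this returns only the first match; the reverse mapping is not always unique.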

The TrueType font specification (almost 500KB) can be found somewhere on the Microsoft site, but you don't need to analyze the file structure in depth; just look for the part about tables.
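
To give an idea of how little of the file structure is needed, here is a sketch that walks the table directory at the start of a TrueType file and returns where a given table (for example "cmap", the character-to-glyph mapping table) lives. It assumes the whole file is already in memory and is a single-font .ttf, not a collection. `TableLocation` and `findTable` are illustrative names, and the sketch does no checksum validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct TableLocation {
    bool found;
    std::uint32_t offset;  // byte offset of the table from the start of the file
    std::uint32_t length;  // table length in bytes
};

// TrueType stores all multi-byte numbers big-endian.
static std::uint16_t readU16(const std::vector<unsigned char>& d, std::size_t p) {
    return static_cast<std::uint16_t>((d[p] << 8) | d[p + 1]);
}

static std::uint32_t readU32(const std::vector<unsigned char>& d, std::size_t p) {
    return (static_cast<std::uint32_t>(d[p]) << 24) |
           (static_cast<std::uint32_t>(d[p + 1]) << 16) |
           (static_cast<std::uint32_t>(d[p + 2]) << 8) |
           static_cast<std::uint32_t>(d[p + 3]);
}

// Hypothetical helper: scans the table directory for a 4-character tag.
// Layout: 12-byte header (numTables at byte 4), then 16-byte records of
// tag, checksum, offset, length.
TableLocation findTable(const std::vector<unsigned char>& font, const char* tag) {
    assert(tag != nullptr);
    TableLocation result = {false, 0, 0};
    if (font.size() < 12) {
        return result;  // too small to hold the header
    }
    const std::uint16_t numTables = readU16(font, 4);
    for (std::uint16_t i = 0; i < numTables; ++i) {
        const std::size_t rec = 12 + 16 * static_cast<std::size_t>(i);
        if (rec + 16 > font.size()) {
            break;  // truncated directory
        }
        if (std::memcmp(&font[rec], tag, 4) == 0) {
            result.found = true;
            result.offset = readU32(font, rec + 8);
            result.length = readU32(font, rec + 12);
            return result;
        }
    }
    return result;
}
```

The same routine finds the "name" table, which holds the font name strings.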


If you don't have the font file name, but only have the full font name (which is different), you have to scan your whole fonts directory and extract the font name field from every font (this is also described in the TTF font file specification). Even then you can have problems, because not all fonts are copied into the Fonts folder.

One more problem can occur:
If the original text contains characters that don't have a Glyph in the font that was used, you will get one and the same "missing glyph" index for all of them (typically it looks like a small square - you've probably seen it on the Web on some non-English sites - a lot of small squares instead of text).

- - -

If you want to convert the codes of one specific font (some standard font like Courier New or Times New Roman), it is cheaper to send an input text of your own which contains the full character set and see which glyphs you get, so you can make a correspondence table between glyphs and characters.
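
Building that correspondence table could look roughly like this. The sketch assumes you have already collected (character sent, glyph index received) pairs by printing a test text. The names `Observation` and `buildGlyphToChar` are made up, and the "missing glyph" index is passed in as a parameter because its value is an assumption that must be checked for the font in question (in TrueType files glyph 0 conventionally plays that role).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// One observation: (character code sent, glyph index seen by the driver).
typedef std::pair<unsigned char, std::uint16_t> Observation;

// Hypothetical helper: builds a glyph -> character table from observations.
// Glyphs equal to 'missingGlyph' are skipped because they carry no information
// about the original character. When several characters share a glyph, the
// first one observed is kept.
std::map<std::uint16_t, unsigned char> buildGlyphToChar(
        const std::vector<Observation>& observed,
        std::uint16_t missingGlyph) {
    std::map<std::uint16_t, unsigned char> table;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const unsigned char ch = observed[i].first;
        const std::uint16_t glyph = observed[i].second;
        if (glyph == missingGlyph) {
            continue;  // character has no glyph in this font
        }
        // insert() leaves an existing entry untouched, so the first
        // character seen for a shared glyph wins.
        table.insert(std::make_pair(glyph, ch));
    }
    return table;
}
```

Sending the characters in ascending code order means a shared glyph resolves to the lowest character code.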

- - -

To illustrate why this is so complicated, I will only say this:
Your font doesn't have to be a text font; it can contain symbols, decoration elements or clipart.


If you get stuck with TTF format, just scream :o)

Greetings,
    Ntr:)
amitd

ASKER

Ntr:

How do I get the font name / font file name from the parameters of "ExtTextOut"? Please help, I was not able to get the information from the "ExtTextOut" parameters.

Regards,

amitd
amitd

ASKER

Hi,

I got the font name. Can I get the file name for the same font, without opening every file to check which file the font belongs to?

Thanks,

amitd