sam15

XML Entity References

A few questions on XML and Oracle:

1. What are entity references in XML? Are they only required for special characters?
2. Are those "&#nnnn" entities the same as the ASCII representation?
3. Does valid XML require special symbols/diacritics to be written as numeric entity references?
4. How can I get the special characters from an Oracle database output as those numeric entities so that XML parsers can read them? The parsers can't seem to read the special characters themselves.

Thanks.
Sean Stuber

1- They are abbreviations (named entities) or numeric codes that represent a specific character. They are mainly used for special characters. However, while many named entities are defined, not all XML processors support the various standards that define them. Only five are part of the XML standard itself and thus reasonably "guaranteed" to be supported: &apos; &quot; &amp; &lt; &gt;

2- No, the numbers are not necessarily the same as the ASCII codes. They are Unicode code points, which match ASCII only in the 0-127 range, and many of them aren't ASCII characters at all. The playing card symbols, for instance, are not ASCII characters.

3 - I'm not sure what you mean by valid XML printing. XML doesn't print anything. XML is just text in a special format.

4 - You can iterate through the characters using the SUBSTR function and translate them yourself into the codes you want.

Wikipedia has a good starting reference and description of the codes.

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
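To make point 4 concrete, here is a minimal sketch of the "walk the characters and translate them yourself" idea. It is written in Python rather than PL/SQL purely for illustration, and the function name `to_numeric_refs` and the sample title are made up for this example. It assumes the text has already been read from the database as proper Unicode text.

```python
from xml.sax.saxutils import escape


def to_numeric_refs(text):
    """Return text that is safe to embed in XML using only ASCII.

    Hypothetical helper for illustration: escapes & < > first, then
    replaces every character above 127 with a decimal numeric
    character reference such as &#233;.
    """
    out = []
    for ch in escape(text):      # escape() turns & < > into &amp; &lt; &gt;
        code = ord(ch)           # Unicode code point of this character
        if code < 128:
            out.append(ch)       # plain ASCII passes through unchanged
        else:
            out.append("&#%d;" % code)
    return "".join(out)


# Sample title invented for this example.
print(to_numeric_refs("Café & Crème ♠"))
# prints: Caf&#233; &amp; Cr&#232;me &#9824;
```

The same result can be had in one line with Python's built-in `xmlcharrefreplace` error handler: `escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")`. The explicit loop is shown because it mirrors the SUBSTR approach described above.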
sam15

ASKER

Thanks. Let me describe the issue and what I need.

1. User A goes to a client screen (either PowerBuilder or HTML) and enters a book title. This title may contain special characters, diacritics, or Latin characters. Users usually enter them with ALT+NNNN. I believe this is the standard way of entering those diacritics on Windows (not sure).

http://www.forlang.wsu.edu/help/keyboards.asp

http://www.wyrdplay.org/AlanBeale/keyboard.html

2. Correct me if I am wrong, but I think the Oracle 9i database with a WE8 character set saves those diacritics in ASCII format. I think it finds the ASCII equivalent (not sure whether it is the same as the code the user types in) and saves it in the DB (one byte per character).

3. Now I need to get an XML file output in the browser so user B can save it and export it to another desktop publishing application. From what I am reading, XML parsers fail to read those characters, and they have to be translated to &#nnnn or XML entity format.

http://www.devx.com/tips/Tip/14068

So basically what I need to do is somehow get Oracle to output valid XML so that it can be understood by other parsers.

Thanks for your help.
Sean Stuber

Which characters are you referring to? Anything in the 0-127 range should be fine because that's standard ASCII.

The 128-255 range is extended ASCII. The "ALT-xxx" method is a Windows/DOS convention. You will have to translate those characters yourself with a function that converts the extended characters to whatever character set you want.
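As a rough illustration of why the 128-255 range needs translating, the sketch below decodes the same three bytes under two single-byte Western European encodings. It assumes the database character set is one of Oracle's WE8 sets such as WE8ISO8859P1 (ISO-8859-1) or WE8MSWIN1252 (Windows-1252); which one actually applies here is not stated in the thread. The helper name `code_points` is made up for this example.

```python
def code_points(raw, charset):
    """Decode single-byte data and return the Unicode code point of
    each character. Those code points are the numbers that belong in
    an &#nnnn; reference."""
    return [ord(ch) for ch in raw.decode(charset)]


# Three stored bytes: 'A' (0x41), e-acute (0xE9), and byte 0x80.
raw = bytes([0x41, 0xE9, 0x80])

print(code_points(raw, "latin-1"))   # ISO-8859-1
# prints: [65, 233, 128]

print(code_points(raw, "cp1252"))    # Windows-1252
# prints: [65, 233, 8364]
```

For most accented letters the stored byte value and the code point happen to be equal, but byte 0x80 shows they can differ: under Windows-1252 it is the euro sign, code point 8364. So the number in the reference has to come from the character set's mapping, not from the raw byte.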
sam15

ASKER

sdstuber

I am a little confused. First, from a client and DB perspective, how do you enter diacritics in your UI, and how does the database (WE8 character set) really save them?
Can you simulate this in a small SQL*Plus example?