XML Entity References

A few questions on XML and Oracle:

1. What are entity references in XML? Are they only required for special characters?
2. Are those "&#nnnn;" entities the same as the ASCII representation?
3. Does valid XML require special symbols/diacritics to be written as numeric entity references?
4. How can I get the special characters from an Oracle database written out as those numeric entities so that XML parsers can read them? The parsers can't seem to read the special characters themselves.

Thanks.
sam15 Asked:

sdstuber Commented:
1 - They are abbreviations or numerical codes that represent a specific character. Yes, they are only for special characters. However, while there are many codes defined, not all XML processors support the various standards that define them. Only a handful are part of the XML standard itself and thus reasonably "guaranteed" to be supported (&apos; &quot; &amp; &lt; &gt;).

2 - No, the numbers are not necessarily the same as the ASCII codes. They are Unicode code points, which match ASCII only in the 0-127 range, and many of them aren't ASCII characters at all. The playing card symbols, for instance, are not ASCII characters.

3 - I'm not sure what you mean by valid XML printing. XML doesn't print anything. XML is just text in a special format.

4 - You can iterate through the characters using the SUBSTR function and translate them yourself into the codes you want.
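
As a rough illustration of point 4, here is the same character-by-character idea sketched in Python rather than PL/SQL, purely to show the logic. The function name `to_numeric_refs` is made up for this sketch, and it assumes the text has already been decoded correctly into Unicode, so that each character's number is its Unicode code point.

```python
def to_numeric_refs(text: str) -> str:
    """Escape XML markup characters and replace every non-ASCII
    character with a decimal numeric character reference."""
    markup = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
    out = []
    for ch in text:  # one character at a time, like SUBSTR(str, i, 1) in a loop
        if ch in markup:
            out.append(markup[ch])       # markup characters use the named entities
        elif ord(ch) > 127:
            out.append(f"&#{ord(ch)};")  # ord() returns the Unicode code point
        else:
            out.append(ch)               # plain ASCII passes through unchanged
    return "".join(out)


# "\u00e9" is e-acute, "\u00eb" is e-diaeresis, "\u2660" is the spade suit symbol
print(to_numeric_refs("Caf\u00e9 & Bront\u00eb \u2660"))
# prints: Caf&#233; &amp; Bront&#235; &#9824;
```

Note that the spade comes out as 9824, which is far outside any single-byte character set. That is the point made in answer 2.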

Wikipedia has a good starting reference and description of the codes:

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
sam15 Author Commented:
Thanks. Let me describe the issue and what I need.

1. User A goes to a client screen (either PowerBuilder or HTML) and enters a book title. The title may contain special characters, diacritics, or Latin characters. They usually enter them using ALT+NNNN. I believe this is the standard way of entering those diacritics on Windows (not sure).

http://www.forlang.wsu.edu/help/keyboards.asp

http://www.wyrdplay.org/AlanBeale/keyboard.html

2. Correct me if I am wrong, but I think the Oracle 9i database, in a WE8 character set, saves those diacritics in ASCII format. I think it finds the ASCII equivalent (not sure whether it is the same as the one the user types in) and saves it into the DB (one byte per character).

3. Now I need to get an XML file printed in the browser so user B can save it and export it to another desktop publishing application. From what I am reading, XML parsers fail to read those characters, and they have to be translated to &#nnnn; or XML entity format.

http://www.devx.com/tips/Tip/14068

So basically what I need to do is somehow get Oracle to print valid XML so that it can be understood by other parsers.

Thanks for your help.
sdstuber Commented:
Which characters are you referring to? Anything in the 0-127 range should be fine because that's standard ASCII.

128-255 is extended ASCII, and what those values mean depends on the character set in use. The "ALT-xxx" method is a Windows/DOS convention. You will have to translate them yourself with a function that converts the extended characters to whatever character set you want.
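
To make the 128-255 point concrete, here is a small Python sketch (not Oracle code) that decodes a single byte under a given character set and reports the numeric reference for the resulting character. The helper name `byte_to_ref` is hypothetical, and the ALT+130 line assumes the keyboard input used DOS code page 437.

```python
def byte_to_ref(byte_value: int, charset: str) -> str:
    """Decode one byte in the given character set and return the
    decimal numeric character reference for the resulting character."""
    ch = bytes([byte_value]).decode(charset)
    return f"&#{ord(ch)};"


print(byte_to_ref(130, "cp437"))    # ALT+130 in DOS code page 437 is e-acute: &#233;
print(byte_to_ref(0xE9, "cp1252"))  # the same letter stored as byte 233 in Windows-1252: &#233;
print(byte_to_ref(0x80, "cp1252"))  # byte 128 in Windows-1252 is the euro sign: &#8364;
```

The same letter can sit at different byte values depending on the character set, and the byte value is not always the number that goes into the reference, so the translation has to know which character set the bytes are in.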
sam15 Author Commented:
sdstuber

I am a little confused. First, from a client and DB perspective, how do you enter diacritics in your UI, and how does the database (WE8 character set) really save them?
Can you simulate this in a small SQL*Plus example?
sdstuber Commented:
Using character sets is sort of like numeric precision.

If you try to stick 3.14159265 into an integer you can do it, but if you read the integer back out, you'll get 3.

The same goes for character sets. If you are reading characters in one character set, then to maintain the integrity of your characters you should use the CONVERT function, explicitly state from which set and to which set, and test all of the characters you will be using.

If the conversion shows a character losing integrity, then you will have to change either the client-side or DB-side character set, or possibly both.
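-- CONVERT(char, dest_char_set, source_char_set): treat the literal as US7ASCII and convert it to WE8MSWIN1252
SELECT   CONVERT('é', 'WE8MSWIN1252', 'US7ASCII') FROM DUAL;

-- Reverse direction, WE8MSWIN1252 to US7ASCII: 7-bit ASCII has no é, so the original character is not preserved
SELECT   CONVERT('é', 'US7ASCII', 'WE8MSWIN1252') FROM DUAL;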
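
The same "does it survive the trip" test can be sketched outside the database. This is a Python analogue, not Oracle's CONVERT, and Oracle may choose a different replacement character than Python does; the helper name `survives` is made up for the example.

```python
def survives(text: str, charset: str) -> bool:
    """Return True if text comes back unchanged after being stored
    in the given character set and read out again."""
    encoded = text.encode(charset, errors="replace")  # unmappable characters become '?'
    return encoded.decode(charset) == text


e_acute = "\u00e9"  # the letter e with an acute accent

print(survives(e_acute, "cp1252"))  # True: Windows-1252 has this letter
print(survives(e_acute, "ascii"))   # False: 7-bit ASCII cannot hold it
print(e_acute.encode("ascii", errors="replace"))  # b'?'
```

Running a check like this over every character you expect in the titles shows which ones a given character set would lose.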
