loucker

[DOM] How to escape special character

Hi guys,

I get an exception when I try to parse an XML document with Microsoft.XMLDOM. The XML contains the special character &eacute; (é).

I would like to know how to escape this character. I can't use CDATA.

Thanks in advance.
ASKER CERTIFIED SOLUTION
leakim971
SOLUTION
But I need to give you some more background.

&eacute; is a reference to an XML general entity.
If you use an entity in XML, it needs to be declared at the front of your document
(see the example below).
Since Unicode was not generally supported in SGML, non-ASCII characters were added using entity declarations in the DTD, similar to the example below.
HTML inherited this approach. The only difference is that you don't need the entity declarations in a browser, because they are built in.
That is why you will find many HTML files without entity declarations that are still valid.
In XML, only &amp;, &lt;, &gt;, &quot; and &apos; can be used without a declaration (they are built in).
All other entities need a declaration.
But since XML supports full Unicode, you can always add a character numerically, as in my example above,
and there is no need to add the complexity of general entities to your XML, at least not for characters.
In short, the XML you are trying to parse is not valid (strictly speaking, not well-formed), or at least incomplete.
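
To see the difference between declared, built-in and numeric forms without MSXML at hand, here is a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree parser as a stand-in for Microsoft.XMLDOM (the error messages will differ, the well-formedness rule is the same), and the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the stand-in parser accepts xml_text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Named entity reference with no declaration anywhere in the document.
undeclared = "<root>&eacute;</root>"
# Numeric character reference: needs no declaration.
numeric = "<root>&#233;</root>"
# The five built-in entities: need no declaration either.
builtin = "<root>&amp; &lt; &gt; &quot; &apos;</root>"

for label, doc in [("undeclared", undeclared),
                   ("numeric", numeric),
                   ("builtin", builtin)]:
    print(label, is_well_formed(doc))
```

Only the first document should be rejected; the other two parse without any DOCTYPE.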

If there are only a few entities that you need to replace,
you can replace them with their numeric equivalents,
as I showed you in my previous comment.
If you have plenty of them, just add the full set of entity declarations, as in XHTML 1.0
http://www.w3.org/TR/xhtml1/#h-A2
to your XML.
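
If the replacement has to be done on many files, it can be scripted. The following is only a rough sketch, assuming Python and its html.entities.name2codepoint table (the HTML 4 entity names) as the lookup; the function name to_numeric_refs is hypothetical. It works on the raw text before parsing, leaves the five built-in XML entities and any unknown names untouched, and does not know about CDATA sections or comments, so it would rewrite matches there too.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser understands without a declaration.
XML_BUILTIN = {"amp", "lt", "gt", "quot", "apos"}

# A named entity reference such as &eacute; or &frac12;
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_refs(xml_text):
    """Replace known HTML entity names with numeric character references."""
    def repl(match):
        name = match.group(1)
        # Keep built-in XML entities and names we cannot resolve as they are.
        if name in XML_BUILTIN or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return ENTITY_RE.sub(repl, xml_text)


if __name__ == "__main__":
    print(to_numeric_refs("<root>caf&eacute; &amp; th&egrave;</root>"))
```

The result can then be handed to the parser unchanged, since numeric references need no declaration.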

So you have 2 options:
1. Replace each character entity with its numeric equivalent.
2. Add entity declarations to your XML, as in the example below (root needs to be the root element of your XML).

I would avoid leakim971's solution, since it alters the document and forces you into postprocessing.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY eacute "&#233;">
]>
<root>&eacute;</root>
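
As a quick check that the document above parses once the entity is declared, here is a small sketch. It again assumes Python's standard-library xml.etree.ElementTree in place of Microsoft.XMLDOM, and root_text is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Same document as above: the internal DTD subset declares eacute.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY eacute "&#233;">
]>
<root>&eacute;</root>
"""


def root_text(xml_text):
    """Parse xml_text and return the text content of its root element."""
    # Pass bytes so the encoding declaration in the prolog applies.
    return ET.fromstring(xml_text.encode("utf-8")).text


if __name__ == "__main__":
    print(root_text(DOC))
```

The parser expands the declared entity itself, so no postprocessing of the document is needed.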
