Solved

How do I handle &nbsp; in a stylesheet?

Posted on 2007-11-28
Last Modified: 2013-11-18
I have a stylesheet that accepts XML (which happens to be HTML) from some process. The problem is that the process generates &nbsp;, which causes the XSLT parser to crash.

From looking on the web I see that &nbsp; is not allowed. One suggestion was to put the following ENTITY declaration in the header:

<!ENTITY nbsp CDATA "&#160;" >

<?xml version="1.0" encoding="Windows-1252" ?>
xsl code...

.. but I now get an XSLT compile error. As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle. I suspect that the post I found on the web missed something from the above declaration. Can anyone correct the above or suggest another way for my XSLT to handle &nbsp;?

Thanks in advance :)
Question by:paddycobbett

Expert Comment

by:Geert Bormans
ID: 20366529
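Yes, you are not doing it 100% right. The entity declaration has to go inside a DOCTYPE declaration, after the XML declaration, and in XML the declaration takes no CDATA keyword (that is SGML syntax).
It should be like this:

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp "&#160;">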
]>
<xsl:stylesheet ...


Accepted Solution

by:R7AF
R7AF earned 350 total points
ID: 20366539
&nbsp; is not a predefined entity in XML (only &lt;, &gt;, &amp;, &apos; and &quot; are), so it must be declared inside the XML before it can be used. That seems not to be the case here, so the XML is not well-formed. You could read the XML into a string and replace &nbsp; with &#160;

See http://www.experts-exchange.com/Q_22526834.html
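
For anyone who wants to try the string-replacement route, here is a minimal sketch in Python. The thread does not say what language the generating process uses, so the language, the function name `replace_nbsp` and the sample input are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def replace_nbsp(xml_text: str) -> str:
    """Swap the undeclared &nbsp; entity for the numeric reference &#160;.

    Hypothetical helper: numeric character references need no declaration,
    so the result can be parsed without a DOCTYPE.
    """
    return xml_text.replace("&nbsp;", "&#160;")


# Hypothetical input, standing in for what the process generates.
raw = "<root><p>a&nbsp;b</p></root>"

fixed = replace_nbsp(raw)
root = ET.fromstring(fixed)  # parsing `raw` directly would raise ParseError
text = root.find("p").text   # contains U+00A0 between "a" and "b"
```

The replacement touches the whole document, but it is a plain substring swap and does not require parsing anything first.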

Expert Comment

by:Geert Bormans
ID: 20366574
> As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle

Also make sure that the original XML has this declaration,
because the XSLT only needs this DOCTYPE declaration if you use &nbsp; in the stylesheet itself.

The XML parser expands &nbsp; to the character it is declared as, before the XSLT processor gets the document.
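
A small way to see the parser doing this expansion, sketched with Python's standard `xml.etree.ElementTree` (an assumption; the thread does not name a parser). The sample document and the root element name `root` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document that declares nbsp in its internal subset.
DOC = """<!DOCTYPE root [
<!ENTITY nbsp "&#160;">
]>
<root>a&nbsp;b</root>"""

# The parser resolves &nbsp; while reading the document, so the tree
# (and anything consuming it, such as an XSLT processor) only ever
# sees the character U+00A0.
root = ET.fromstring(DOC)
text = root.text

# The SGML-style form with the CDATA keyword is not XML and is rejected.
BAD_DOC = DOC.replace('nbsp "', 'nbsp CDATA "')
try:
    ET.fromstring(BAD_DOC)
    cdata_form_accepted = True
except ET.ParseError:
    cdata_form_accepted = False
```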

Author Comment

by:paddycobbett
ID: 20366583
The suggestion by R7AF was a last resort; I had considered filtering that value out in the process. If I can handle it in the stylesheet, that would be ideal. Gertone, you gave me a corrected version, but it still results in the same error :S

Author Comment

by:paddycobbett
ID: 20366594
So the xml coming in should also have:

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp CDATA "&#160;" >
]>

?

Assisted Solution

by:Geert Bormans
Geert Bormans earned 150 total points
ID: 20366947
Yes, correct, except that the name in the DOCTYPE should match the root element of the incoming XML (and the entity declaration takes no CDATA keyword):
<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE root [
<!ENTITY nbsp "&#160;">
]>

If it does not have that, it is not well-formed XML,
which would mean that it is likely still HTML.

The best way to turn HTML into well-formed XML, so that you can handle it with XSLT, is to preprocess it with TagSoup.
Google for "download tagsoup" to get a copy
(it also handles encodings etc. correctly).
After TagSoup you have well-formed XML and you can process it with XSLT.

I hesitate to recommend R7AF's approach, since that would mean processing the full XML.
I would recommend stripping the XML declaration off, finding the root element and adding the DOCTYPE declaration.
That way you don't have to process the full file, only the first line (or two).
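
A rough sketch of that idea in Python, under several assumptions: the language itself, the hypothetical helper name `add_nbsp_doctype`, input that is already a decoded string, no existing DOCTYPE, and no comment before the root element that contains something looking like a start tag.

```python
import re
import xml.etree.ElementTree as ET

# Matches a leading XML declaration such as <?xml version="1.0" ... ?>.
XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>\s*")
# Matches the first start tag and captures its name (skips <!-- and <?).
ROOT_NAME = re.compile(r"<([A-Za-z_][\w.:-]*)")


def add_nbsp_doctype(xml_text: str) -> str:
    """Drop the XML declaration and prepend a DOCTYPE that declares nbsp.

    Hypothetical helper. Only the start of the text is inspected; the
    body of the document is passed through unchanged.
    """
    body = XML_DECL.sub("", xml_text, count=1)
    match = ROOT_NAME.search(body)
    if match is None:
        raise ValueError("no root element found")
    root_name = match.group(1)
    doctype = f'<!DOCTYPE {root_name} [\n<!ENTITY nbsp "&#160;">\n]>\n'
    return doctype + body


# Hypothetical input resembling what the thread describes.
src = (
    '<?xml version="1.0" encoding="Windows-1252" ?>\n'
    "<html><body>a&nbsp;b</body></html>"
)
patched = add_nbsp_doctype(src)
body_text = ET.fromstring(patched).find("body").text
```

Dropping the declaration also drops the encoding hint, so this only makes sense once the text has been decoded; when working on raw bytes the declaration would have to be kept in front of the DOCTYPE instead.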

You don't necessarily have to automate that;
you can also change the process that generates your pseudo-XML.

cheers

Geert

Author Comment

by:paddycobbett
ID: 20367064
Thanks. Having investigated the code base I'm working on, it turned out to be more straightforward than I anticipated to insert code to strip out the &nbsp;.

Thanks for both suggestions, which I'm sure are valid. I've allocated more points to R7AF since that suggestion suited me best in this case.