Solved

How do I handle &nbsp; in a stylesheet?

Posted on 2007-11-28
Last Modified: 2013-11-18
I have a stylesheet which accepts XML (which happens to be HTML) from some process. The problem is that the process generates &nbsp;, which causes the XSL parser to crash.

From looking on the web I see that &nbsp; is not allowed in XML. One suggestion was to put the following ENTITY declaration in the header:

<!ENTITY nbsp CDATA "&#160;" >

<?xml version="1.0" encoding="Windows-1252" ?>
xsl code...

.. but I now get an XSLT compile error. As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle. I suspect that the post I found on the web missed something from the above declaration. Can anyone correct the above or suggest another way for my XSLT to handle &nbsp;?

Thanks in advance :)
Question by:paddycobbett
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 20366529
Yes, you are not doing it 100% right: the entity has to be declared inside a DOCTYPE, after the XML declaration.
It should be like this:

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp "&#160;">
]>
<xsl:stylesheet ...

 
LVL 13

Accepted Solution

by:
R7AF earned 1400 total points
ID: 20366539
&nbsp; is not one of XML's predefined entities, so it is only valid if the document declares it. That does not seem to be the case here, so the XML is not well-formed. You could read the XML into a string and replace &nbsp; with &#160;.

See http://www.experts-exchange.com/Q_22526834.html
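
A minimal sketch of that string-replacement idea, written in Python purely for illustration (the thread does not say which language the generating process uses). The function name `replace_nbsp` and the sample markup are made up for this example, and the plain substring replace would also touch any `&nbsp;` inside a CDATA section, which may or may not be what you want:

```python
import xml.etree.ElementTree as ET


def replace_nbsp(xml_text: str) -> str:
    """Swap the undeclared &nbsp; entity for its numeric character reference."""
    return xml_text.replace("&nbsp;", "&#160;")


# Hypothetical input: HTML-like markup that is well-formed apart from &nbsp;
raw = "<html><body><p>Hello&nbsp;world</p></body></html>"

fixed = replace_nbsp(raw)

# The fixed text now parses; the reference becomes U+00A0 (no-break space)
root = ET.fromstring(fixed)
paragraph_text = root.find("body/p").text
```

The result of `replace_nbsp` can then be handed to the XSLT processor in place of the original text.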
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 20366574
> As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle

Also make sure that the original XML has this declaration,
because the XSLT only needs this DOCTYPE declaration if you use &nbsp; in the stylesheet itself.

The XML parser expands &nbsp; to the declared character before the XSLT processor gets it.
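
To illustrate that the parser, not the stylesheet, is what resolves the entity, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The element names, the `try_parse` helper and the sample text are assumptions made for the example; the entity declaration uses the XML form, where the value follows the entity name directly:

```python
import xml.etree.ElementTree as ET

# Hypothetical input document that uses &nbsp; in its content
BODY = "<root><p>a&nbsp;b</p></root>"

# Internal DTD subset declaring the entity; the DOCTYPE name matches the root
DOCTYPE = '<!DOCTYPE root [<!ENTITY nbsp "&#160;">]>'


def try_parse(text):
    """Return the text of <p>, or None if the document is not well-formed."""
    try:
        return ET.fromstring(text).find("p").text
    except ET.ParseError:
        return None


without_decl = try_parse(BODY)          # undeclared entity, parse fails
with_decl = try_parse(DOCTYPE + BODY)   # entity expanded by the parser
```

Anything downstream of the parser only ever sees the expanded character, which is why the declaration has to be in the document that actually contains `&nbsp;`.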
 
LVL 1

Author Comment

by:paddycobbett
ID: 20366583
The suggestion by R7AF was a last resort; I had considered filtering that value out in the process. If I can handle it in the stylesheet, that would be ideal. Gertone, you gave me a corrected version, but it still results in the same error :S
 
LVL 1

Author Comment

by:paddycobbett
ID: 20366594
So the XML coming in should also have:

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp "&#160;">
]>

?
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 600 total points
ID: 20366947
Yes, correct, except that the DOCTYPE name should be different (it has to match the root element of the document):
<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE root [
<!ENTITY nbsp "&#160;">
]>

If the input does not have that, it is not well-formed XML,
which would mean that it is likely still HTML.

The best approach to get HTML into well-formed XML, so you can handle it with XSLT, is to preprocess it using TagSoup.
Google for "download tagsoup" to get a copy
(it also handles encodings etc. correctly).
After TagSoup you have well-formed XML and you can get it done with XSLT.

I hesitate to recommend R7AF's approach, since that would mean processing the full XML.
I would recommend stripping the XML declaration off, finding the root element and adding the DOCTYPE declaration.
That way you don't have to process the full file, only the first line (or two).
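
A rough Python sketch of that idea, for illustration only. The function name `add_nbsp_doctype` and the two regular expressions are assumptions of this example, and it presumes the input has no DOCTYPE of its own and no comment ahead of the root element that contains something looking like a start tag:

```python
import re
import xml.etree.ElementTree as ET

# An XML declaration is only legal at the very start of the document
XML_DECL = re.compile(r"^<\?xml[^>]*\?>")
# First start tag; "<!" and "<?" are skipped because a name must start with a letter or "_"
ROOT_TAG = re.compile(r"<([A-Za-z_][\w.\-:]*)")


def add_nbsp_doctype(xml_text: str) -> str:
    """Insert a DOCTYPE declaring &nbsp; between the XML declaration and the root."""
    decl_match = XML_DECL.match(xml_text)
    decl = decl_match.group(0) if decl_match else ""
    rest = xml_text[len(decl):]
    root_match = ROOT_TAG.search(rest)
    if root_match is None:
        raise ValueError("no root element found")
    doctype = '<!DOCTYPE %s [<!ENTITY nbsp "&#160;">]>' % root_match.group(1)
    prefix = decl + "\n" if decl else ""
    return prefix + doctype + rest


# Hypothetical input as it might arrive from the generating process
src = '<?xml version="1.0"?>\n<html><body>x&nbsp;y</body></html>'
out = add_nbsp_doctype(src)
body_text = ET.fromstring(out).find("body").text
```

Only the head of the document is inspected to find the root element name; the body is passed through untouched.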

You don't necessarily have to automate that;
you can also change the process that generates your pseudo-XML.

cheers

Geert
 
LVL 1

Author Comment

by:paddycobbett
ID: 20367064
Thanks. Having investigated the code base I'm working on, it turned out to be more straightforward than I anticipated to insert code to strip out the &nbsp;

Thanks for both suggestions, which I'm sure are valid. I've allocated more points to R7AF since that suggestion suited me best in this case.

Question has a verified solution.
