Solved

How do I handle &nbsp; in a stylesheet?

Posted on 2007-11-28
Last Modified: 2013-11-18
I have a stylesheet which accepts XML (which happens to be HTML) from some process. The problem is that the process is generating &nbsp;, which causes the XSL parser to crash.

From looking on the web I see that &nbsp; is not allowed; one suggestion was to put the following ENTITY declaration in the header:

<!ENTITY nbsp CDATA "&#160;" >

<?xml version="1.0" encoding="Windows-1252" ?>
xsl code...

.. but I now get an XSLT compile error. As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle. I suspect that the post I found on the web missed something from the above declaration. Can anyone correct the above or suggest another way for my XSLT to handle &nbsp;?

Thanks in advance :)
Question by:paddycobbett
7 Comments
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 20366529
yes, you are not doing it 100% right
it should be like this

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp "&#160;">
]>
<xsl:stylesheet ...

 
LVL 13

Accepted Solution

by:
R7AF earned 350 total points
ID: 20366539
&nbsp; is not a predefined entity in XML, so it is only legal if the document declares it. That does not seem to be the case here, so the XML is not well-formed. You could read the XML into a string and replace each &nbsp; with &#160;

See http://www.experts-exchange.com/Q_22526834.html
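The string-replacement idea could look roughly like the following. This is a minimal sketch in Python using the standard-library xml.etree.ElementTree; the thread does not say which language the generating process uses, and the function name replace_nbsp and the sample input are made up for illustration.

```python
import xml.etree.ElementTree as ET


def replace_nbsp(xml_text: str) -> str:
    """Swap the undeclared &nbsp; entity for the numeric reference &#160;."""
    return xml_text.replace("&nbsp;", "&#160;")


# Hypothetical input: HTML-like XML with no DOCTYPE that declares nbsp.
raw = "<root><p>one&nbsp;two</p></root>"

fixed = replace_nbsp(raw)

# The numeric character reference needs no declaration, so this parses.
root = ET.fromstring(fixed)
paragraph_text = root.find("p").text
```

Note that a plain text replace also rewrites any &nbsp; that appears inside a CDATA section or a comment, where it would have been literal text, so this is only safe if the input does not rely on that.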
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 20366574
> As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle

Also make sure that the original XML has this declaration,
because the XSLT only needs this doctype declaration if you use &nbsp; in the stylesheet itself.

The parser expands &nbsp; to the character it is declared as (&#160;) before the XSLT processor gets the document.
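To see the parser doing that expansion, here is a small sketch using Python's standard-library xml.etree.ElementTree (an assumption; the thread does not name the parser in use). The element name root and the sample text are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; "root" stands in for the real root element name.
declared = (
    "<!DOCTYPE root [\n"
    '<!ENTITY nbsp "&#160;">\n'
    "]>\n"
    "<root>one&nbsp;two</root>"
)
undeclared = "<root>one&nbsp;two</root>"

# With the internal subset present, the parser replaces &nbsp; itself,
# so downstream code only ever sees the character U+00A0.
text = ET.fromstring(declared).text

# Without the declaration, the same reference is a parse error.
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
```

The entity is declared here in XML syntax, without the SGML-style CDATA keyword that HTML DTDs use.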
 
LVL 1

Author Comment

by:paddycobbett
ID: 20366583
The suggestion by R7AF was a last resort; I had considered filtering that value out in the process. If I can handle it in the stylesheet then that would be ideal. Gertone, you gave me a corrected version, but it still results in the same error :S
 
LVL 1

Author Comment

by:paddycobbett
ID: 20366594
So the xml coming in should also have:

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp "&#160;">
]>

?
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 150 total points
ID: 20366947
Yes, correct, except that the doctype name should be different (it should be equal to the name of the root element):
<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE root [
<!ENTITY nbsp "&#160;">
]>

If it does not have that, it is not well-formed XML,
which would mean that it is likely still HTML.

The best approach to get HTML into well-formed XML, so you can handle it with XSLT, is to preprocess it using TagSoup.
Google for "download tagsoup" to get a copy
(it also handles encodings etc. correctly).
After TagSoup you have well-formed XML and you can process it with XSLT.

I hesitate to recommend R7AF's approach, since that would mean processing the full XML.
I would recommend stripping the XML declaration off, finding the root element and adding the doctype declaration.
That way you don't have to process the full file, only the first line (or two).
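A rough sketch of that doctype approach in Python follows. The function name add_nbsp_doctype, the regular expressions, and the sample input are all made up for illustration, and the sketch assumes the input has no DOCTYPE of its own and has already been decoded to text (the stripped declaration carried the encoding).

```python
import re
import xml.etree.ElementTree as ET

# Matches a leading XML declaration such as <?xml version="1.0" ... ?>
XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>\s*")
# Matches the first start tag; comments (<!--) and PIs (<?) do not match.
ROOT_NAME = re.compile(r"<([A-Za-z_][\w.:-]*)")


def add_nbsp_doctype(xml_text: str) -> str:
    """Prepend a DOCTYPE, named after the root element, that declares nbsp."""
    # Strip the XML declaration, if there is one.
    body = XML_DECL.sub("", xml_text, count=1)
    match = ROOT_NAME.search(body)
    if match is None:
        raise ValueError("no root element found")
    root_name = match.group(1)
    doctype = f'<!DOCTYPE {root_name} [\n<!ENTITY nbsp "&#160;">\n]>\n'
    return doctype + body


# Hypothetical input from the generating process.
raw = (
    '<?xml version="1.0" encoding="Windows-1252" ?>\n'
    "<html><body>one&nbsp;two</body></html>"
)
patched = add_nbsp_doctype(raw)
body_text = ET.fromstring(patched).find("body").text
```

Only the start of the document is inspected; the body is passed through untouched, which is the point of this approach compared with a full search and replace.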

You don't necessarily have to automate that;
you can also change the process that generates your pseudo-XML.

cheers

Geert
 
LVL 1

Author Comment

by:paddycobbett
ID: 20367064
Thanks. Having investigated the code base I'm working on, it turned out to be more straightforward than I anticipated to insert code to strip off the &nbsp;

Thanks for both suggestions, which I'm sure are valid. I've allocated more points to R7AF since that suggestion suited me best in this case.

Question has a verified solution.
