Solved

How do I handle &nbsp; in a stylesheet?

Posted on 2007-11-28
Last Modified: 2013-11-18
I have a stylesheet which accepts XML (which happens to be HTML) from some process. The problem is that the process is generating &nbsp;, which causes the XSL parser to crash.

From looking on the web I see that &nbsp; is not allowed. One suggestion was to put the following ENTITY declaration in the header:

<!ENTITY nbsp CDATA "&#160;" >

<?xml version="1.0" encoding="Windows-1252" ?>
xsl code...

.. but I now get an XSLT compile error. As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle. I suspect that the post I found on the web left something out of the declaration. Can anyone correct it or suggest another way for my XSLT to handle &nbsp;?

Thanks in advance :)
Question by:paddycobbett
7 Comments
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 20366529
Yes, you are not doing it 100% right.
It should look like this:

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp "&#160;" >
]>
<xsl:stylesheet ...
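
As a rough check of the declaration syntax, the sketch below feeds three variants of a small document to Python's standard `xml.etree.ElementTree` parser: one with no declaration, one with the XML 1.0 form (name followed directly by the quoted value), and one with the `CDATA` keyword as quoted in the question. That keyword form is the one used in the SGML-based HTML 4 entity sets. The `root` document and the `parses` helper are made up for illustration; other parsers may report the failures differently.

```python
import xml.etree.ElementTree as ET

def parses(doc: str) -> bool:
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

BODY = "<root>a&nbsp;b</root>"  # hypothetical document using &nbsp;

# XML 1.0 form: entity name followed directly by the quoted value.
xml_form = '<!DOCTYPE root [ <!ENTITY nbsp "&#160;"> ]>' + BODY
# SGML / HTML 4 form with the CDATA keyword, as quoted in the question.
sgml_form = '<!DOCTYPE root [ <!ENTITY nbsp CDATA "&#160;"> ]>' + BODY

print(parses(BODY))       # False: the entity is never declared
print(parses(xml_form))   # True
print(parses(sgml_form))  # False: CDATA is not allowed at that position in XML
```

With the XML 1.0 form, the parsed text of `root` contains the single character U+00A0 in place of `&nbsp;`.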

 
LVL 13

Accepted Solution

by:
R7AF earned 350 total points
ID: 20366539
&nbsp; is not a predefined entity in XML, so it is only valid if the document itself declares it. That does not seem to be the case here, so the XML is not well-formed. You could read the XML into a string and replace &nbsp; with &#160;

See http://www.experts-exchange.com/Q_22526834.html
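
A minimal sketch of that string replacement, assuming the input is already available as a Python string. The sample markup and the `replace_nbsp` helper are hypothetical, and the standard-library parser is used only to confirm that the result parses.

```python
import xml.etree.ElementTree as ET

def replace_nbsp(raw: str) -> str:
    """Swap the undeclared named entity for its numeric character reference."""
    return raw.replace("&nbsp;", "&#160;")

# Hypothetical input that would otherwise fail with an "undefined entity" error.
raw = "<html><body><p>10&nbsp;kg</p></body></html>"

root = ET.fromstring(replace_nbsp(raw))
print(repr(root.find("./body/p").text))  # '10\xa0kg'
```

Note that a blind replace also rewrites any `&nbsp;` that appears inside comments or CDATA sections, which may or may not matter for a given input.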
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 20366574
> As I understand it, the above declaration should replace all occurrences of &nbsp; with &#160;, which XSL DOES handle

Also make sure that the original XML has this declaration,
because the XSLT only needs this DOCTYPE declaration if you use &nbsp; in the stylesheet itself.

The parser resolves &nbsp; to the declared character internally, before the XSLT processor gets it.
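
To illustrate that the declaration only applies to the document that contains it, here is a small sketch using Python's standard parser. Both documents are invented for the example, and no transformation is run; each document is simply parsed on its own, as an XSLT processor's parser would do.

```python
import xml.etree.ElementTree as ET

NBSP_DECL = '<!ENTITY nbsp "&#160;">'

# Hypothetical stylesheet that declares and uses &nbsp; itself.
stylesheet = (
    f"<!DOCTYPE xsl:stylesheet [ {NBSP_DECL} ]>"
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:template match="/"><out>&nbsp;</out></xsl:template>'
    "</xsl:stylesheet>"
)
# Hypothetical input document that uses &nbsp; without declaring it.
source = "<root>a&nbsp;b</root>"

ET.fromstring(stylesheet)  # parses: the entity is declared in this document

try:
    ET.fromstring(source)
    source_ok = True
except ET.ParseError as err:
    source_ok = False  # the stylesheet's declaration does not carry over
    print("input rejected:", err)

# Declaring the entity in the input document itself makes it parse.
fixed = ET.fromstring(f"<!DOCTYPE root [ {NBSP_DECL} ]>" + source)
print(repr(fixed.text))  # 'a\xa0b'
```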
LVL 1

Author Comment

by:paddycobbett
ID: 20366583
The suggestion by R7AF was a last resort; I had considered filtering that value out in the process. If I can handle it in the stylesheet, that would be ideal. Gertone, you gave me a corrected version, but it still results in the same error :S
 
LVL 1

Author Comment

by:paddycobbett
ID: 20366594
So the xml coming in should also have:

<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp CDATA "&#160;" >
]>

?
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 150 total points
ID: 20366947
Yes, correct, except that the DOCTYPE name should be different (it must match the name of the root element):
<?xml version="1.0" encoding="Windows-1252" ?>
<!DOCTYPE root [
<!ENTITY nbsp "&#160;" >
]>

If it does not have that, it is not well-formed XML,
which would mean that it is probably still HTML.

The best way to turn HTML into well-formed XML, so that you can handle it with XSLT, is to preprocess it with TagSoup.
Google for "download tagsoup" to get a copy
(it also handles encodings etc. correctly).
After TagSoup you have well-formed XML and you can process it with XSLT.

I hesitate to recommend R7AF's approach, since that would mean processing the full XML.
I would recommend stripping the XML declaration off, finding the root element and adding the DOCTYPE declaration.
That way you don't have to process the full file, only the first line (or two).
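
The DOCTYPE approach described above could be sketched roughly as follows. The `add_nbsp_doctype` helper and both regular expressions are illustrative assumptions, not a tested tool. The sketch assumes the input is an already-decoded string with no existing DOCTYPE, and that nothing resembling a start tag appears inside a comment before the root element.

```python
import re
import xml.etree.ElementTree as ET

# Optional XML declaration at the very start of the document.
XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>\s*")
# First "<" followed by a name: the root start tag. Comments ("<!--") and
# processing instructions ("<?") do not match because of the "!" / "?".
ROOT_NAME = re.compile(r"<([A-Za-z_][\w.:-]*)")

def add_nbsp_doctype(raw: str) -> str:
    """Prepend a DOCTYPE that declares nbsp, named after the root element."""
    body = XML_DECL.sub("", raw, count=1)
    match = ROOT_NAME.search(body)
    if match is None:
        raise ValueError("no root element found")
    doctype = f'<!DOCTYPE {match.group(1)} [ <!ENTITY nbsp "&#160;"> ]>\n'
    return doctype + body

# Hypothetical input as produced by the upstream process.
raw = (
    '<?xml version="1.0" encoding="Windows-1252" ?>\n'
    "<html><body>a&nbsp;b</body></html>"
)
root = ET.fromstring(add_nbsp_doctype(raw))
print(root.tag, repr(root.find("body").text))  # html 'a\xa0b'
```

The XML declaration is dropped rather than re-emitted here because the text is already a decoded string; when writing bytes back to a file, it would need to be restored in front of the DOCTYPE with the right encoding.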

You don't necessarily have to automate that;
you can also change the process that generates your pseudo-XML.

cheers

Geert
 
LVL 1

Author Comment

by:paddycobbett
ID: 20367064
Thanks. Having investigated the code base I'm working on, it turned out to be more straightforward than I anticipated to insert code to strip off the &nbsp;

Thanks for both suggestions, which I'm sure are valid. I've allocated more points to R7AF since that suggestion suited me best in this case.

Question has a verified solution.
