I could not parse my XML into XSL for PDF generation because of the Unicode 0xB character. Any solution?

Posted on 2009-04-10
Last Modified: 2012-08-13
I could not render my XML because it contains Unicode special characters; the failure occurs while transforming it into fo:xsl for PDF generation. Any solution?
Question by:manishe

Accepted Solution

by:
abel earned 2000 total points
ID: 24125890
In XML there are several characters that are "forbidden". That means, if certain characters (better say: codepoints) appear in your XML, then the document is not well-formed, which, in XML terms, means: it is not XML at all and cannot be processed.

In such cases, the best way to deal with the problem is to go to whoever (system, person, company) gives you the XML and ask them to deliver real XML and not some crippled substitute. In some cases, however, you may have no choice but to work with the incorrect XML.

In your case, you are saying that the Unicode character 0xB is giving you trouble. 0xB is known in Unicode as "Line Tabulation" or "Vertical Tab (VT)". This is indeed an illegal character in XML 1.0: below 0x20, the only allowed characters are the whitespace characters 0x9 (tab), 0xA (line feed) and 0xD (carriage return), and these do not include the VT.
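To see where the offending characters sit in your data, here is a minimal Python sketch. It assumes you can read the document into a string first; the function name find_illegal_xml_chars is made up for illustration. The character class mirrors the "Char" production of XML 1.0.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML10 = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def find_illegal_xml_chars(text):
    """Return (offset, codepoint) pairs for characters XML 1.0 forbids.

    Hypothetical helper, not part of any library.
    """
    return [(m.start(), hex(ord(m.group()))) for m in _ILLEGAL_XML10.finditer(text)]
```

Called on the string "a\x0bb" this reports the vertical tab at offset 1; a string holding only tabs and line feeds yields an empty list.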

In XML 1.1 the VT is allowed, but only when properly escaped, that is, written as a character reference such as &#xB;. Switching to the (poorly supported) XML 1.1 is not going to help you.

That leaves us with non-standard treatment of the data. We can do that in many ways, but the correct approach depends on what tools you use. Consider using a filter (if your application allows that) to remove those erroneous characters before the data reaches the XML parser.
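As one possible shape for such a filter, here is a small Python sketch. It is an assumption on my part that you can put a preprocessing step in front of your transformation; the helper names strip_illegal_xml_chars and parse_lenient are hypothetical. It removes every character XML 1.0 forbids and then hands the cleaned text to the standard library parser.

```python
import re
import xml.etree.ElementTree as ET

# Characters outside the XML 1.0 "Char" production (this includes 0xB).
_ILLEGAL = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def strip_illegal_xml_chars(text, replacement=""):
    """Replace forbidden characters; pass replacement=" " to keep word breaks."""
    return _ILLEGAL.sub(replacement, text)

def parse_lenient(text):
    """Filter the raw text, then parse it as ordinary XML 1.0."""
    return ET.fromstring(strip_illegal_xml_chars(text))
```

Note that this only catches raw characters. If the supplier writes the VT as a character reference (&#xB;), an XML 1.0 parser will still reject it, and the filter would need an extra rule for that. Also keep in mind that you are silently changing the data, so log what you remove.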

Note: it can happen that suppliers of XML declare the wrong encoding in the header, or that readers/parsers of the data use the wrong encoding, e.g., when UTF-8 is assumed, but ISO-8859-1 is used. If that is the case, we should look deeper into your parsing of the XML and the actual error you get.
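A quick way to check for such a mismatch is to try decoding the raw bytes with the encoding the header claims. The Python sketch below uses made-up sample data (a header that says UTF-8 on bytes that are really ISO-8859-1), and decodes_as is a hypothetical helper name.

```python
# Sample bytes: the header claims UTF-8, but the content was written as ISO-8859-1.
data = "<?xml version='1.0' encoding='UTF-8'?><a>caf\u00e9</a>".encode("iso-8859-1")

def decodes_as(raw, encoding):
    """Return True if the raw bytes are a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False
```

Here decodes_as(data, "utf-8") fails while decodes_as(data, "iso-8859-1") succeeds, which points at a wrong declaration rather than at a forbidden character. A successful decode does not prove the encoding is right (ISO-8859-1 accepts any byte), so treat it as a hint only.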

-- Abel --