Solved

JAXB XML Parsing Question

Posted on 2011-03-01
Last Modified: 2013-11-19
Is there any way to configure JAXB so that, while parsing an XML document, it ignores any text following the closing tag? It seems to ignore any number of spaces following the closing tag, but even one non-space character causes it to throw a SAXParseException. We have a few XML documents with characters following the closing tag, but otherwise they are fine. Until we can clean them up, it would be great to be able to throw a switch somewhere to say: when you reach the closing tag, forget anything that might be beyond it. This exception is being thrown during unmarshalling.
Question by:whandley
7 Comments
 
LVL 27

Accepted Solution

by:
mrcoffee365 earned 2000 total points
ID: 35017463
We have not found a way to do this. What we do is scrub the XML before sending it to the parser. In some cases we send it to the parser, catch the exception, scrub it, and then send it to the parser again.
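
A minimal sketch of the scrub-and-retry approach described above, under stated assumptions: the class name TrailingJunkScrubber and its methods are hypothetical, the stray text after the closing tag is assumed to contain no '>' character, and the JDK's built-in DOM parser stands in for JAXB so the example needs no extra dependencies. With JAXB, the scrubbed string would presumably be passed to Unmarshaller.unmarshal through a StringReader instead.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of JAXB or the JDK.
class TrailingJunkScrubber {

    // Keep everything up to and including the last '>' and drop the rest.
    // Assumption: the junk after the closing tag contains no '>' itself.
    static String scrub(String xml) {
        int end = xml.lastIndexOf('>');
        return end >= 0 ? xml.substring(0, end + 1) : xml;
    }

    // Strict parse. A SAXException (usually a SAXParseException) signals
    // that the input is not well-formed.
    static Document parse(String xml) throws SAXException {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }

    // Try the input as-is first; scrub and retry only if parsing fails.
    static Document parseLeniently(String xml) {
        try {
            return parse(xml);
        } catch (SAXException first) {
            try {
                return parse(scrub(xml));
            } catch (SAXException second) {
                throw new IllegalArgumentException(
                        "Not well-formed even after scrubbing", second);
            }
        }
    }
}
```

Calling TrailingJunkScrubber.parseLeniently("<order><id>7</id></order>xyz") fails on the first attempt, scrubs the trailing "xyz", and succeeds on the second. The parser's default error handler may still print the first failure to stderr. Trying the strict parse first means clean documents are never modified.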
 
LVL 10

Expert Comment

by:Hegemon
ID: 35018614
Please correct me if I am wrong, but it looks like illegitimate (non-whitespace) characters after the closing tag make the document not well-formed, so, strictly speaking, it is no longer an XML document and cannot be processed by an XML parser.

Either the document needs to be made well-formed XML by scrubbing it, or a non-XML parser must be used.

Problems of this sort can be expected when working with SGML documents that may look like XML but are not well formed.

 
LVL 27

Expert Comment

by:mrcoffee365
ID: 35019773
Yes -- I already gave that answer.

 
LVL 10

Expert Comment

by:Hegemon
ID: 35020010
My point was not about scrubbing it per se but rather about the document not being an XML document, hence XML parsing not applicable.
 
LVL 27

Expert Comment

by:mrcoffee365
ID: 35021033
XML docs come in many forms.   It's still an XML doc even if it has some characters in the file after the closing tag.  It is not a well-formed XML doc, which is what the asker was asking about.

As you get more experience with XML docs, you'll find that many are not well-formed, and the developers have to have strategies to deal with that.
 
LVL 10

Expert Comment

by:Hegemon
ID: 35025543
"Definition: A data object is an XML document if it is well-formed, as defined in this specification.", from here http://www.w3.org/TR/REC-xml/#sec-well-formed.

Hence: not well-formed, not XML.
 
LVL 5

Expert Comment

by:Plk_In_EE
ID: 35056185
Hi there,
Even if there is whitespace before the <?xml declaration in the document, the SAX parser will fail.
It is better to send well-formed XML to the parser. Open the XML in a browser to see whether it is valid or not.
Good luck.
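
For the leading-whitespace case mentioned above, a small sketch of one way to clean the input before parsing. The class and method names are hypothetical, and it assumes the only problem at the start of the document is whitespace in front of the declaration.

```java
// Hypothetical helper, not part of JAXB or the JDK.
class LeadingWhitespaceTrimmer {

    // Drop any whitespace characters that precede the first real character,
    // so that an XML declaration (if present) starts at position 0.
    static String trimBeforeDeclaration(String xml) {
        int start = 0;
        while (start < xml.length() && Character.isWhitespace(xml.charAt(start))) {
            start++;
        }
        return xml.substring(start);
    }
}
```

The rest of the document is left untouched, so whitespace inside elements is preserved.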