How to continue parsing an XML feed after an invalid character is encountered (using Apache Xerces)

Hi,

I have an XML file served from an HTTP URL that I need to parse. When I parse it with the
Apache Xerces SAX parser, I get a fatal error with the message "Invalid byte 1 of 1-byte UTF-8 sequence", and parsing stops. I need to
continue parsing the file even when an invalid character is encountered. I don't mind
if that particular data is skipped. Can anybody advise (and give an example)?

Thanks in advance for the help.
org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)


Asked by: rnicholus
 
CEHJ commented:
Try setting the following feature on the parser. Also make sure you're not reading something as UTF-8 that isn't actually UTF-8 encoded.
parser.setFeature("http://apache.org/xml/features/continue-after-fatal-error", true);
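For context, here is a minimal sketch of where that call might sit in a SAX parse. It assumes the parser obtained through JAXP is Xerces or a Xerces-derived implementation that recognizes the feature URI; other parsers may throw SAXNotRecognizedException. The class and handler names (LenientFeedParser, CollectingHandler) are made up for illustration and are not from the original post. Note that DefaultHandler rethrows fatal errors by default, so the handler has to override fatalError for parsing to actually carry on.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class LenientFeedParser {
    // Xerces-specific feature from the answer above.
    static final String CONTINUE_FEATURE =
            "http://apache.org/xml/features/continue-after-fatal-error";

    // Hypothetical handler: records element names and any fatal errors reported.
    static class CollectingHandler extends DefaultHandler {
        final List<String> elements = new ArrayList<>();
        final List<String> fatalErrors = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements.add(qName);
        }

        @Override
        public void fatalError(SAXParseException e) {
            // DefaultHandler would rethrow here and stop the parse;
            // record the problem instead so the parser can try to continue.
            fatalErrors.add("line " + e.getLineNumber() + ": " + e.getMessage());
        }
    }

    static CollectingHandler parse(InputStream in) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Assumption: the underlying parser recognizes this Xerces feature.
        reader.setFeature(CONTINUE_FEATURE, true);

        CollectingHandler handler = new CollectingHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(in));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // Well-formed sample input; a real feed would come from the HTTP URL.
        byte[] xml = "<feed><item>ok</item></feed>".getBytes(StandardCharsets.UTF_8);
        CollectingHandler result = parse(new ByteArrayInputStream(xml));
        System.out.println("elements: " + result.elements);
        System.out.println("fatal errors: " + result.fatalErrors.size());
    }
}
```

The Xerces documentation warns that parser behavior is undetermined once this feature is on and a fatal error has occurred (it may even get stuck in a loop), so it is worth treating whatever is parsed after the first recorded fatal error as suspect and fixing the encoding mismatch at the source if possible.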

 
rnicholus (author) commented:
It works. Thanks!
 
CEHJ commented:
:-)