I am trying to use ROME (Rss and atOM utilitiEs - https://rome.dev.java.net/) to build a Java program that reads an RSS feed encoded in ISO-8859-1.
I use com.sun.syndication.io.XmlReader to read the remote file, but all the accents ("´", "`", "^", "~", etc.) are lost, probably because the encoding is not being recognized properly.
Here is my example code:
String feed = "http://somedomain/some_rss_feed.xml";
URL feedUrl = new URL(feed);
XmlReader reader = new XmlReader(feedUrl);
SyndFeedInput input = new SyndFeedInput();
SyndFeed result = input.build(reader);
The structure of the RSS feed (which is NOT under my control, so I cannot fix anything that is wrong with it) looks like this:
<?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<title>Some title already wíth ãny àccênts intö it</title>
As soon as the variable "reader" receives the result of "new XmlReader(feedUrl)", it already has a property named "_encoding" set to US-ASCII instead of ISO-8859-1.
And when I inspect the contents of the variable "result", all of its attributes are already filled with the values read from the XML feed, but with all my accents already corrupted.
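The symptom can be reproduced without ROME, which may help narrow things down. The sketch below uses only the JDK: it encodes a sample title as ISO-8859-1 bytes and decodes them once as ISO-8859-1 and once as US-ASCII. The class name EncodingCheck and its helper methods are hypothetical names chosen for illustration, and the sample title stands in for the real feed content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Wraps a byte stream in a Reader that uses an explicitly chosen charset
    // instead of relying on any automatic detection.
    public static Reader openWithCharset(InputStream in, Charset charset) {
        return new InputStreamReader(in, charset);
    }

    // Drains a Reader into a String.
    public static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Decodes raw bytes with the given charset by going through a Reader,
    // the same kind of object that is handed to the feed parser.
    public static String decode(byte[] raw, Charset charset) throws IOException {
        try (Reader r = openWithCharset(new ByteArrayInputStream(raw), charset)) {
            return readAll(r);
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample title with accents, written with Unicode escapes so the
        // source file's own encoding does not matter.
        String title = "<title>w\u00edth \u00e3ny \u00e0cc\u00eants int\u00f6 it</title>";
        byte[] raw = title.getBytes(StandardCharsets.ISO_8859_1);

        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);
        String asAscii = decode(raw, StandardCharsets.US_ASCII);

        System.out.println("ISO-8859-1 keeps the accents: " + asLatin1.equals(title));
        System.out.println("US-ASCII keeps the accents:   " + asAscii.equals(title));
    }
}
```

Decoding as US-ASCII replaces every byte above 0x7F with the replacement character U+FFFD, which matches the corruption described above. Since input.build(...) in the example code is given a Reader, a Reader opened with an explicit charset (as in openWithCharset, fed from feedUrl.openStream()) might serve as a workaround in place of XmlReader. That is an untested assumption, not a confirmed fix.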