  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 365

PDF to text/HTML

How can I convert PDF to text, XML, or HTML with Java or some other language, such as Python?

Is there a good solution that actually works? I know there are many commercial libraries. Which one should I choose?
It needs to convert PDF documents without errors.
Asked by: mbutu
4 Solutions
 
WesleySaysHi commented:
As for libraries, this may be the best:

"JPedal is a 100% Java library designed to ease the integration of pdf files into any workflow. concentrating on the easy display, manipulation and extraction of content, JPedal is an essential tool for pdf developers." See:
http://www.jpedal.org/

Other tools:

You can find a list of Java libraries for reading and writing PDF files here:
http://www.geocities.com/marcoschmidt.geo/java-libraries-pdf.html

There is also software with a 14-day free trial that converts PDF to text without errors. "Midas Extractor makes it easy to convert from PDF to plain text.  The text within the PDF file is extracted and copied into a text file of the same name as the PDF, but with a txt extension." You can find it at: http://www.surefiresoftware.com/midas/main.php?
There is also software for converting PDF to HTML: "The PDF2HTML (PDF to HTML) software product converts PDF files to HTML files while seeking to preserve the original page layout (as best as technically possible). PDF2HTML enables the conversion of layout originally designed for paper to be used on the Internet." You can find it here, along with other conversion software:
http://www.verypdf.com/

Regards,
Wesley
 
apurvkansal commented:
Hi mbutu,

Try visiting the link below. I hope you find your solution there.

http://www.convertzone.com/

Cheers,
AK
 
sonashish commented:
You can also try ABBYY's FineReader software. I use it very frequently.

Http://www.abbysoftware.com


Or you can use PDF2HTML Driver; it works just like a printer driver. As far as I remember, it is free.

Another option is click2convert.

Ashish

 
itcnbwise commented:
I use the free XPDF. It works great:

http://www.foolabs.com/xpdf/download.html

Both Linux and Win32/DOS versions are available.  You can upload a PDF on my website and watch it convert documents on the fly:
http://forumbeta.itcn.com/forum.aspx
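
Since the question mentions Python: the Xpdf package ships a command-line tool called pdftotext, and one way to script it is to call it as a subprocess. The following is only a minimal sketch, assuming pdftotext is installed and on your PATH. The function names build_pdftotext_command and pdf_to_text are made up for this illustration and are not part of Xpdf.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_pdftotext_command(pdf_path: str,
                            txt_path: Optional[str] = None,
                            keep_layout: bool = False) -> List[str]:
    """Build the argument list for Xpdf's pdftotext tool (hypothetical helper)."""
    if txt_path is None:
        # Default: same name as the PDF, but with a .txt extension.
        txt_path = str(Path(pdf_path).with_suffix(".txt"))
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the original physical layout as far as it can.
        command.append("-layout")
    command += [pdf_path, txt_path]
    return command


def pdf_to_text(pdf_path: str,
                txt_path: Optional[str] = None,
                keep_layout: bool = False) -> str:
    """Run pdftotext on one file and return the path of the text file it wrote."""
    if shutil.which("pdftotext") is None:
        # Assumption: the tool must already be installed; this sketch does not bundle it.
        raise RuntimeError("pdftotext was not found on PATH; install Xpdf first")
    command = build_pdftotext_command(pdf_path, txt_path, keep_layout)
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(command, check=True)
    return command[-1]
```

Calling pdf_to_text("report.pdf") should then write report.txt next to the PDF. Whether the result is error-free depends on the PDF itself; scanned documents that contain only images have no text layer to extract and would need OCR software instead.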
