• Status: Solved

PDF to text/HTML

How can I convert PDF to text, XML or HTML with Java or some other language like Python?

Is there a good solution that works? I know there are many commercial libraries. Which one should I choose?
It needs to convert PDF documents without errors.
Asked by: mbutu
4 Solutions
 
WesleySaysHi commented:
As for libraries, this may be the best one:

"JPedal is a 100% Java library designed to ease the integration of pdf files into any workflow. Concentrating on the easy display, manipulation and extraction of content, JPedal is an essential tool for pdf developers." See:
http://www.jpedal.org/

Other tools:

You can find a list of Java libraries for reading and writing PDF files here:
http://www.geocities.com/marcoschmidt.geo/java-libraries-pdf.html

There is software you can use free for 14 days that converts PDF to text without errors. "Midas Extractor makes it easy to convert from PDF to plain text. The text within the PDF file is extracted and copied into a text file of the same name as the PDF, but with a txt extension." You can find it at: http://www.surefiresoftware.com/midas/main.php?

There is also software to convert PDF to HTML: "The PDF2HTML (PDF to HTML) software product converts PDF files to HTML files while seeking to preserve the original page layout (as best as technically possible). PDF2HTML enables the conversion of layout originally designed for paper to be used on the Internet." You can find it here, along with other conversion software:
http://www.verypdf.com/

Regards,
Wesley
 
apurvkansal commented:
Hi mbutu,

Try visiting the link below. I hope you find your solution there.

http://www.convertzone.com/

Cheers,
AK
 
sonashish commented:
You can also try ABBYY's FineReader software. I use it very frequently.

Http://www.abbysoftware.com


Or you can use the PDF2HTML Driver, which works just like a printer driver. As I remember, it is free.

Another option is click2convert.

Ashish

 
itcnbwise commented:
I use the free XPDF. It works great:

http://www.foolabs.com/xpdf/download.html

Both Linux and Win32/DOS versions are available. You can upload a PDF and see it converted on the fly on my website here:
http://forumbeta.itcn.com/forum.aspx
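
Since the question asks about Java or Python, here is a rough Python sketch of how the XPDF route could be scripted. It assumes the `pdftotext` command-line tool that ships with XPDF is installed and on your PATH. The function names and the choice to write a `.txt` file next to the PDF are illustrative assumptions, not something taken from the posts above.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path, txt_path=None, keep_layout=True):
    """Return the argument list for one pdftotext call.

    Hypothetical helper: if no output path is given, the text file gets
    the same name as the PDF with a .txt extension.
    """
    pdf_path = Path(pdf_path)
    if txt_path is None:
        txt_path = pdf_path.with_suffix(".txt")
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout
        command.append("-layout")
    command += [str(pdf_path), str(txt_path)]
    return command


def convert_pdf_to_text(pdf_path, txt_path=None, keep_layout=True):
    """Run pdftotext and return the path of the text file it wrote."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext was not found on PATH")
    command = build_pdftotext_command(pdf_path, txt_path, keep_layout)
    # check=True raises CalledProcessError if pdftotext exits with an error
    subprocess.run(command, check=True)
    return Path(command[-1])
```

Calling `convert_pdf_to_text("report.pdf")` would then write `report.txt` in the same folder, provided the tool is installed. Scanned PDFs that contain only images will not yield text this way; those need OCR software instead.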
Question has a verified solution.