Solved

PDF to text/HTML

Posted on 2004-09-29
Last Modified: 2009-05-07
How can I convert PDF to text, XML, or HTML with Java or some other language such as Python?

Is there a good solution that actually works? I know there are many commercial libraries. Which one should I choose?
It needs to convert PDF documents without errors.
Question by:mbutu

Accepted Solution

by:
WesleySaysHi earned 125 total points
ID: 12178902
As for libraries, this may be the best one:

"JPedal is a 100% Java library designed to ease the integration of pdf files into any workflow. Concentrating on the easy display, manipulation and extraction of content, JPedal is an essential tool for pdf developers." See:
http://www.jpedal.org/

Other tools:

A list of Java libraries for reading and writing PDF files is available here:
http://www.geocities.com/marcoschmidt.geo/java-libraries-pdf.html

There is a program with a free 14-day trial that converts PDF to text without errors. "Midas Extractor makes it easy to convert from PDF to plain text. The text within the PDF file is extracted and copied into a text file of the same name as the PDF, but with a txt extension." You can find it at: http://www.surefiresoftware.com/midas/main.php?
There is also a program that converts PDF to HTML: "The PDF2HTML (PDF to HTML) software product converts PDF files to HTML files while seeking to preserve the original page layout (as best as technically possible). PDF2HTML enables the conversion of layout originally designed for paper to be used on the Internet." You can find it here, along with other conversion software:
http://www.verypdf.com/

Regards,
Wesley

Assisted Solution

by:apurvkansal
apurvkansal earned 125 total points
ID: 12178959
Hi mbutu,

Try visiting the link below; I hope you find your solution there.

http://www.convertzone.com/

Cheers,
AK

Assisted Solution

by:sonashish
sonashish earned 125 total points
ID: 12179128
You can also try ABBYY's FineReader software. I have used it very frequently.

Http://www.abbysoftware.com


Or you can use PDF2HTML Driver; it works just like a printer driver. As far as I remember, it is free.

Another option is click2convert.

Ashish


Assisted Solution

by:itcnbwise
itcnbwise earned 125 total points
ID: 12181264
I use the free Xpdf tools, which work great:

http://www.foolabs.com/xpdf/download.html

Both Linux and Win32/DOS versions are available. You can upload a PDF and watch it being converted on the fly on my website:
http://forumbeta.itcn.com/forum.aspx
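
Since the question mentions Python, here is a minimal sketch of driving Xpdf's pdftotext command-line tool from a script. It assumes the pdftotext binary is installed and on your PATH; the function names and the file name report.pdf are made up for illustration and are not part of Xpdf. Passing "-" as the output file asks pdftotext to write to standard output, "-enc UTF-8" selects the output encoding, and "-layout" tries to keep the original physical layout.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, keep_layout=False):
    """Build the argument list for pdftotext (hypothetical helper)."""
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout.
        cmd.append("-layout")
    # "-" as the text file means: write the extracted text to stdout.
    cmd += [str(pdf_path), "-"]
    return cmd


def pdf_to_text(pdf_path, keep_layout=False):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, keep_layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if the conversion fails
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # report.pdf is a placeholder; substitute your own file.
    print(pdf_to_text("report.pdf", keep_layout=True))
```

No converter can promise error-free output for every PDF (scanned pages contain no text at all and need OCR), so it is worth checking the results on a sample of your own documents.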