  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 247

extracting text from pdf files

Hi Experts,
I am looking for a way (a library?) to extract the contents of a PDF file into my Java Swing application. Getting the images is not that critical, but I need the text content to be as accurate as possible.
The library should be open source and free.
Thanks
2 Solutions
 
leakim971 (Pluritechnician) Commented:
Hello guyneo,

From Apache: http://pdfbox.apache.org/
Another option:
http://www.qoppa.com/pdftext/jptindex.html

Regards.
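
For reference, here is a minimal sketch of what text extraction with PDFBox could look like. It assumes the PDFBox 2.x API (PDDocument.load and org.apache.pdfbox.text.PDFTextStripper) with the PDFBox jar on the classpath; in PDFBox 3.x the loading call is different. The class name PdfTextExtractor and its method names are made up for illustration.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class; assumes PDFBox 2.x is on the classpath.
public class PdfTextExtractor {

    /** Loads the PDF file and returns the text of all its pages. */
    public static String extractText(File pdfFile) throws IOException {
        // PDDocument is Closeable, so try-with-resources releases it for us.
        try (PDDocument document = PDDocument.load(pdfFile)) {
            return extractText(document);
        }
    }

    /** Returns the text of an already loaded document; the caller closes it. */
    public static String extractText(PDDocument document) throws IOException {
        PDFTextStripper stripper = new PDFTextStripper();
        // Order text by its position on the page rather than by its order
        // in the content stream; this often reads better for layouts with columns.
        stripper.setSortByPosition(true);
        return stripper.getText(document);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfTextExtractor <file.pdf>");
            return;
        }
        System.out.println(extractText(new File(args[0])));
    }
}
```

In a Swing application the returned string can be handed to something like JTextArea.setText. For large files it is probably worth running the extraction off the event dispatch thread, for example in a SwingWorker, so the UI stays responsive. Note that this only recovers searchable text, not text that is part of an image.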
 
mohan_sekar Commented:
What does the PDF contain: searchable text, or text embedded in images? If it's the former, you can use iText, which is free. If it's the latter, you need to run OCR on it. I'm not aware of any free OCR tool.
 
guyneo (Author) Commented:
Text and some images. We are not dealing with text inside the images.
mohan_sekar Commented:
iText can help
 
leakim971 (Pluritechnician) Commented:
Thanks for the points!
 
guyneo (Author) Commented:
You are welcome.
Thank you both.