Solved

How to extract the text from a PDF using PHP?

Posted on 2009-07-10
Last Modified: 2013-12-13
I have converted the PDF to images, but now I need to extract the content of the PDF as text or HTML. How can I do it?

Question by:Rajmd
2 Comments
 
LVL 110

Accepted Solution

by:
Ray Paseur earned 250 total points
ID: 24823642
This may be either a big undertaking or an impossible dream, depending on what is in the PDF file. You are probably better off going back to the original data BEFORE it became a PDF. If you cannot get that information in clear text, here is the path to follow...

You can read the PDF file into PHP with file_get_contents().

You can use var_dump() to print out the data you read from the PDF.

You can visually scan the data string for extraction points and perhaps create a regular expression or a set of explode() statements to pull out the information you want.
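The three steps above can be sketched roughly as follows. This is only an illustration, not a general PDF parser: the function name extractLiteralStrings() and the path example.pdf are made up for the example, and it assumes the text sits uncompressed in the file as literal strings shown with the Tj operator, e.g. "(Hello) Tj". Many real PDFs will not match that assumption. It needs PHP 7.4 or later for the arrow function.

```php
<?php
// Hypothetical helper: collect literal strings of the form "(text) Tj"
// from raw, uncompressed PDF data.
function extractLiteralStrings(string $data): array
{
    // Capture what is between the parentheses, allowing backslash escapes.
    preg_match_all('/\(((?:[^()\\\\]|\\\\.)*)\)\s*Tj/s', $data, $matches);

    // Undo the escapes \( \) and \\ inside each captured string.
    return array_map(
        fn($s) => preg_replace('/\\\\([()\\\\])/', '$1', $s),
        $matches[1]
    );
}

// Usage sketch; 'example.pdf' is a placeholder path.
$path = 'example.pdf';
if (is_readable($path)) {
    $data = file_get_contents($path);

    // Look at the first few hundred bytes to find extraction points.
    var_dump(substr($data, 0, 500));

    print_r(extractLiteralStrings($data));
}
```

If var_dump() shows mostly binary noise between "stream" and "endstream", the text is probably compressed or encoded and this pattern will find nothing.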

Do not become too dependent on this technique: different PDF files will use different encodings, and you may not be able to control what you find in there.
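One common case of this: content streams are often zlib-compressed (the FlateDecode filter), so the text is not visible in the raw bytes at all. A minimal sketch of inflating such streams is below. The function name inflateStreams() is made up for the example, it assumes PHP's zlib extension is loaded, and it assumes each stream is delimited by "stream" and "endstream" on their own lines. It does not handle other filters, encryption, or custom font encodings.

```php
<?php
// Hypothetical helper: return the contents of every stream in the PDF data,
// inflated when zlib can decode it, otherwise as the raw bytes.
function inflateStreams(string $data): array
{
    // Grab everything between "stream" and "endstream".
    preg_match_all('/stream\r?\n(.*?)\r?\nendstream/s', $data, $matches);

    $out = [];
    foreach ($matches[1] as $raw) {
        // gzuncompress() returns false (with a warning, suppressed here by @)
        // when the bytes are not zlib data.
        $inflated = @gzuncompress($raw);
        $out[] = ($inflated === false) ? $raw : $inflated;
    }
    return $out;
}
```

The inflated output can then be scanned with a regular expression or explode(), subject to the same caveats about encodings.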

Best of luck with your project, ~Ray
 
LVL 3

Expert Comment

by:Pedro Chagas
ID: 24825600
What is the goal?
Where do you get the PDFs? Do you create your own PDFs? If so, I think you can use file_get_contents() (as @Ray suggests), because the encoding is always the same!

Regards, JC