Solved

Scanned pages into PDF

Posted on 2006-06-27
309 Views
Last Modified: 2010-04-17
Hello all, does anyone know how scanned pages containing text are converted into PDF, and in what format the text is stored in the PDF? Can this text be extracted directly from the PDF, or is a technique like OCR required?

Thanks
Question by:jhav1594
3 Comments
 
LVL 1

Accepted Solution

by:
jm021196 earned 500 total points
ID: 16994505
It really depends on what app you are using and the quality of the page.

If the PDF conversion program that takes the image from the scanner can recognise the text as text (in other words, it runs OCR during conversion), then it's stored as text in the PDF file.

If the converting program cannot recognise it as text, then the page gets saved as an image, in whichever of several image formats suits it best. There really is no way to tell in advance how it's saved.

PDF files use a combination of vector, raster and text formats to give the best compression and viewability, so converting to PDF is a very difficult thing to undo, especially if it's not possible to tell in advance whether the content is going to be stored as text or not.

I would suggest that an OCR system is the best way forward.
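If you want a rough check on a file you already have, something like the Python sketch below might help decide whether OCR is needed. It is only a heuristic, not a real PDF parser: it assumes the page content streams are either uncompressed or Flate-compressed, and it simply looks for text-showing operators (Tj / TJ inside BT ... ET). The function names classify_pdf_bytes and classify_pdf are made up for this example. Encrypted files or files using other stream filters would likely come back as "unknown" or be misclassified.

```python
import re
import zlib

# Matches the body of each "stream ... endstream" section in the raw file.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# A text object (BT ... ET) that contains a text-showing operator (Tj or TJ).
TEXT_OP_RE = re.compile(rb"\bBT\b.*?\b(?:Tj|TJ)\b.*?\bET\b", re.DOTALL)
# Dictionary entry that marks a stream as a raster image.
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")


def classify_pdf_bytes(data: bytes) -> str:
    """Return "text", "image-only" or "unknown" for raw PDF bytes (heuristic)."""
    for match in STREAM_RE.finditer(data):
        # The object's dictionary sits between the "obj" keyword and "stream".
        obj_start = max(data.rfind(b"obj", 0, match.start()), 0)
        header = data[obj_start:match.start()]
        if IMAGE_RE.search(header):
            continue  # pixel data, not page content
        body = match.group(1)
        try:
            body = zlib.decompress(body)  # FlateDecode is the common case
        except zlib.error:
            pass  # uncompressed, or a filter this sketch does not handle
        if TEXT_OP_RE.search(body):
            return "text"
    # No text operators found: report image-only if any image object exists.
    return "image-only" if IMAGE_RE.search(data) else "unknown"


def classify_pdf(path: str) -> str:
    """Convenience wrapper that reads the file at `path` (hypothetical name)."""
    with open(path, "rb") as handle:
        return classify_pdf_bytes(handle.read())
```

Calling classify_pdf("scan.pdf") (a placeholder file name) and getting "image-only" would suggest the scanner software stored the pages purely as pictures, so OCR is the way to get the text out. A result of "text" only means some text operators were found; it says nothing about how accurate the recognised text is.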

Thanks
mitch