I agree with Ray on the general principle that HTML-to-PDF conversion is buggy.
I've used http://www.digitaljunkies.
The least promising method you have is writing in any format, using a PDF printer, and then somehow invoking OCR on the PDF. The difficulty here is that you need to render the document in an app in order to print it, so it just adds an extra place where things can go wrong.
by: Ray_Paseur Posted on 2009-09-17 at 06:20:58 ID: 25355729
Some thoughts and some experience...
FPDF works great - fast and allows very precise positioning. I have even printed business cards with it!
MS Office 2007 has built-in "save as PDF" capabilities.
HTML-to-PDF is fraught with unpleasant surprises for the unsophisticated client. The unfortunate conversation usually starts with, "Why doesn't it look the same way on the page as it does on my laptop?"
On the "thousands of documents, content blocks" front - that sounds like you need a database.
If you want to give us more specifics, we may be able to offer more specific help. Best, ~Ray
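To make the database suggestion above a little more concrete, here is a minimal sketch of how reusable content blocks might be stored and assembled into documents. Everything in it is an assumption for illustration: the table names (blocks, documents, document_blocks), the sample rows, and the helper functions build_demo_db and assemble_document are hypothetical, and SQLite via Python's standard sqlite3 module is used only because it needs no setup. The real schema would depend on the specifics of the project.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical block/document schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Reusable pieces of content, stored once.
        CREATE TABLE blocks (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL,
            body TEXT NOT NULL
        );
        CREATE TABLE documents (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        -- Which blocks appear in which document, and in what order.
        CREATE TABLE document_blocks (
            document_id INTEGER NOT NULL REFERENCES documents(id),
            block_id    INTEGER NOT NULL REFERENCES blocks(id),
            position    INTEGER NOT NULL,
            PRIMARY KEY (document_id, position)
        );
        """
    )
    # Made-up sample data, purely for illustration.
    conn.executemany(
        "INSERT INTO blocks (id, name, body) VALUES (?, ?, ?)",
        [
            (1, "greeting", "Dear customer,"),
            (2, "terms", "Standard terms apply."),
            (3, "closing", "Kind regards"),
        ],
    )
    conn.execute("INSERT INTO documents (id, title) VALUES (1, 'Welcome letter')")
    conn.executemany(
        "INSERT INTO document_blocks (document_id, block_id, position) VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 2), (1, 3, 3)],
    )
    conn.commit()
    return conn


def assemble_document(conn, document_id):
    """Return the document text by joining its blocks in position order."""
    rows = conn.execute(
        """
        SELECT b.body
        FROM document_blocks AS link
        JOIN blocks AS b ON b.id = link.block_id
        WHERE link.document_id = ?
        ORDER BY link.position
        """,
        (document_id,),
    ).fetchall()
    return "\n".join(body for (body,) in rows)


if __name__ == "__main__":
    print(assemble_document(build_demo_db(), 1))
```

The point of the link table is that a block is edited in one place and every document that uses it picks up the change. The assembled text could then be handed to whichever PDF generator is chosen.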