I am looking for a way to bulk-OCR around 50,000 PDFs to make them searchable. There are around 6 million pages in total.
What would be the best software and hardware to do this on at a reasonable cost? (Acrobat Capture is too expensive.)
The system needs to be robust and automatic, so that after a crash it resumes OCRing on its own without manual intervention (Acrobat Pro, for example, does not seem to do this). It also needs to move each processed document to a new folder.
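To make the requirement concrete, here is a rough Python sketch of the behaviour I have in mind, not a working solution. The folder layout, the function names `process_folder` and `run_ocr_cli`, and the use of the `ocrmypdf` command are all assumptions for illustration; any OCR tool that takes an input path and an output path could be plugged in. A file only leaves the inbox once its OCR output is complete, so simply rerunning the script after a crash picks up where it left off.

```python
import shutil
import subprocess
from pathlib import Path


def run_ocr_cli(src: Path, dst: Path) -> None:
    # Assumption: an OCR command-line tool that takes input and output paths.
    # "ocrmypdf" is only an example; substitute whichever tool is chosen.
    subprocess.run(["ocrmypdf", str(src), str(dst)], check=True)


def process_folder(inbox: Path, done: Path, originals: Path, ocr=run_ocr_cli) -> int:
    """OCR every PDF in `inbox`; return the number of files completed."""
    done.mkdir(parents=True, exist_ok=True)
    originals.mkdir(parents=True, exist_ok=True)
    completed = 0
    for src in sorted(inbox.glob("*.pdf")):
        # Write to a temporary name so a crash never leaves a half-written
        # file under the final name.
        tmp = done / f"{src.stem}.part.pdf"
        try:
            ocr(src, tmp)
        except Exception as exc:
            # Leave the source in the inbox so a later run retries it.
            print(f"failed: {src.name}: {exc}")
            tmp.unlink(missing_ok=True)
            continue
        tmp.replace(done / src.name)                      # publish the OCRed file
        shutil.move(str(src), str(originals / src.name))  # inbox entry is now finished
        completed += 1
    return completed
```

Run from a scheduler or a service manager that restarts the script if it exits, something like this would cover both the "restart automatically" and the "move to a new folder" parts. What I still need is advice on the OCR engine and hardware to put behind it.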
I am happy to use either Windows-based or Linux-based software.