Xerox WorkCentre 7830: scanning documents to DOC files instead of PDFs

Hello, I have a Xerox WorkCentre 7830. When I scan a document, the only output options are PDF, PDF/A, XPS, TIFF, and JPEG, but I need to scan some documents and output them as Word documents instead of PDFs. I will be scanning about 5000 documents, so I am not sure whether I should let the scanner create all those PDFs and then find an app to convert them to Word documents, or do you guys have a better suggestion?
Deerek11 Asked:

web_tracker (Computer Service Technician) Commented:
Obviously you are looking for something that you can edit in Word. Since the printer does not have MS Office installed, it cannot convert a scanned document into a DOC file. Building that in would make the Xerox printer more expensive, because it would require a license for the MS Office suite. If you scan to an application on your computer such as Word, Word would open, but the scan would most likely be pasted into the document as a picture, which cannot be edited. I think the best workaround is to scan to PDF and then convert the PDFs to Word documents with some type of conversion software.
Deerek11 (Author) Commented:
Any suggestions for a free or low-cost application that can convert 5000 PDFs to DOCs?
hdhondt Commented:
When you scan a document, all you end up with is pixels (dots on the page). There is no text at all in a scan. If you need text, you would need to start with a text-recognition program. As you require something low cost, you can try something like Free OCR to Word.

As the scanner will still only produce image files, the easiest way to do it is to scan from within the OCR software, which Free OCR to Word will do. Unfortunately, you then need to manually scan each page. Another way is to use a package that can scan from a command line, and then run a script that converts each file as it is scanned. Something like SimpleOCR can do that.

Another problem is multi-page documents. The printer may be able to create multi-page PDFs, but many low-cost OCR packages expect TIF input and handle only one page per file. Your best bet would be a high-end program like OmniPage. That would do the whole job at once: it will take a full directory of TIF or PDF files and output them in DOC format.
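
For the script route, here is a minimal Python sketch of a batch loop over a folder of scans. It only pairs each scanned PDF or TIF with a DOC output path and hands the pair to a converter you supply. The names `plan_conversions`, `convert_all`, and `make_cli_converter` are made up for illustration, and the command line passed to `make_cli_converter` is an assumption: check what your OCR package actually accepts before relying on it.

```python
import subprocess
from pathlib import Path

# File types the scanner might hand us (an assumption; adjust as needed).
SCAN_SUFFIXES = {".pdf", ".tif", ".tiff"}


def plan_conversions(scan_dir, out_dir):
    """Pair every scanned file in scan_dir with a .doc path in out_dir."""
    scan_dir, out_dir = Path(scan_dir), Path(out_dir)
    scans = sorted(
        p for p in scan_dir.iterdir()
        if p.is_file() and p.suffix.lower() in SCAN_SUFFIXES
    )
    return [(p, out_dir / (p.stem + ".doc")) for p in scans]


def convert_all(scan_dir, out_dir, run_ocr):
    """Call run_ocr(src, dst) for each scan whose output does not exist yet."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    converted = []
    for src, dst in plan_conversions(scan_dir, out_dir):
        if dst.exists():
            continue  # already converted on an earlier run
        run_ocr(src, dst)
        converted.append(dst)
    return converted


def make_cli_converter(command):
    """Wrap a command-line OCR tool as a run_ocr callable.

    `command` is a hypothetical argument list; the source and destination
    paths are simply appended to it.
    """
    def run_ocr(src, dst):
        subprocess.run([*command, str(src), str(dst)], check=True)
    return run_ocr
```

Because finished outputs are skipped, the loop can be re-run while the scanner is still filling the folder, which is roughly the "convert each file as it is scanned" idea. Two scans with the same base name (say `a.pdf` and `a.tif`) would map to the same DOC file, so only the first would be converted.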

Deerek11 (Author) Commented:
So are you saying that after I scan in all the documents and the scanner has created PDFs, I will still need some OCR software? Or can I just use a PDF-to-Word converter application? Also, the scanner can create XPS files. Would that be easier than PDFs?
hdhondt Commented:
No matter what output format the scanner gives you (PDF, TIF, JPG, XPS, etc), the file will still not contain any text or characters. It is effectively a photo of your document. If you look at that photo, the OCR in your brain lets you read the text. In the case of a computer, that photo needs to be converted to text by OCR software.

Even if you used software to convert the PDF to DOC format, what you would then have is a Word document that contains 1 large image per page - no text. And, if there is no text, you cannot search it or modify it. Hence the conversion is pointless: you may as well look at the image from a PDF as from a DOC.
web_tracker (Computer Service Technician) Commented:
When a document is converted from PDF to DOC, most of the text will be recognized as text and CAN be edited. I have just confirmed that it works by scanning a poster that had both text and graphics. I scanned it as a PDF and then used Adobe Acrobat Standard to convert it to a Word document. The graphics came through as pictures, and the text came through as normal text that could be edited. You do not need Adobe Acrobat Pro, as Adobe Acrobat Standard will convert it to Word. Often when you buy a high-end scanner, Adobe Acrobat Standard is shipped with the scanner. That's how I obtained my copy.
hdhondt Commented:
@web_tracker

You are correct - as long as the PDF contains text. Unfortunately, a scan does not contain any text, only an image.
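
One rough way to tell which kind of PDF you have is to look for font resources in the file, since a page that is only a scanned image normally has none. The Python sketch below is a crude heuristic under that assumption, not a real PDF parser. The function names are made up for illustration, and PDFs that store their objects in compressed streams can hide the font entries, so an "image only" result should be confirmed with a proper PDF tool.

```python
from pathlib import Path


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Heuristic: no /Font name in the raw bytes suggests there is no text layer.

    This can wrongly report True for PDFs whose dictionaries are compressed.
    """
    return b"/Font" not in pdf_bytes


def split_scans(pdf_dir):
    """Return (need_ocr, have_text) lists of the *.pdf files in pdf_dir."""
    need_ocr, have_text = [], []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = need_ocr if looks_image_only(pdf.read_bytes()) else have_text
        target.append(pdf)
    return need_ocr, have_text
```

Files that land in the first list would still need OCR before any PDF-to-Word conversion can produce editable text.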
web_tracker (Computer Service Technician) Commented:
If you scan a document as a JPG or any other picture format, the whole scan is a bunch of pixels and you cannot edit the text on the page. If you scan as a PDF, any pictures will not be editable, but the text in the PDF document will be recognized as text and can indeed be edited. I have tried it: after scanning a poster to a PDF and converting the PDF to a Word document, I was able to edit the text portions of the poster. You do not need separate OCR software to convert the pixels to letters, because this is already built into the PDF software. Years ago you needed OCR software for that, but PDF software such as Adobe Acrobat now has OCR built in. I know this from experience scanning documents to PDF and then being able to edit the text in the scanned document. Note that if you scan in a picture format such as JPEG, GIF, or TIF, you cannot edit the text in the scan, because it is recognized as a picture, not as text.