Solved

Xerox WorkCentre 7830 scanning documents to DOC files instead of PDFs

Posted on 2014-12-03
Last Modified: 2014-12-30
Hello, I have a Xerox WorkCentre 7830. When I scan a document, the only output options are PDF, PDF/A, XPS, TIFF, and JPEG, but I need to scan some documents and output them as Word documents instead of PDFs. I will be scanning about 5,000 documents, so I am not sure whether I should let the scanner create all those PDFs and then find an app to convert them to Word documents, or whether you have a better suggestion.
Question by:Deerek11
8 Comments
 
LVL 18

Expert Comment

by:web_tracker
Obviously you are looking for something that you can edit in Word. Since the printer does not have MS Office installed, it cannot convert a scanned document into a DOC file. Doing so would make the Xerox printer more expensive, because it would require a license for the MS Office suite. If you scan to an application on your computer such as Word, Word will open, but the scan will most likely be pasted into the document as a picture, which cannot be edited. I think the best workaround is to scan to PDF and then convert the PDF to a Word document with conversion software.
 

Author Comment

by:Deerek11
Any suggestions for a free or low-cost application that can convert 5,000 PDFs to DOC files?
 
LVL 38

Accepted Solution

by:Herman D'Hondt
Herman D'Hondt earned 334 total points
When you scan a document, all you end up with is pixels (dots on the page). There is no text at all in a scan. If you need text, you would need to start with a text-recognition program. As you require something low cost, you can try something like Free OCR to Word.

As the scanner will still only produce image files, the easiest way to do it is to scan from within the OCR software, which Free OCR to Word will do. Unfortunately, you then need to manually scan each page. Another way is to use a package that can scan from a command line, and then run a script that converts each file as it is scanned. Something like SimpleOCR can do that.

Another problem is multi-page documents. The printer may be able to create multi-page PDFs, but most OCR packages require TIF format, which limits them to single pages. Your best bet would be a high-end program like OmniPage. That would do the whole job at once. It will take a full directory of TIF or PDF files, and output them into DOC format.
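As a rough illustration of the "scan everything first, then run a script over the files" approach, here is a minimal Python sketch. It assumes the open-source Tesseract OCR command-line tool is installed (Tesseract is not one of the packages named above), that the scans are saved as TIFF or JPEG images, and that plain-text output is an acceptable starting point for Word. The function names and folder layout are hypothetical.

```python
from pathlib import Path
import shutil
import subprocess

# Assumption: the scanner saves TIFF or JPEG images, which Tesseract reads directly.
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg"}


def pending_scans(scan_dir: Path, out_dir: Path) -> list[Path]:
    """Return scans in scan_dir that have no matching .txt file in out_dir yet."""
    scans = sorted(
        p for p in scan_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    return [p for p in scans if not (out_dir / (p.stem + ".txt")).exists()]


def ocr_command(scan: Path, out_dir: Path, lang: str = "eng") -> list[str]:
    """Build the Tesseract call: tesseract <image> <output base> -l <language>.

    Tesseract appends ".txt" to the output base by itself.
    """
    return ["tesseract", str(scan), str(out_dir / scan.stem), "-l", lang]


def convert_all(scan_dir: Path, out_dir: Path) -> int:
    """Run OCR on every scan that has not been converted yet; return the count."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = 0
    for scan in pending_scans(scan_dir, out_dir):
        subprocess.run(ocr_command(scan, out_dir), check=True)
        converted += 1
    return converted
```

Calling `convert_all(Path("scans"), Path("text"))` can be repeated safely as new scans arrive, because files that already have a `.txt` result are skipped. The output is plain text without layout, so a commercial package may still be the better fit if the Word documents need to keep their formatting.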
 

Author Comment

by:Deerek11
So are you saying that after I scan all the documents to PDF, I will still need OCR software? Or can I just use a PDF-to-Word converter application? Also, the scanner can create XPS files; would that be easier than PDFs?

 
LVL 38

Assisted Solution

by:Herman D'Hondt
Herman D'Hondt earned 334 total points
No matter what output format the scanner gives you (PDF, TIF, JPG, XPS, etc.), the file will still not contain any text or characters. It is effectively a photo of your document. If you look at that photo, the OCR in your brain lets you read the text. In the case of a computer, that photo needs to be converted to text by OCR software.

Even if you used software to convert the PDF to DOC format, what you would then have is a Word document that contains one large image per page and no text. And if there is no text, you cannot search it or modify it. Hence the conversion is pointless: you may as well look at the image in a PDF as in a DOC.
 
LVL 18

Assisted Solution

by:web_tracker
web_tracker earned 166 total points
When a document is converted from PDF to DOC, most of the text will be recognized as text and CAN be edited. I have just confirmed that it works by scanning a poster that had both text and graphics. I scanned it as a PDF and then used Adobe Acrobat Standard to convert it to a Word document. The graphics came through as pictures and the text as normal text that could be edited. You do not need Adobe Acrobat Pro, as Adobe Acrobat Standard will convert it to Word. Often when you buy a high-end scanner, Adobe Acrobat Standard is shipped with it. That's how I obtained my copy.
 
LVL 38

Expert Comment

by:Herman D'Hondt
@web_tracker

You are correct - as long as the PDF contains text. Unfortunately, a scan does not contain any text, only an image.
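One way to see which case a given file falls into is to check whether the PDF declares any fonts, since text in a PDF is drawn with fonts and a plain scan normally has none. The Python sketch below is only a crude heuristic under that assumption: it looks for the `/Font` marker in the raw file data, and a PDF that stores its objects in compressed object streams can hide the marker. The function names are hypothetical.

```python
from pathlib import Path


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Heuristic: True when the raw PDF data declares no /Font resource.

    A file without any /Font entry is very likely a plain scan with no
    text layer. Treat the result as a hint, not proof.
    """
    return b"/Font" not in pdf_bytes


def image_only_pdfs(folder: Path) -> list[Path]:
    """List the .pdf files in folder that appear to have no text layer."""
    return sorted(
        p for p in folder.glob("*.pdf") if looks_image_only(p.read_bytes())
    )
```

Files reported by `image_only_pdfs` would still need OCR before a PDF-to-Word conversion can produce editable text.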
 
LVL 18

Expert Comment

by:web_tracker
If you scan a document as a JPG or any other image format, the whole scan is a bunch of pixels and you cannot edit the text on the page. If you scan as a PDF, however, the pictures will not be editable, but the text in the PDF document will be recognized as text and can indeed be edited. I have tried it: after scanning a poster to a PDF and converting the PDF to a Word document, I was able to edit the text portions of the poster.

You do not need separate OCR software to convert the pixels to letters; this is already built into the PDF software. Years ago you needed OCR software for that, but PDF software such as Adobe Acrobat now has OCR built in. I know this from experience scanning documents to PDF and then being able to edit the text in the scanned document. Note that if you scan to a picture format such as JPEG, GIF, or TIF, you cannot edit the text in the scan, because it is recognized as a picture, not as text.
