Retrieving OCR document using WIA

I'm using an application that retrieves documents from a Kodak i30 scanner. The application uses WIA to interact with the scanner. I retrieve PDF documents, and I really want a digital text document instead of a picture. Because the application links the document to an activity in the CRM application with a hyperlink, it is most important that the first result is already an "OCR" document. My question is: can the WIA instruction be amended so that the outcome of a scan action is a digital text document?
Many thanks!
John
 
Paul Sauvé (Retired) Commented:
Hi jkruijt,

You do not specify the software you are using. In any case, you should be able to select the OCR output in the software options. WIA is compatible with OCR scanning.

pauls
 
jkruijt (Author) Commented:
Hi Pauls,
I'm using a CRM application. That application talks WIA to the scanner. As a result, the scanner delivers a PDF. The question now is: how do I code the WIA instruction so that the scanner delivers a PDF that is not only a picture but also a text document?
Regards,
John
 
Paul Sauvé (Retired) Commented:
Hi j,
It seems CRM is a generic term for "Customer Relationship Management" software. It is a suite of software tools: Service Management, Product Data Management, Planning & Scheduling, Master Data Management, Compliance Management, Supply Chain Management, Web 2.0, Production Management, Human Capital Management, Enterprise Performance Management, Customer Relationship Management, Sales Management, Epicor Tools and Technology, Financial Management.

Unless there is an OCR tool bundled into the suite and associated with WIA, you will have to find another solution.

But CRM is not a brand of software; it refers to what the software suite is used for. Again, it would help if I knew the name of the package (for example Terrasoft CRM, Jitbit CRM (web based software), Microsoft Dynamics, NetSuite CRM+ Software).

pauls
 
jkruijt (Author) Commented:
Hi P,
The name is SCOPE and it is local Dutch software. But I believe they have told me they "talk" WIA to the scanner. You told me that WIA is able to initiate an OCR scan. The scanner's default behavior is to scan to searchable PDFs. So I need to know which instructions the scanner needs to get (in WIA) to deliver a searchable PDF.
Regards,
J.
 
Paul Sauvé (Retired) Commented:
Hi J,

Well, I did search with Google and I didn't find anything concerning SCOPE CRM software. But, as I mentioned in my previous post, WIA (Windows Image Acquisition) supports OCR. You will probably have to contact your vendor or SCOPE to find out how to activate it. Sorry I couldn't be of more help.

P.
 
wyliecoyoteuk Commented:
WIA will not "do" OCR for you, it is a protocol to control the retrieval of an image from the scanner hardware.
The software needs to process it after retrieval.
If SCOPE does not have an OCR module, you are out of luck. Often OCR is an optional extra.

It works like this:

Scanner->WIA protocol->WIA API->Software processing.
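
As a rough illustration of that chain, here is a minimal Python sketch. It does not call WIA at all: `acquire` stands in for the WIA image transfer and `ocr` for an optional OCR engine on the host, and every name in it is hypothetical rather than taken from WIA, Kodak or SCOPE. The only point it tries to show is that the text exists solely when the receiving software has an OCR step to run after retrieval.

```python
from typing import Callable, Optional

# Hypothetical stage signatures; none of these names come from WIA or SCOPE.
AcquireFn = Callable[[], bytes]   # stands in for the WIA image transfer
OcrFn = Callable[[bytes], str]    # stands in for an optional OCR engine


def scan_document(acquire: AcquireFn, ocr: Optional[OcrFn] = None) -> dict:
    """Fetch an image, then run OCR on the host only if an engine is supplied."""
    image = acquire()  # Scanner -> WIA protocol -> WIA API
    text = ocr(image) if ocr is not None else None  # Software processing
    return {"image": image, "text": text, "searchable": text is not None}


# Stand-in stages for demonstration; no real scanner or OCR engine is involved.
def fake_acquire() -> bytes:
    return b"raw-image-bytes"


def fake_ocr(image: bytes) -> str:
    return "recognised text"


without_ocr = scan_document(fake_acquire)
with_ocr = scan_document(fake_acquire, fake_ocr)
```

Without an OCR stage the result is only the image; with one, the same image is returned together with its recognised text.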


 
wyliecoyoteuk Commented:
Just to clarify, there is no such thing as an "OCR scan". A scanned image is essentially a photograph of a document, and OCR is applied to an image after it has been captured.
 
jkruijt (Author) Commented:
@wyliecoyoteuk

Thanks for your reply. From the Microsoft site: "WIA minidriver. A WIA minidriver is the most versatile and extensible model provided by WIA. Using a WIA minidriver, IHVs can expose extensions, private interfaces, custom settings, and device-specific behavior. For example, a scanner manufacturer may choose to write a WIA scanner minidriver that exposes OCR functionality to applications."
See the picture I have added from the same site.
So I believe WIA can instruct a scanner to deliver a searchable PDF (OCR).


wiaarch.gif
 
wyliecoyoteuk Commented:
"For example, a scanner manufacturer may choose to write a WIA scanner minidriver that exposes OCR functionality to applications."

That means a manufacturer can choose to ship, or expose through an API, software that OCRs the image as it is retrieved. It does not mean that the scanner itself can be instructed to perform OCR.
Essentially, WIA supports on-the-fly OCR, but it does not supply it.
A scanner is basically a big digital camera, so unless your scanner comes with an OCR software engine, or your CRM package includes an OCR engine, WIA cannot be "told" to OCR an image.

There may be 3rd party packages that can sit "in between" your application and the scanner, but that would again depend on the availability of an API.

http://www.theleagueofpaul.com/blog/2008/09/23/codesnippet-scanning-with-wia-ocr-with-office
http://www.leadtools.com/sdk/wia/default.htm
http://www.lucion.com/terminology-wia-scanning-software.html

 
jkruijt (Author) Commented:
Hi there,
We are still not on the same track. I may be a bit confused and may have given you the wrong info. What I want to accomplish is that the scanner delivers a searchable PDF. That is what the scanner does, with no extra software loaded, when I just push the start button. Now, I don't push the start button; I push a button on the screen. The software (SCOPE) will push the scanner's button (using WIA), and in the WIA string they have to define what the outcome of the "push on button" needs to be. I know they can manipulate the outcome, for example the kind of file. In the same way, there must be an instruction set to tell the scanner (using WIA instructions) to scan to a searchable PDF instead of a TIFF file.
I hope I have explained myself more clearly now.
Regards, John
 
wyliecoyoteuk Commented:

The Kodak brochure states:

File format outputs

    * BMP, TIFF and JPEG (searchable PDF with bundled software)

So in other words, the software that comes with the scanner recognises the button push and OCRs the output.

Unfortunately, the specification page also states that it is not a WIA scanner; it only supports TWAIN or ISIS. This may be out of date. Even so, WIA is a cut-down, Windows-dependent image capture interface, much like GDI for printers. It does not have OCR capability, it requires WIA-compliant hardware, and all processing takes place on the host PC, using the manufacturer's driver and/or software.

http://www.dosadi.com/stiwiatwain.htm

If there is a way of inserting the OCR'd file into the application, it would need to be done via the Kodak software, and any commands would be specific to Kodak.

If the application does not have OCR capability, it may be possible to scan to a predetermined folder and have the application pick the file up from there.
This can be an automated procedure. Many DMS packages do it (INVU and Laserfiche are two we sell that do so); it may be included or an optional extra.
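
To give an idea of what such a pick-up step could look like, here is a minimal polling sketch in Python using only the standard library. It is an assumption-laden illustration, not how INVU, Laserfiche or SCOPE actually work: the function name `new_scans` is hypothetical, and it simply reports PDF files that have appeared in a folder since the last check.

```python
from pathlib import Path
from typing import List, Set


def new_scans(folder: Path, seen: Set[Path], suffix: str = ".pdf") -> List[Path]:
    """Return files in `folder` with `suffix` that have not been returned before."""
    found = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == suffix and p not in seen
    )
    seen.update(found)  # remember them so the next poll skips these files
    return found
```

A caller would keep one `seen` set, call `new_scans` on a timer against the folder the scanner software writes to, and hand each returned path to the application. A real version would also need to wait until the scanner software has finished writing a file before picking it up.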

Recently, some (large, expensive) scanners have started to include OCR capability in the scanner firmware, but the i30 does not seem to be one of them.
 
jkruijt (Author) Commented:
Hi, sorry for the trouble, but I see I haven't given the correct scanner type. I stated that I have an i30, but that is incorrect. We use a Kodak i1320 plus. Again, sorry for the trouble.
Regards, John
 
wyliecoyoteuk Commented:
OK that scanner is WIA compliant, but it still does not change the general process.
WIA processing takes place on the PC, and so does the OCR.
So unless your application can connect to the driver's API, via WIA or otherwise, there is no easy way to import OCR'd documents into it.
 
jkruijt (Author) Commented:
Okay, the application has to connect to the driver's API, using WIA. The question now is: what is the command string that will make the scanner produce the searchable PDF? That is still my main question.
 
wyliecoyoteuk Commented:
As stated a long way back, that would depend on whether it is possible or not.
Have you tried asking Kodak?
If it is possible, it would seem to be through the following:
IWiaItem Interface

http://msdn.microsoft.com/en-us/library/ms630113%28v=VS.85%29.aspx

IWiaItem::AnalyzeItem Method

http://msdn.microsoft.com/en-us/library/ms630010%28v=VS.85%29.aspx

 
jkruijt (Author) Commented:
@wyliecoyoteuk
I have asked Kodak and they are not willing to help me. If you can't either, I will have to close this question.
The question is not solved, but it wasn't solvable either, so I will give you the points for the trouble you have taken to assist me with this.
Regards, John
 
jkruijt (Author) Commented:
Question was not solvable. Points given based upon the support from the expert.