scanner code

I need code that will scan an image into an Access database.

ASKER CERTIFIED SOLUTION

srirambm


waty

For my part, I use the Kofax OCX, which is powerful and very easy to use.
I use it with three different scanners.

Anyway, here are some other links:
http://www.avdf.com/feb99/art_r001.html
http://www.accusoft.com/Digital_Imaging/ImageGear/ImageGear.htm
http://www.twain.org/index.html
http://www.dosadi.com/download.htm

Flatbed scanner, Hand Held, or any TWAIN compliant device can be accessed
in Visual Basic with the EZTWAIN DLL (Included free as freeware).
I like this much better than the KODAK OCX for scanning.
This uses less resources and less coding. Don't flood me with a ton
of questions on this project.
Once you use it, you will know as much as I do.
REMEMBER to copy the EZTW32.DLL into your Windows System Directory!!!!!

http://www.planetsourcecode.com/vb/default.asp?lngCId=3924&lngWId=1

mark2150

Jeez, you DON'T want to keep scanned files directly in the Access database! This will cause your database to bloat rapidly and will severely hurt image retrieval and display time.

Only keep a *pointer* to the scanned file in the database. Keep the images out on disk. If you store the full path you can index many more images than any single disk can hold without your database getting out of hand.
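The pointer approach can be sketched roughly as follows. This is only an illustration, not the code from the system described in this thread: it uses Python and SQLite as stand-ins for VB and Access, and the table name, column names, and function names are all made up for the example.

```python
import sqlite3

def open_index(db_path=":memory:"):
    """Create (if needed) a table that holds only file paths, not image bytes."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        " id INTEGER PRIMARY KEY,"
        " label TEXT NOT NULL,"             # hypothetical search key
        " file_path TEXT NOT NULL UNIQUE)"  # full path to the scan on disk
    )
    return conn

def add_image(conn, label, file_path):
    """Record where the scanned file lives; the image itself stays on disk."""
    cur = conn.execute(
        "INSERT INTO images (label, file_path) VALUES (?, ?)",
        (label, file_path),
    )
    conn.commit()
    return cur.lastrowid

def find_paths(conn, label):
    """Return the stored paths for a label, ready to hand to an image viewer."""
    rows = conn.execute(
        "SELECT file_path FROM images WHERE label = ? ORDER BY id", (label,)
    )
    return [r[0] for r in rows]
```

Each row costs a few hundred bytes no matter how large the scan is, and because the full path is stored, the files can be spread over as many disks as needed.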

To display the images you'll normally need a picture box or other control. All the standard controls are filled with LoadPicture({filespec}), which gets its input from *disk*. If you have the image in the Access database you'll find that you have to pull the image out of the database, write it to disk, THEN load it into the control. It will be twice as fast to just load the image directly from disk.

Image files can get *HUGE* in a hurry. Access databases can only hold about 2 GB before they run out of gas. If you're creating large (10 MB) images then you can only hold about 200. If you save the *pointers* then you can easily hold 200 *thousand* images with room to spare.
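The arithmetic behind those numbers, as a quick sketch. The 2 GB limit and the 10 MB image size come from the post; decimal units and the 1,000 bytes per pointer row (path plus overhead) are assumptions chosen only to make the comparison concrete.

```python
GB = 1000 ** 3  # decimal units, assumed for simplicity
MB = 1000 ** 2

def records_that_fit(db_limit_bytes, bytes_per_record):
    """How many records of a given size fit under the database size limit."""
    return db_limit_bytes // bytes_per_record

ACCESS_LIMIT = 2 * GB

# Images stored inside the database at 10 MB each.
images_in_db = records_that_fit(ACCESS_LIMIT, 10 * MB)

# Pointer rows only; 1,000 bytes per row is a generous guess.
pointer_rows = records_that_fit(ACCESS_LIMIT, 1000)

print(images_in_db, pointer_rows)
```

Under these assumptions the first figure is 200 and the second is 2,000,000, so 200 thousand pointer rows use only a tenth of the limit.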

Anyway, the other experts' comments about the scanning software are valid. The scanner I bought came with "cover sense" software: as soon as you lift the cover, the scanner software fires. Scan and save to disk, and your app can sense the new file and automatically add it to the index. Clean, simple to code, and straightforward for the operator to run.
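One simple way an app could "sense the new file" is to poll the scan folder and compare it against what has already been indexed. The sketch below is in Python rather than VB, and the function name and the extension list are assumptions for illustration only.

```python
import os

def find_new_files(folder, already_indexed, extensions=(".tif", ".pdf")):
    """Return full paths of scan files in folder that are not yet indexed.

    already_indexed is any container of full paths (a set works well).
    """
    new_files = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # skip subfolders
        if not name.lower().endswith(extensions):
            continue  # skip anything that is not a scan
        if path not in already_indexed:
            new_files.append(path)
    return new_files
```

Called from a timer every few seconds, this returns only the files that appeared since the last pass, provided each returned path is added to the index (and to already_indexed) afterwards.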

I've built a document imaging system that holds several hundred thousand pages spread across four online volumes with more archived on CD. Works like a top. About four seconds from selection of candidate document to page display. Trust me, you DO NOT want to save images *in* the database.

M

Robertwilliams

ASKER

I need time to review this answer.

Sincerely
Robert Williams

mark2150

If you're interested in the source to the document scanner, let me know...

M

waty

mark2150, I am quite interested in your code. I wrote much the same thing for my work, and I would like to know how you did it :)
If you are interested, I can show you how I did mine.
Could you contact me at waty.thierry@usa.net :)

mark2150

I cheated. I used the HP document scanner software that acts like a copier. It scans and immediately prints. I installed Adobe Acrobat on the machine and told the HP software to print to the Adobe printer! Presto! I have a .PDF file!

This particular customer is a collection agency. They get cartons of paper in from their clients every week. These are the source documents. Each piece of paper has a client and debtor number associated with it. Once the images have been scanned a clerk goes thru the Adobe document and uses the Note feature to drop a note on each page. The text of the note is two lines, first is client, second is debtor. When this process is completed the .PDF is saved to an alternate directory.

Once all of the files have been tagged, my EXTRACT program comes into play. It looks on the currently active LAN drive for matching filenames (the files are named for the client). If it finds a matching name, it appends the new data to the old. If not, it just copies the file. Once all files have been moved to the LAN, it then scans *all* of the files and extracts the notes. The note information, along with the page number, the name of the file, and the ID of the disk volume it resides on, is all saved to an Access database on the LAN. This module also monitors disk volume usage and limits the amount of data on a volume to what will fit on a CD. It is also used to create CDs from the volumes and to initialize the volumes. It even prints the covers for the CD "Jewel Cases".
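The "limit a volume to what fits on a CD" rule can be sketched as a simple fill-until-full loop. This is not the EXTRACT source, just a Python illustration; the 650 MB capacity figure matches the volume size mentioned in this thread, and the function and variable names are invented.

```python
CD_CAPACITY = 650 * 1000 * 1000  # bytes; assumed usable CD capacity

def assign_to_volumes(files, capacity=CD_CAPACITY):
    """Group (name, size_in_bytes) pairs into volumes no larger than capacity.

    Files are taken in order; a new volume is started whenever the next
    file would push the current one past the limit.
    """
    volumes = []
    used = 0
    for name, size in files:
        if size > capacity:
            raise ValueError(f"{name} is larger than a whole volume")
        if not volumes or used + size > capacity:
            volumes.append([])  # start a fresh volume
            used = 0
        volumes[-1].append(name)
        used += size
    return volumes
```

For example, with a capacity of 650, the files a (400), b (300) and c (200) end up as one volume holding a and a second holding b and c.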

(The client has a server with four 1 GB Jaz drives and we keep all four spindles full of data. Since we limit to CD capacity, that means we typically have 650 MB/volume or about 2.5 GB online at any given time. A typical scanned page is 20 KB so we've got over 100,000 pages of data available at any given time!)

The above is the "back end" process. The "front end" is used by the collectors. I have another little program, FINDER, that allows the collectors to plug in a client and/or debtor number. The shared Access database is searched and a list of candidate documents is generated. The Access database keeps extending itself as CDs are created and maintains the complete document list going back to the day the system was installed. It not only knows what documents are online, but also knows what documents are on CD and, more importantly, WHICH CD the document is on.

The collector clicks on a document and inside about four seconds (for online documents) the page is displayed in Adobe Reader. If the requested page is on CD, a dialog pops telling the collector what CD to install in their local machine to get the required document.
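The lookup side might look something like the sketch below. It is not the FINDER source: it uses Python with SQLite standing in for the shared Access database, and the table layout, column names, and message wording are all assumptions made for the example.

```python
import sqlite3

def build_index():
    """In-memory stand-in for the shared index database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        " client TEXT, debtor TEXT, file_name TEXT,"
        " page INTEGER, volume_id TEXT, online INTEGER)"
    )
    return conn

def find_pages(conn, client=None, debtor=None):
    """Search by client and/or debtor number; either one may be omitted."""
    sql = "SELECT file_name, page, volume_id, online FROM pages WHERE 1=1"
    params = []
    if client is not None:
        sql += " AND client = ?"
        params.append(client)
    if debtor is not None:
        sql += " AND debtor = ?"
        params.append(debtor)
    sql += " ORDER BY file_name, page"
    return conn.execute(sql, params).fetchall()

def how_to_open(row):
    """Tell the user whether the page is online or which CD to insert."""
    file_name, page, volume_id, online = row
    if online:
        return f"open {file_name} at page {page}"
    return f"insert CD {volume_id} to view {file_name} page {page}"
```

Because the index records the volume ID for every page, the same query answers both "where is it?" and "which CD do I need?".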

The collector can also click on the [Print] button in the FINDER app and not only will it call up the page, but it'll submit it for printing automatically.

This is *WAY COOL* and runs like a truck. Right now I'd say we have at least half a million pages indexed on the system (it's been running for a year...) with, say, 125,000 pages accessible online and the balance on CD.

M

waty

For my part, here is the system I work on:

Each day, we have 8 people who cut up the press and pull out all the interesting articles.
All those articles are scanned into temporary files.
After scanning, the articles are indexed in an Oracle database (I store only the links) with several keywords and other information related to the articles.
The files are stored as TIF files.
At midday, we generate around 500 profiles to print personalized press reviews on 3 Xerox printers (130 PPM).

The articles are also run through OCR and then converted to PDF. Those PDF files are published on our intranet so that the daily press can be searched.

We use VB to scan with the Kofax OCX, and also a VB interface to index all those articles.
The DB is Oracle and we have around 10 processes (in VB) which do some processing on the server (extracting images, creating SGML files, creating PS files, OCR, ...)

We also have a system (in VB) to search the whole DB using all the keywords....

This system is very fast and reliable (NB: we moved the DB from Unix to NT last week).

We have a DB containing more than 800,000 articles.

mark2150

Yep. The only way you can keep that volume of data indexed is to keep the images *outside* the database. Access databases have a 2 GB limit and both our apps are holding far more data than that with relatively small .MDB files.

One Q tho Waty, since the images are in .PDF why not just use the Acrobat search feature? It's plenty fast. Or are you searching across files?

M

waty

In the intranet version, we could search inside the PDF files, but most of the searches are done using the keywords.

Robertwilliams

ASKER

I'm interested in program code. Some details:

I practice Land Surveying.
I'm constantly reviewing, comparing deeds (property descriptions).

I need a database where I enter the Book-Page field, then scan the image.

The deed may be more than 1 image, if so, the next record is filled in and scanned.

Sincerely Robert Williams

srirambm

Well, if you are not a seasoned programmer, you are not going to get this done by yourself. Your best bet will be
to hire a part-time guy to do the job.
There are good experts here who may be willing to do the job for some real cash.

waty

Robertwilliams, let me know, I could be interested :)
waty.thierry@usa.net

mark2150

Sounds really similar to my app with Book-Page instead of Client/Debtor. Sand off the legends & the cores should be almost identical.
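With Book-Page as the key, the index could be sketched like this. It is an illustration only, in Python with SQLite standing in for Access; the table, the column names, and the idea of numbering a deed's images with a sequence column are assumptions, not anything taken from either app in this thread.

```python
import sqlite3

def open_deed_index(db_path=":memory:"):
    """One row per scanned image; a deed with several images gets several rows."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deed_images ("
        " book TEXT NOT NULL, page TEXT NOT NULL, seq INTEGER NOT NULL,"
        " file_path TEXT NOT NULL,"
        " PRIMARY KEY (book, page, seq))"
    )
    return conn

def add_deed_image(conn, book, page, file_path):
    """Append the next image for a Book-Page and return its sequence number."""
    (last_seq,) = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) FROM deed_images"
        " WHERE book = ? AND page = ?",
        (book, page),
    ).fetchone()
    seq = last_seq + 1
    conn.execute(
        "INSERT INTO deed_images VALUES (?, ?, ?, ?)",
        (book, page, seq, file_path),
    )
    conn.commit()
    return seq

def images_for_deed(conn, book, page):
    """Return the image paths for one deed in scan order."""
    rows = conn.execute(
        "SELECT file_path FROM deed_images"
        " WHERE book = ? AND page = ? ORDER BY seq",
        (book, page),
    )
    return [r[0] for r in rows]
```

Entering the Book-Page once and calling add_deed_image after each scan keeps all the images of a multi-image deed together and in order.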

How many documents we talking about here?

M

Robertwilliams

ASKER

Mark

Some quick math:

1 complex boundary problem
about 15 parcels (abstracts), each abstract about 50 images
well under 1,000 images

I do about 30 of these jobs a year.

client/debtor is more than likely like Book-Page

Thanks
Robert Williams

mark2150

So you're still looking at over 20,000 images per year!

15 X 50 = 750 images/job

30 X 750 = 22,500 images annually!

Yep, you can use my stuff alright! - HEY WATY! HANDS OFF, I saw him first! (grin)

(actually Waty, go for it - I just had to razz you a little bit!)

M

waty

hehehe :)

mark2150, is it possible to take a look at your application (or code)?
you can reach me at waty.thierry@usa.net