Michael Machie

Looking for OpenSource Bar Code scanning and storage solution

- Numerous PDFs sit in a network folder, or will be pulled into the solution via a network scanner.
- Need to read the bar code and extract the 5 pieces of data for indexing, or OCR the portions of the page that carry the same data as the bar code.
- Use this data to store the document for later search and retrieval; methods may vary. Would like documents placed into folders by the date in the bar code.
- Some sort of compression, or loading into a database, is preferred to keep file size down.
- Windows or Linux based.
- Open source only: I want to get my hands dirty with it.

Any ideas?
Document Management, PDF, OCR, opensource, Document Imaging

ASKER CERTIFIED SOLUTION
Joe Winograd

Michael Machie

ASKER

Thanks Joe, I'll check those two out. I did look at ZBar but didn't browse enough to see what it can really do.
Joe Winograd

The command-line help output from both ZBar and ZXing is below; it should give you a decent idea of what they can do. Regards, Joe

usage: zbarimg [options] <image>...
scan and decode bar codes from one or more image files
options:
-h, --help      display this help text
--version       display version information and exit
-q, --quiet     minimal output, only print decoded symbol data
-v, --verbose   increase debug output level
--verbose=N     set specific debug output level
-d, --display   enable display of following images to the screen
-D, --nodisplay disable display of following images (default)
--xml, --noxml  enable/disable XML output format
--raw           output decoded symbol data without symbology prefix
-S<CONFIG>[=<VALUE>], --set <CONFIG>[=<VALUE>]
                set decoder/scanner <CONFIG> to <VALUE> (or 1)

usage: CommandLineRunner { file | dir | url } [ options ]
  --try_harder: Use the TRY_HARDER hint, default is normal (mobile) mode
  --pure_barcode: Input image is a pure monochrome barcode image, not a photo
  --products_only: Only decode the UPC and EAN families of barcodes
  --dump_results: Write the decoded contents to input.txt
  --dump_black_point: Compare black point algorithms as input.mono.png
  --multi: Scans image for multiple barcodes
  --brief: Only output one line per file, omitting the contents
  --recursive: Descend into subdirectories
  --crop=left,top,width,height: Only examine cropped region of input image(s)
  --threads=n: The number of threads to use while decoding
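
For scripting, the zbarimg options listed above are enough to pull decoded data into a program. Below is a minimal Python sketch, not a definitive implementation. It assumes `zbarimg` is on the PATH and that each result line has the form `symbology:data` when `--raw` is not given. The function names are made up for illustration, and payloads that contain line breaks would need different handling.

```python
import subprocess
from typing import List, Tuple


def build_zbarimg_command(image_path: str, raw: bool = False) -> List[str]:
    """Build a zbarimg call using only options shown in the help text above."""
    cmd = ["zbarimg", "--quiet"]  # --quiet: print only decoded symbol data
    if raw:
        cmd.append("--raw")  # drop the symbology prefix from each line
    cmd.append(image_path)
    return cmd


def parse_zbarimg_output(text: str) -> List[Tuple[str, str]]:
    """Split non-raw output lines into (symbology, data) pairs.

    Assumption: one decoded symbol per line, formatted as 'symbology:data'.
    Lines without a colon (including blank lines) are skipped.
    """
    results = []
    for line in text.splitlines():
        symbology, sep, data = line.partition(":")
        if sep:
            results.append((symbology, data))
    return results


def scan_image(image_path: str) -> List[Tuple[str, str]]:
    """Run zbarimg on one image and return whatever it decoded."""
    # No check=True here: zbarimg may exit with a non-zero status when it
    # finds nothing, and an empty result list is easier to handle.
    proc = subprocess.run(
        build_zbarimg_command(image_path),
        capture_output=True,
        text=True,
    )
    return parse_zbarimg_output(proc.stdout)
```

Calling `scan_image` on a page image returns a list of (symbology, data) pairs, which can then be split into the index fields.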


Michael Machie

ASKER

OK, really good stuff, Joe. Thanks again.
I will obviously need to dedicate a bit of time to this and watch your video etc. Give me a few days before I reply.

I can definitely work with PNG and JPG, no issues there, as long as I can find a way to reduce the sizes. I'm currently using a compression utility on the PDFs, as the scanner will pull in about 250 pages at a size of about 200 MB. With upwards of 1,000 documents a day, storage is becoming a daily struggle.
Joe Winograd

> as long as I can find a way to reduce the sizes
> storage is becoming a daily struggle

I hear you on that! I have the same issue with a client who is scanning to very large PDFs and needs parts of the PDFs split up based on QR/bar codes. After using ZBar and ZXing to read the codes, I'm making a call to the open source ImageMagick to trim into GIF files, which are very reasonably sized — my client is thrilled with the size now!
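
As a rough illustration of that kind of ImageMagick call, here is a hedged Python sketch. It is not the exact command used for the client: the option set (`-trim`, `+repage`, `-colors`) is one plausible way to cut file size, the default of 16 colors is an arbitrary assumption, and the function name is hypothetical. ImageMagick 7 installs its program as `magick`; version 6 uses `convert`, so the executable name is a parameter.

```python
import subprocess
from typing import List


def build_gif_command(
    source: str,
    target: str,
    colors: int = 16,
    executable: str = "magick",
) -> List[str]:
    """Build an ImageMagick command that writes a trimmed, reduced-palette GIF.

    -trim    removes border pixels that match the corner color
    +repage  resets the canvas offset left behind by -trim
    -colors  limits the palette, which is where most of the saving comes from
    """
    if colors < 2:
        raise ValueError("colors must be at least 2")
    return [executable, source, "-trim", "+repage", "-colors", str(colors), target]


def convert_to_gif(source: str, target: str, colors: int = 16) -> None:
    """Run the conversion; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(build_gif_command(source, target, colors), check=True)
```

Scanned black-and-white pages usually tolerate a very small palette, so it is worth trying a few `colors` values against real scans before settling on one.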

I haven't written about ImageMagick, but below are links to a few of my articles about GraphicsMagick that you may find helpful. GM is also open source and similar in capability to IM (in fact, GM is a fork of IM).

Reduce the file size of many JPG files in many folders via an automated, mass, batch compression method

Create a PDF file with Contact Sheets (montage of thumbnails) for all JPG files in a folder and each of its subfolders using an automated, batch method

Create an image (BMP, GIF, JPG, PNG, TIF, etc.) from a multi-page PDF

Convert a multi-page PDF file into multiple image files

I prefer GM over IM for interactive use, but it is easier to make command line calls to IM in programs because it offers a stand-alone EXE (have not been able to find something as easy to call with GM). Regards, Joe
Michael Machie

ASKER

Appreciated.
Joe Winograd

> So I'm still not sure if ZXing handles PDF directly or, if it does, I don't know how well.

Hi Michael,
Following up on my own comment above, I tested ZXing with PDF files — does not work. There may be a version out there that does, but the command line Windows binary that I'm using (mentioned in my first post) does not. As I said earlier, it's not an issue for me, since I was already creating a PNG for ZBar (even though ZBar can handle PDFs directly). Regards, Joe
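
Since ZXing needs an image rather than a PDF, the pages have to be rendered first. The sketch below shows one way to do that from Python with ImageMagick. It assumes ImageMagick 7 (`magick`) with Ghostscript installed for PDF support; the 300 dpi density and the output naming pattern are assumptions, and the function names are hypothetical.

```python
import os
import subprocess
from typing import List


def build_render_command(
    pdf_path: str,
    out_dir: str,
    density: int = 300,
    executable: str = "magick",
) -> List[str]:
    """Build an ImageMagick command that writes one PNG per PDF page.

    -density is given before the input so the PDF is rasterized at that
    resolution; the %03d in the output name becomes the page index.
    """
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    pattern = os.path.join(out_dir, stem + "-%03d.png")
    return [executable, "-density", str(density), pdf_path, pattern]


def render_pdf(pdf_path: str, out_dir: str, density: int = 300) -> None:
    """Render every page of the PDF into out_dir as PNG files."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(build_render_command(pdf_path, out_dir, density), check=True)
```

The resulting PNG files can then be passed to either decoder. A lower density gives smaller images, but bar codes may stop decoding if the resolution drops too far.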
Michael Machie

ASKER

Thanks Joe.
Joe Winograd

You're welcome, Michael. Good luck on the project. Regards, Joe