Solved

Scanner: FileFormat.

Posted on 2000-02-24
Last Modified: 2010-08-05
Hi all,
 
Does anybody know which format and compression are best when scanning mainly letters (black and white)?
Scanned documents are saved to disk (a document can contain several pages), and some of them will be OCR-ed afterwards.
 
I know there's probably no single right answer, but I would like to start with something good (to test / demo) and perhaps change it afterwards.

Other advice or comments in this area are very welcome!

By the way, at the moment I'm developing with the OCX components I imported from Imaging for Windows.

I was thinking about using TIFF, but there are several possible compression settings (Huffman, packed, LZW). Are those all TIFF formats?

Thanks in advance,
 
Question by:florisb
9 Comments
 
LVL 5

Expert Comment

by:TheNeil
ID: 2553648
PCX would give you great compression and no loss in quality when dealing with B&W images (i.e. text). It's also easy to write your own load and save routines for it.
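To illustrate why a run-length scheme suits black-and-white text and loses nothing, here is a minimal sketch of the byte-level run-length coding that PCX uses. It is a simplification, not a full PCX reader or writer: it handles a single scanline only and ignores the PCX header, and the function names `pcx_rle_encode` and `pcx_rle_decode` are made up for this example.

```python
def pcx_rle_encode(row: bytes) -> bytes:
    """Run-length encode one scanline using the PCX byte scheme."""
    out = bytearray()
    i = 0
    while i < len(row):
        value = row[i]
        run = 1
        # A count byte holds at most 63 repeats (its low six bits).
        while i + run < len(row) and row[i + run] == value and run < 63:
            run += 1
        if run > 1 or value >= 0xC0:
            # Count byte: top two bits set, low six bits hold the run length.
            # Single bytes >= 0xC0 also need a count byte so they are not
            # mistaken for one when decoding.
            out.append(0xC0 | run)
            out.append(value)
        else:
            out.append(value)  # plain literal byte
        i += run
    return bytes(out)


def pcx_rle_decode(data: bytes) -> bytes:
    """Reverse pcx_rle_encode; the result is byte-for-byte the original row."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte >= 0xC0:
            out.extend(bytes([data[i + 1]]) * (byte & 0x3F))
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)
```

A scanline from a page of text is mostly long runs of identical bytes, so it shrinks a lot, and decoding returns exactly the bytes that went in.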

The Neil
 
LVL 10

Expert Comment

by:Lischke
ID: 2553715
TIFF is the usual format when it comes to scanning, and I think it has everything you need. There are also many compression schemes available, such as:

- dump mode (no compression)
- CCITT modified Huffman RLE
- CCITT Group 3 fax encoding
- CCITT Group 4 fax encoding
- Lempel-Ziv & Welch (LZW)
- TIFF 6.0 JPEG (old style)
- JPEG DCT compression
- NeXT 2-bit RLE
- CCITT modified Huffman RLE w/ word alignment
- Macintosh RLE (PackBits)
- ThunderScan RLE
- codes 32895-32898 are reserved for ANSI IT8 TIFF/IT <dkelly@etsinc.com>
- IT8 CT w/padding
- IT8 Linework RLE
- IT8 Monochrome picture
- IT8 Binary line art
- compression codes 32908-32911 are reserved for Pixar
- Pixar companded 10bit LZW
- Pixar companded 11bit ZIP
- Deflate compression
- compression code 32947 is reserved for Oceana Matrix <dev@oceana.com>
- Kodak DCS encoding
- ISO JBIG

Since you are going to OCR the documents, I think fax encoding is the best choice for you (black and white), but I don't know for sure whether it is lossless.
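To check which of these schemes a given file actually uses, one can read the Compression tag (tag 259) from the file's first image directory. The following is a minimal sketch using only the Python standard library. It assumes a classic (non-BigTIFF) file, looks at the first page only, and the name table covers just the common codes; the function name `tiff_compression` is made up for this example.

```python
import struct

# Values of the TIFF Compression tag (259) for the common schemes.
COMPRESSION_NAMES = {
    1: "none (dump mode)",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 fax",
    4: "CCITT Group 4 fax",
    5: "LZW",
    6: "old-style JPEG",
    7: "JPEG",
    32773: "PackBits (Macintosh RLE)",
    32946: "Deflate",
}

TAG_COMPRESSION = 259
TYPE_SHORT = 3


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value of the first page (IFD) of a TIFF."""
    order = data[:2]
    if order == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif order == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF file: bad magic number")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        entry = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry)
        if tag == TAG_COMPRESSION and field_type == TYPE_SHORT and count == 1:
            # A single SHORT sits in the first two bytes of the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry + 8)
            return value
    return 1  # the tag is optional; its default is 1 (no compression)
```

Reading the whole file into memory and calling `COMPRESSION_NAMES.get(tiff_compression(data), "other")` gives a readable name. For multi-page documents, each page has its own directory and may use a different scheme, which this sketch does not follow.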

Ciao, Mike
 
LVL 2

Author Comment

by:florisb
ID: 2554438
Thanks so far.
Mike, do you happen to know whether the Imaging for Windows OCX components support the suggested TIFF compression?

TIFF? Check: http://www.libtiff.org/

Sincerely,
Floris.
 
LVL 10

Accepted Solution

by:
Lischke earned 35 total points
ID: 2554576
Yep, I just tested Wang Imaging with the sample images from the TIFF library. It crashes on various formats, but it can handle G3 fax encoding.

Ciao, Mike
 
LVL 2

Author Comment

by:florisb
ID: 2554699
Hi Mike,

Thanks, great. My only problem now is that I'm doing 25 things at the same time.

Would you mind sending me your (test?) project using Wang's OCX components? That would be great: florisb@satl.com

Neil, thanks again to you too. I'll stick with TIFF. I'm on the TWAIN mailing list and am in contact with some people who know a lot about TIFF.

By the way, three years ago I made an application that ran after scanning and before OCR; it detected whether areas in a TIFF were placed horizontally. If not, it put that area back into the TIFF in a 100% horizontal position. It was very complicated... you know what I mean if you know how a TIFF is built bitwise, with all those 'bands'.
Well, never mind, social computer talk.

Good luck, and please do post more info if you have any! Or contact me in a few weeks if you'd like to know what I did with it.

Greetings,
Floris.
 
LVL 2

Author Comment

by:florisb
ID: 2554704
I hope you don't mind posting or sending the code.
 
LVL 10

Expert Comment

by:Lischke
ID: 2554761
I don't have a test project. There's an application (wangimg.exe) which uses the ocx and I have used the Imaging app to open all of those pictures.

Ciao, Mike
 
LVL 2

Author Comment

by:florisb
ID: 2555548
Those components work together in a shitty way... but I'll manage.

Floris.
 
LVL 2

Author Comment

by:florisb
ID: 2555562
Was that a joke, by the way...?
...let's hope Windows Imaging uses its own components and its own improved versions.

Question has a verified solution.
