Solved

Best format to scan documents to for long-term use: PDF, JPG, TIF, BMP, or other?

Posted on 2014-01-25
Last Modified: 2014-01-25
I'm scanning old documents that will likely never be looked at again, or maybe once in the distant future. I'm cleaning out filing cabinets: old tax bills, closing statements from property my uncle owned and sold years ago, old school report cards, and old wills of my uncle and father, who died years ago (the estates are settled).

Likely just things our kids / grandkids / etc. will look at to reminisce, and not much more.

Best way to scan? They are text docs, so I am scanning at 200 DPI. It's the text and overall appearance that's important, not being able to zoom in and see the nuances of the font used, etc.

And then what format? PDF? JPG? Something else? A couple of things are 2-page docs, so as a PDF, it's nice that the 2 pages can be in a single file. There's no 2-page JPG, right? TIF would do that, right?

But who knows in 50+ years what formats will still be readable?

Care to guess?
12 Comments
 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 143 total points
Well, I'm willing to bet that in 50 years there's still going to be some way to read a PDF or a JPEG/PNG. You can still listen to audio tapes and vinyl, right?

PDF is the most comfortable format, both for multi-page documents and for the ability to OCR on the fly.
If you're concerned about quality and scan in b/w or have few colors, I would advise PNG. It uses lossless compression, so you're not losing quality by saving the file. And if you have few colors, the file size won't be huge.

HTH,
Dan
 
LVL 51

Assisted Solution

by:Joe Winograd, EE MVE
Joe Winograd, EE MVE earned 215 total points
Do not use BMP or JPG. You are correct: they are single-page only. When I started out in the scanning/imaging business 25 years ago, I would have recommended multi-page TIFF, and that's still a fine choice. But there's no denying that PDF has become ubiquitous, so I'd say PDF is fine, too. I have no doubt that multi-page TIFF and PDF will be readable 50+ years from now.

The much bigger issue is the media they're stored on. Imagine trying to read an 8" or even a 5 1/4" floppy disk today.

Regards, Joe
 
LVL 90

Assisted Solution

by:John Hurst
John Hurst earned 142 total points
I do what you are doing and I use PDF format. It works fine. I normally scan at 300 DPI because the disk space saved by scanning at 200 DPI is not that great, and resolution is better at 300 DPI.

Also look at your scanner settings. If I think I want to search the file down the road, I scan in Searchable PDF. I only do a few this way.

PDF is the way to go for the foreseeable future.

.... Thinkpads_User
 
LVL 51

Assisted Solution

by:Joe Winograd, EE MVE
Joe Winograd, EE MVE earned 215 total points
A few more comments. I just saw Dan's post, and my suggestion is not to use PNG. For the docs you describe, I think TIFF and PDF are better.

I'm personally scanning nearly everything to PDF Searchable Image files. That's a PDF file that has been through an OCR process, so it has both an image and a layer of (searchable!) text in it. For some docs (rarely), I'll scan to an image-only PDF (if I don't think the OCR process will work well on it).

Also, I scan almost everything at 300 DPI, black & white (1-bit). Occasionally, I'll scan at 200 DPI grayscale (8-bit) and, on even rarer occasions, at 150 or 200 DPI color (24-bit).

Regards, Joe
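To put the DPI and bit-depth choices mentioned in this thread in perspective, here is a rough sketch of the arithmetic for the raw (uncompressed) size of one page. It assumes a US Letter page of 8.5 x 11 inches, which nobody in the thread specified, and the function name is made up for illustration. Real PDF or TIFF files will be much smaller once compressed, so treat these numbers only as a way to compare settings against each other.

```python
# Rough, uncompressed size of one scanned page at different settings.
# Assumption: US Letter paper (8.5 x 11 inches). Compressed files are smaller.

PAGE_WIDTH_IN = 8.5    # assumed page width in inches
PAGE_HEIGHT_IN = 11.0  # assumed page height in inches


def raw_page_bytes(dpi: int, bits_per_pixel: int) -> int:
    """Return the uncompressed size in bytes of one page at the given settings."""
    width_px = round(PAGE_WIDTH_IN * dpi)
    height_px = round(PAGE_HEIGHT_IN * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    settings = [
        ("200 DPI black & white (1-bit)", 200, 1),
        ("300 DPI black & white (1-bit)", 300, 1),
        ("200 DPI grayscale (8-bit)", 200, 8),
        ("150 DPI color (24-bit)", 150, 24),
        ("200 DPI color (24-bit)", 200, 24),
    ]
    for label, dpi, bits in settings:
        mib = raw_page_bytes(dpi, bits) / (1024 * 1024)
        print(f"{label}: {mib:.2f} MiB before compression")
```

Going from 200 to 300 DPI multiplies the pixel count by 2.25, while going from 1-bit to 8-bit grayscale multiplies the raw size by 8, so bit depth tends to matter more for file size than the step from 200 to 300 DPI.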
 

Author Comment

by:BeGentleWithMe-INeedHelp
Yes, thanks guys. Media issues, yeah. I was going to joke with you, Dan, about how it's not all that easy to listen to vinyl or audio tape (reel to reel, 8-track). I think the next 50 years will go 'faster' than the last 50, with bigger changes.

I'm keeping these files on my 'data' drive with old and active things. Yeah, the idea of archiving things could be another question. But these and my current QuickBooks file are on the same 1TB drive, get backed up with ShadowProtect, etc. So when drive types change or I get a new machine, these docs, the QuickBooks file, and all the pics will move together to the new drive. So media won't be as big an issue in the future? (Things were put on floppy because hard drives were expensive and small. Then you forget about the floppy with that important data till it's long gone from current machines?) With bigger, relatively cheaper drives, you can keep more data 'live' (though more likely to get corrupted: the drive is spinning all the time, 1 bit changes and the file is SOL?).
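On the worry about a single bit changing on a 'live' drive: one common way to at least detect that is to record a checksum for every file once and re-check it after each move to a new drive. The sketch below is only an illustration using Python's standard `hashlib` module. The function names are hypothetical, and it detects corruption but does not repair it, so you still need a good backup to restore from.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large scans do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its SHA-256 hash."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """Return files from the manifest that are now missing or hash differently."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A possible routine: call `build_manifest` on the scan folder once, keep the result somewhere safe (for example as a JSON file next to the scans), and run `changed_files` against the copy after migrating to a new drive. An empty list means every file still matches byte for byte.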

I'm stuck in the house, snowbound, so a little cabin fever.  I'm up for the conversation if you are.

Dan - HTH means hope this helps?  Cute.  Hadn't seen that before.
 
LVL 90

Assisted Solution

by:John Hurst
John Hurst earned 142 total points
"my current QuickBooks file are on the same 1TB drive"

QuickBooks (or any similar financial system) is a VERY different question. There is NO guarantee that QB version 2025 will read your old file. People have difficulty now trying to upgrade a QB V2002 file to QB V2013 or 2014.

So if you wish to archive QB data, open the ledger, and save General Ledgers and Trial Balances.

Alternatively, purchase the new version of QB each year and upgrade the ledgers as you go along. I do this and it works just fine.

.... Thinkpads_User
 
LVL 34

Assisted Solution

by:Dan Craciun
Dan Craciun earned 143 total points
I think it's the first time Joe and I openly disagree ... somewhat :)

I did say PDF was the most "comfortable" format. BTW, you can OCR directly from inside Acrobat (Tools->Recognize Text), regardless of when you scanned the original document.

But, coming from a design background, if the appearance of the original document is important, then I don't trust lossy compression. PNGs are actually ideal for archiving because they keep every pixel as you scanned it. The downside is that they are usually bigger than JPEGs, but not if you're keeping to 8 colors.

It's about the same conversation as what's best for audio archiving: MP3 or FLAC? I would always go for FLAC, simply because I can recreate the source with 100% fidelity.
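The lossless-versus-lossy point can be shown with a tiny sketch. PNG compresses with DEFLATE, which Python exposes through the standard `zlib` module, so a round trip gives back every byte. The "lossy" half is only a stand-in: rounding pixel values is not how JPEG or MP3 actually work, and the sample pixel data is made up. It just illustrates that detail thrown away at save time cannot be recovered later.

```python
import zlib

# A made-up row of 8-bit grayscale pixels: white paper with a few dark marks.
original = bytes([255] * 60 + [12, 40, 200, 13] + [255] * 60)

# Lossless: DEFLATE (the method PNG uses) restores the data byte for byte.
restored = zlib.decompress(zlib.compress(original, 9))


def quantize(pixels: bytes, step: int = 32) -> bytes:
    """Crude lossy stand-in: round each pixel to the nearest multiple of step."""
    return bytes(min(255, round(p / step) * step) for p in pixels)


# Lossy stand-in: the dark pixels are rounded, so the original values are gone.
lossy = quantize(original)

print(restored == original)  # True
print(lossy == original)     # False
```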

Yup, HTH means hope this helps. I borrowed it about 10 years ago from a message board somewhere :)
 
LVL 51

Accepted Solution

by:Joe Winograd, EE MVE
Joe Winograd, EE MVE earned 215 total points
Yes, just make sure you keep moving all of the docs to your new computer and you'll be fine. The danger is if you archive them to external media and then that external media becomes obsolete with no devices around that will read it. But as long as the docs stay on your latest-and-greatest BelchFire 9000, you're good.
 

Author Comment

by:BeGentleWithMe-INeedHelp
As I type, I am scanning to PDF.  I don't need / want to go through the trouble of OCR. There's some proofreading / checking you need to do to see that it got it right, isn't there? It's hard enough finding the time to scan, let alone look at each doc.

2 other questions I have to post today:

What IS a good program for organizing pics and docs to make them easy to search? (As I say, I don't want to do OCR, and pics aren't going to go through OCR anyway.) I want to be able to search on keywords that have check boxes. I don't want to have to worry about free-form keywords, since I sometimes type WDW and other times Walt Disney World, etc. I'd just set a keyword, 'WDW', and then I'll see that in a list of words as I view a doc / picture and can check that box.

http://www.experts-exchange.com/Software/Photos_Graphics/Images_and_Photos/Q_28348236.html
And PDFs: as much as it's a standard, why do some PDF writers make a 20 KB file when other apps make a 100 KB file for the same document / image?  Which is best?

http://www.experts-exchange.com/Web_Development/Document_Imaging/Adobe_Acrobat/Q_28348235.html
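The check-box keyword idea above amounts to a fixed ("controlled") vocabulary: a file can only be tagged with words from a preset list, so 'WDW' and 'Walt Disney World' can never drift apart. The sketch below is not any particular product, only a minimal illustration of the idea. The keyword list, file names, and function names are all hypothetical.

```python
# Hypothetical fixed keyword list: the "check boxes" available for every file.
ALLOWED_KEYWORDS = {"WDW", "Taxes", "Estate", "School"}


def tag_file(index: dict[str, list[str]], filename: str, keywords: set[str]) -> None:
    """Attach keywords to a file, rejecting anything not on the preset list."""
    unknown = keywords - ALLOWED_KEYWORDS
    if unknown:
        raise ValueError(f"Not in the keyword list: {sorted(unknown)}")
    existing = set(index.get(filename, []))
    index[filename] = sorted(existing | keywords)


def find_files(index: dict[str, list[str]], keyword: str) -> list[str]:
    """Return every file that has the given keyword checked."""
    return sorted(name for name, tags in index.items() if keyword in tags)


if __name__ == "__main__":
    index: dict[str, list[str]] = {}
    tag_file(index, "castle_2009.jpg", {"WDW"})
    tag_file(index, "tax_bill_1998.pdf", {"Taxes", "Estate"})
    print(find_files(index, "WDW"))
```

Because free-form text is rejected up front, searching stays reliable without OCR. The index is a plain dictionary, so it could be saved next to the files with the standard `json` module.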
 

Author Comment

by:BeGentleWithMe-INeedHelp
Oh, I still have the Belchfire 8000. Have to upgrade : )
 

Author Closing Comment

by:BeGentleWithMe-INeedHelp
Thinkpad - YES!  QuickBooks would be upgraded every couple of years.  I used a bad example there; I was just describing actively used data vs. things I scan and forget about.

Yeah, anyone have VisiCalc files they need opened?!
 
LVL 90

Expert Comment

by:John Hurst
For organizing files, I use Windows Explorer. I have developed a file categorization scheme that works fine with Explorer. If you do not scan as Searchable PDF, then in most cases you can only search by file name.

Why do you get different sizes?  You asked that as another question and I answered there. It depends entirely on the scanner (settings being equal). Different scanners (Xerox and HP, say) produce different PDF sizes.

.... Thinkpads_User