Solved

Advice/Experience regarding online distribution of documents

Posted on 2008-06-12
Last Modified: 2013-11-15
I'm new to Experts Exchange, but I believe many of you will find this very interesting.

The company I work for has acquired a "library" of industry-specific books, periodicals, photos, manuals, white papers, case-studies, articles, legal documents etc. This library literally weighs in at 4.75 tons. The books, periodicals, manuals and other bound media are being donated to a university with which we are affiliated.

The loose documents (articles, speeches, photographs, legal documents etc.) are of primary interest to me. I have been given a mandate to create a web-based, revenue-generating distribution system through which people can purchase archived copies of this material. Here's an example of how this might play out, at least in my mind :)

1. Customer uses keywords to search our website for articles, court cases etc.
2. Search uncovers content Customer is willing to purchase
3. Customer goes through the buying process (create user/pass, verify info, complete payment etc.)
4. Once payment has cleared, Customer downloads document
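As a rough illustration of step 4, one common approach is to give the Customer a time-limited, signed download token once payment has cleared, so the file URL cannot simply be guessed or reused forever. The sketch below is a minimal Python example, not a full design. The names `SECRET_KEY`, `make_token` and `verify_token`, the token layout, and the document and customer IDs are all hypothetical; payment handling and file storage are left out.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical secret; in practice this would come from server-side configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def make_token(doc_id: str, customer_id: str, expires_at: int) -> str:
    """Return '<expiry>.<signature>' tying one document to one customer."""
    message = f"{doc_id}:{customer_id}:{expires_at}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{expires_at}.{signature}"


def verify_token(token: str, doc_id: str, customer_id: str,
                 now: Optional[int] = None) -> bool:
    """Accept the token only if it is unexpired and the signature matches."""
    if now is None:
        now = int(time.time())
    try:
        expires_str, signature = token.split(".", 1)
        expires_at = int(expires_str)
    except ValueError:
        return False  # malformed token
    if now > expires_at:
        return False  # link has expired
    expected = make_token(doc_id, customer_id, expires_at).split(".", 1)[1]
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

The site would call `make_token` after the payment clears and `verify_token` when the download link is followed. This limits casual link sharing only; it does nothing about copies made after the file has been downloaded.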

That's it. Very simple, right?  Yeah.... right.

Here is where I need advice...
* Best Practices related to scanning/archiving all manner of documents (I will be using a Fujitsu fi-5750C w/ onboard VRS Professional)
* Creation of a secure, searchable archive used to categorize, sub-categorize and cross-reference thousands of documents, photos etc.
* Electronic delivery method of files to Customers
* Prevention of unauthorized duplication (impossible task, but still...)

Any feedback, experience or advice you can provide would be greatly appreciated.

Thanks in advance,
Kabe

PS - EE indicates I am "Advanced" in this area. I am not. :(
Question by:kabelittle
Accepted Solution

by:
adamflores earned 250 total points
ID: 21775343
It sounds like you have the tools you need to create files from your documents. The question is whether you plan on creating text documents or PDFs. Text documents are nice because they are easily converted to HTML and are directly searchable, but they offer no security. You would have to copyright your individual archives and your website to give you legal protection. PDFs give you the extra protection of a password so they can't be modified, but there is no way to stop a customer from giving a file away to others. Even with Digital Rights Management you can't stop duplication; you can only look for distribution and go after those who facilitate it, so it again comes back to copyright protection. That is why the movie and music industries are suing teens and grandmothers for sharing violations.
The other thing is that OCR technology is not perfect. Either way you will have to scan the archives and create something like MS Word documents to proof the text before creating a PDF, unless you intend to scan each page as an image and make a PDF of the image, which might make it less searchable. I know Google has technology to search text in PDF documents, but it wouldn't work if the PDF is just an image.
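To show why proofed text matters for search, here is a minimal Python sketch of a keyword index built from the text of each document. It is an illustration under assumptions, not a recommendation of a specific product: the function names `tokenize`, `build_index` and `search` and the sample document IDs are made up, and a real site would more likely use an existing search engine or database full-text feature.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data: "scan-003" stands for an image-only scan with no text.
documents = {
    "case-001": "Court ruling on a patent dispute",
    "speech-002": "Speech about patent reform",
    "scan-003": "",
}
index = build_index(documents)
```

A document with no extracted text, like `scan-003` here, can never match a keyword query, which is the practical cost of selling image-only PDFs without running and proofing OCR first.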
This is obviously a monumental project, so you will have to take it in chunks. First, decide what file format your archives are going to take and what you intend to sell to customers. Then decide where you are going to host your files. This might depend on what kind of internal infrastructure you have, and on whether you have the staff to do the development in-house or plan to farm out the development and/or hosting. After all of this work you need to decide on a business model. It is easy to say you want everything automated so you don't need staff to run this business, but that might require a lot of custom work. It might be more practical, at least in the beginning, to set up a website that takes the orders and have a live person fulfill them by sending the requested file(s). Finally, once you are ready to start making sales, how do you plan on reaching your customers? How will people know if they want to be your customers? In short, how do you plan on marketing this business? If you are interested in outside development and hosting, I'd be interested in bidding.
Best regards,
Adam Flores
