Offline index / search application for PDF document CD-ROMs

Hi experts,

I have a few CDs containing e-books, datasheets and various documents in PDF format. I am looking for an application that can index these files and search their contents.

This program should have the ability to:
. create full text search index from pdf files
. work offline, without inserting the library cd

Additional formats beyond PDF, or the ability to read into archive files, would be nice, but PDF is a must. I am not interested in MP3 / audio CD catalogue programs, please exclude them from your recommendations :-)

The program may be freeware or shareware.

If you know of such a program please let me know.

thanks
LVL 4
alikoankAsked:

nobusCommented:
0
nobusCommented:
0
alikoankAuthor Commented:
Hi nobus, thanks for the reply.

I actually stumbled across the first URL when I did a search on EE. It lists many programs, but I was rather hoping for a personal opinion or recommendation from someone who has actually used a product like this.

I took a look at the second URL. Jaws PDF Creator is just an editor; it does not create offline index files.
0
mchorghCommented:
This may not answer your question directly but may address your issue. If you are creating a solution for a client you may disregard - my response relates to personal archives.

I have numerous electronic documents in my library and have struggled with many methods to recall particular info. Documents are not created equal, in that the metadata is often not properly defined. If you want a full text search of PDFs, you might as well store the document online. My suggestion is more of a direction to go, and it encompasses all file types.

1. I would spend a little time upfront to properly name the document. For example: <Author><Title><Description><Publisher><Pub. Date>.<Extension>.
2. Write to CD to archive. Name Archives consistently and with date, e.g., Fiction - 2/5/2005.
3. Index the Archive manually from the Windows command prompt: dir /b /s >> "c:\Archives Index\Fiction - 2-5-2005.txt". Then store the labeled Archive offline in chronological order.
4. Index with Google Desktop (free) or the standard Windows indexer.
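
The listing in step 3 can also be scripted. Below is a minimal Python sketch, not part of the original advice, that roughly mimics `dir /b /s >>` by appending the full path of every file and folder under an archive to a text file. The function names `list_archive` and `write_listing` are made up for illustration, and the order of the entries will not match the output of `dir` exactly.

```python
import os


def list_archive(root):
    """Return the full path of every folder and file below root."""
    paths = []
    for folder, subfolders, files in os.walk(root):
        for name in sorted(subfolders) + sorted(files):
            paths.append(os.path.join(folder, name))
    return paths


def write_listing(root, index_file):
    """Append the listing of root to index_file, one path per line.

    Opening in append mode mirrors the `>>` redirect: running it again
    adds to the file instead of overwriting it.
    """
    paths = list_archive(root)
    with open(index_file, "a", encoding="utf-8") as out:
        for path in paths:
            out.write(path + "\n")
    return len(paths)
```

For example, `write_listing("D:\\", r"c:\Archives Index\Fiction - 2-5-2005.txt")` would list a CD mounted as drive D, assuming that is where the disc appears on your machine. The resulting text file can then be picked up by a desktop indexer as in step 4.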

Whatever method you use, remember to keep it consistent.

Regards,



[conceptually try
0
alikoankAuthor Commented:
Hi mchorgh, thanks for your reply,

I am looking for a solution for my own library, and I am open to suggestions.

My library consists of literally thousands of small files. Some of them are named in the way you proposed, but renaming all of them to a standard is practically impossible; that is one of the reasons I want an application to do the indexing.

Google Desktop unfortunately does not index PDF files. I found out that Adobe has a plugin for Microsoft Indexing Service, but I am not sure about its performance, how I can back up or move the index database, or how it will cope with CD drives (offline content).
0
jodygliddenCommented:
This product should do it for you.  I use it at school.  It will jump to the exact part in the pdf that contains the text strings as well which is handy for large pdf files.

http://www.isysusa.com/products/desktop/
0
alikoankAuthor Commented:
Wow, that really looks good. Just one thing: does the original file have to be present for it to work? I mean to use it for CD-ROMs, and there was no explicit information on the website about this.

thanks.
0
jodygliddenCommented:
Yes, if you want to be able to link to the data.  I don't think you'll find one where the file doesn't have to be there in order to see the info just because of the nature of storage.   Good luck
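
To illustrate the distinction between searching and opening, here is a minimal Python sketch of a full-text index that lives on the hard disk and records which disc each document came from. It is only an assumption about how such a tool could work, not a description of ISYS. The text of each PDF is assumed to have been extracted already by some other tool, and all function names and disc labels are hypothetical. A search can then run without the CD inserted, but opening a hit still needs the disc.

```python
import json
import re
from collections import defaultdict


def tokenize(text):
    """Split text into a set of lowercase words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """Build a word -> [[disc_label, path], ...] mapping.

    documents is an iterable of (disc_label, path, extracted_text).
    """
    index = defaultdict(list)
    for disc_label, path, text in documents:
        for word in sorted(tokenize(text)):
            index[word].append([disc_label, path])
    return dict(index)


def search(index, query):
    """Return sorted (disc_label, path) pairs containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = None
    for word in words:
        found = {tuple(entry) for entry in index.get(word, [])}
        hits = found if hits is None else hits & found
    return sorted(hits)


def save_index(index, filename):
    """Store the index as JSON so it can be backed up or moved."""
    with open(filename, "w", encoding="utf-8") as out:
        json.dump(index, out)


def load_index(filename):
    """Load an index previously written by save_index."""
    with open(filename, encoding="utf-8") as src:
        return json.load(src)
```

Each hit names the disc to insert, which is as far as an offline index can go; the document itself is only readable once that disc is in the drive.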
0
mchorghCommented:
Thanks.

ISYS is an enterprise product and its cost may be prohibitive for personal use.

Try http://www.docsearcher.tk/. It is free-use software in java, so you will need Sun's Java VM.

=========================

If the above did not help, I think you will find that you will have to bite the bullet on all those back files. Properly name the 10% of the files you use regularly (sort by date accessed in Windows Explorer to find them), and maintain the naming paradigm for subsequent files. Aiming for perfection will be futile; just get the most out of your efforts. I literally have tens of thousands of files and renamed groups in batches a few years ago. I use a free-use software, Scarabée Software - Siren, for renaming. It is superb and can capture some useful metadata from the document itself. For example, you can group docs in a folder titled "Taxonomy & Classification," and rename the contents, inserting the name of the parent folder within the original filename.
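
The parent-folder rename described above can be sketched in a few lines of Python. This is not how Siren works internally; it is only a hypothetical illustration, and the function names and the " - " separator are assumptions. It defaults to a dry run so the planned renames can be reviewed first.

```python
from pathlib import Path


def name_with_parent(path, separator=" - "):
    """Return path with the parent folder's name prefixed to the file name."""
    path = Path(path)
    if not path.parent.name:
        return path  # no named parent folder to insert
    prefix = path.parent.name + separator
    if path.name.startswith(prefix):
        return path  # already renamed, leave untouched
    return path.with_name(prefix + path.name)


def rename_folder(folder, dry_run=True):
    """Plan, and optionally perform, the rename for files directly in folder.

    Returns a list of (old_path, new_path) pairs. Nothing is renamed
    unless dry_run is False.
    """
    plan = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            target = name_with_parent(path)
            if target != path:
                plan.append((path, target))
                if not dry_run:
                    path.rename(target)
    return plan
```

With a folder named "Taxonomy & Classification" containing linnaeus.pdf, the plan would rename the file to "Taxonomy & Classification - linnaeus.pdf". Running it a second time plans nothing, since files that already carry the prefix are skipped.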

Full-text indexing of PDF docs can be CPU and time intensive, because files that are scanned images require conversion to text (OCR) first. Google online full-text indexes PDF docs that are available on the Internet. If your docs are freely available, use Google online first for your full-text searching needs.

I think that developing and adhering to a standard is easy relative to capturing metadata and renaming, if you so choose. Google Desktop will index the names within the text file (from the dir command). I also suggest that if you do not know a particular piece of metadata, you maintain its position and use a placeholder. For example, from the previous post, if you do not know the Author, use "NA" and then delimit. There will come a time when someone will create a cost-effective solution and you will need to convert that flat file into a database. Naming the docs properly (for your needs) will also help when you process and access information through more structured means, i.e. systematics (where you don't know of the information), as opposed to knowledge based (where you know of the information).
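
As a rough illustration of the placeholder idea, the Python sketch below builds and parses file names that follow the <Author><Title><Description><Publisher><Pub. Date>.<Extension> pattern from the earlier post. The underscore delimiter and the function names are assumptions, since the posts do not say which delimiter to use. Keeping every position filled is what makes the later conversion into database fields mechanical.

```python
FIELDS = ("author", "title", "description", "publisher", "pub_date")
DELIMITER = "_"     # assumed delimiter, not specified in the thread
PLACEHOLDER = "NA"  # stands in for metadata that is not known


def build_name(extension, **known):
    """Build a file name, using the placeholder for missing fields."""
    parts = [str(known.get(field) or PLACEHOLDER) for field in FIELDS]
    for part in parts:
        if DELIMITER in part:
            raise ValueError("field contains the delimiter: " + part)
    return DELIMITER.join(parts) + "." + extension


def parse_name(filename):
    """Split a file name back into fields; placeholders become None."""
    stem, _, extension = filename.rpartition(".")
    parts = stem.split(DELIMITER)
    if len(parts) != len(FIELDS):
        raise ValueError("unexpected number of fields in: " + filename)
    record = {
        field: (None if part == PLACEHOLDER else part)
        for field, part in zip(FIELDS, parts)
    }
    record["extension"] = extension
    return record
```

For instance, `build_name("pdf", title="Systema Naturae", pub_date="1758")` gives `NA_Systema Naturae_NA_NA_1758.pdf`, and `parse_name` turns that name back into a record whose unknown fields are `None`.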

You are correct that the Google Desktop Beta does not index PDF; it is my understanding that they are working on this feature. Microsoft, through MSN, will also release a similar search/indexer very soon.

I am aware of other solutions that are a tad more complex to set up but easy and free to use, and a few that are enterprise robust and costly.

Regards,


0
jodygliddenCommented:
Actually, ISYS has many versions, and the one that I mentioned, ISYS Desktop, is not enterprise. ISYS Desktop is perfect for what you're doing.
0
mchorghCommented:
I stand corrected, if it is so. Thanks.

However, if my recollection serves me well, ISYS Desktop 6.0 retails just under $600. I realize that the actual purchase price could be less. ISYS:Hindsite 6.0 for web cache cataloging is free-use.
0
jodygliddenCommented:
Well, it may very well be for a corporation, although they offer better pricing for different groups like educational, shareware developers, etc. It is the best at it, though.

For a good basic PDF search tool, blinkx isn't bad. Desktop.google.com is also free and is going to support that format very soon, in addition to the ones it handles now.
0
alikoankAuthor Commented:
Thank you very much, guys. I will give ISYS Desktop and DocSearcher a try on the weekend.
0
rubiconxCommented:
There is an excellent utility that indexes not only PDFs but also many other types of files that you may have stored on your computer.

http://www.copernic.com/en/products/desktop-search/index.html

The Copernic Desktop Search tool is amazing.  Knocks spots off everything else out there - everything under $500 anyway!

Try it - it's FREE, and you won't be disappointed.

NOTE: The docs need to be available for the indexing to work.
0
turn123Commented:
Hi alikoank :-),
Since we haven't heard from you for a couple of days could you please give us an update on the status of this question?
See:  http://www.experts-exchange.com/help.jsp#hi51 Thank you, turn123's friendly update request script.
Offtopic comments about this script to http://www.experts-exchange.com/Applications/Q_21188389.html please :-).
0
alikoankAuthor Commented:
Hi,

I haven't found time to try the programs, but I think I will use ISYS Desktop, until it expires at least.

Thank you for your help.
0

Question has a verified solution.
