
Solved

offline index / search application for pdf document cd-rom's

Posted on 2004-11-18
Last Modified: 2008-02-01
Hi experts,

I have a few CDs containing e-books, datasheets and various documents in PDF format. I am looking for an application that can index these files and run searches on their contents.

This program should have the ability to:
. create full text search index from pdf files
. work offline, without inserting the library cd

Support for additional formats beyond PDF, or for reading inside archive files, would be nice, but PDF is a must. I am not interested in MP3 / audio CD catalogue programs, so please exclude them from your recommendations :-)

The program may be freeware or shareware.

If you know of such a program please let me know.

thanks
Question by:alikoank
16 Comments
 
LVL 93

Expert Comment

by:nobus
ID: 12614691
0
 
LVL 93

Expert Comment

by:nobus
ID: 12614715
0
 
LVL 4

Author Comment

by:alikoank
ID: 12614932
Hi nobus, thanks for reply

I actually stumbled across the first URL when I did a search on EE. It lists many programs, but I was rather hoping for a personal opinion/recommendation from someone who had actually used a product like this.

I took a look at the second URL. Jaws PDF Creator is just an editor; it does not create offline index files.
0
 

Expert Comment

by:mchorgh
ID: 12615370
This may not answer your question directly but may address your issue. If you are creating a solution for a client you may disregard - my response relates to personal archives.

I have numerous electronic documents in my library and have struggled with many methods to recall particular info. Documents are not created equal, in that the metadata is often not properly defined. If you want a full text search of PDFs, you might as well store the documents online. My suggestion is more of a direction to go in, and it encompasses all file types.

1. I would spend a little time upfront to properly name the document. For example: <Author><Title><Description><Publisher><Pub. Date>.<Extension>.
2. Write to CD to archive. Name Archives consistently and with date, e.g., Fiction - 2/5/2005.
3. Index the Archive manually from the command prompt: dir /b /s >> "c:\Archives Index\Fiction - 2-5-2005.txt". Then store the labeled Archive offline in chronological order.
4. Index with Google Desktop (free) or the standard Windows indexer.
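
For anyone who would rather script step 3 than type the dir command, here is a minimal Python sketch of the same idea. It is only an illustration: the function name write_archive_index and the example paths are made up for this post, and unlike dir /b /s it lists files only, not folders.

```python
import os


def write_archive_index(root, index_path):
    """Append the full path of every file under root to index_path.

    Mirrors the ">>" in the dir command above: the index file is opened
    in append mode, so several CDs can be logged into one text file.
    Returns the number of file paths written.
    """
    count = 0
    with open(index_path, "a", encoding="utf-8") as out:
        # os.walk visits root first, then each subfolder in turn.
        for folder, _subdirs, files in os.walk(root):
            for name in sorted(files):
                out.write(os.path.join(folder, name) + "\n")
                count += 1
    return count


# Hypothetical usage, with the CD in drive D: and the index kept on the hard disk:
# write_archive_index("D:\\", r"c:\Archives Index\Fiction - 2-5-2005.txt")
```

Because the text file lives on the hard disk, it can still be searched after the CD is back on the shelf, though only by file name, not by content.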

Whatever method you use, remember to keep it consistent.

Regards,



[conceptually try
0
 
LVL 4

Author Comment

by:alikoank
ID: 12615809
Hi mchorgh, thanks for your reply,

I am looking for a solution for my own library, I am open to suggestions.

My library consists of literally thousands of small files. Some of them are named in the way you proposed, but renaming all of them to a standard is practically impossible; that is one of the reasons I want an application to do the indexing.

Google Desktop unfortunately does not index PDF files. I found out that Adobe has a plugin for Microsoft Indexing Service, but I am not sure about its performance, how I could back up or move the index database, or how it would cope with CD drives (offline content).
0
 
LVL 1

Accepted Solution

by:
jodyglidden earned 960 total points
ID: 12615886
This product should do it for you.  I use it at school.  It will jump to the exact part in the pdf that contains the text strings as well which is handy for large pdf files.

http://www.isysusa.com/products/desktop/
0
 
LVL 4

Author Comment

by:alikoank
ID: 12616194
Wow, that really looks good. Just one thing: does the original file have to be present for it to work? I mean to use it for CD-ROMs, and there was no explicit information on the website about this.

thanks.
0
 
LVL 1

Expert Comment

by:jodyglidden
ID: 12617245
Yes, if you want to be able to link to the data.  I don't think you'll find one where the file doesn't have to be there in order to see the info just because of the nature of storage.   Good luck
0
 

Assisted Solution

by:mchorgh
mchorgh earned 240 total points
ID: 12617701
Thanks.

ISYS is an enterprise product and its cost may be prohibitive for personal use.

Try http://www.docsearcher.tk/. It is free-use software written in Java, so you will need Sun's Java VM.

=========================

If the above did not help, I think you will find that you will have to bite the bullet on all those back files. Properly name the 10% of the files you use regularly (sort by date accessed in Windows Explorer to find them), and maintain the naming paradigm for subsequent files. Trying to achieve perfection will be futile; just make the most of your efforts. I literally have tens of thousands of files and renamed groups of them in batches a few years ago. I use a free-use program, Scarabée Software - Siren, for renaming. It is superb and can capture some useful metadata from the document itself. For example, you can group docs in a folder titled "Taxonomy & Classification," and rename the contents, inserting the name of the parent folder into the original filename.
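
To make the parent-folder idea concrete, here is a rough Python sketch. This is not how Siren works internally; it is a stand-in that assumes the folder name goes in front of the original filename, separated by " - ". The function names and the separator are invented for this example.

```python
from pathlib import Path


def name_with_parent(path):
    """Return '<parent folder> - <original filename>' for the given path."""
    p = Path(path)
    return f"{p.parent.name} - {p.name}"


def rename_with_parent(folder, dry_run=True):
    """Prefix each file in folder with the folder's own name.

    With dry_run=True (the default) nothing is renamed; the function only
    returns the planned (old name, new name) pairs so they can be reviewed.
    Files that already carry the prefix are skipped.
    """
    planned = []
    for p in sorted(Path(folder).iterdir()):
        if p.is_file() and not p.name.startswith(p.parent.name + " - "):
            target = p.with_name(name_with_parent(p))
            planned.append((p.name, target.name))
            if not dry_run:
                p.rename(target)
    return planned
```

Running it with dry_run=True first is the cautious choice: batch renames are hard to undo, so it is worth reading the planned names before committing.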

Full text indexing of PDF docs will be CPU and time intensive as the files are images and will require conversion to text first. Google's web search does full-text indexing of PDF docs that are available on the Internet. If your docs are freely available online, use Google first for your full text searching needs.

I think that developing and adhering to a standard is easy relative to capturing the metadata and renaming, if you so choose. Google Desktop will index the names within the text file (from the dir command). I also suggest that if you do not know a particular piece of metadata, you maintain its position or use a placeholder. For example, from the previous post, if you do not know the Author, use "NA" and then delimit. There will come a time when someone will create a cost-effective solution and you will need to convert that flat file into that database. Naming the docs properly (for your needs) will also help when you process and access information through more structured means, i.e. systematics (where you don't know of the information), as opposed to knowledge-based (where you know of the information).
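
Here is a small Python sketch of the placeholder idea, in case it helps. The field order follows the naming pattern from my earlier post; the underscore delimiter, the field names and both function names are assumptions made up for this example, so pick whatever delimiter never occurs in your titles.

```python
import os

# Field order from the naming pattern in the earlier post.
FIELDS = ("author", "title", "description", "publisher", "pub_date")
DELIMITER = "_"      # assumption: a character that never appears inside a field
PLACEHOLDER = "NA"   # stands in for any field that is not known


def build_name(extension, **known):
    """Build a filename, writing the placeholder for every unknown field."""
    parts = [str(known.get(field) or PLACEHOLDER) for field in FIELDS]
    return DELIMITER.join(parts) + "." + extension


def parse_name(filename):
    """Split a filename back into its fields; placeholders become None.

    Returns None when the name does not have exactly one value per field,
    i.e. when the file was never renamed to the convention.
    """
    stem, _ext = os.path.splitext(filename)
    parts = stem.split(DELIMITER)
    if len(parts) != len(FIELDS):
        return None
    return {f: (None if v == PLACEHOLDER else v) for f, v in zip(FIELDS, parts)}
```

Because every field keeps its position, a name with unknown parts still parses the same way as a complete one, which is what makes a later conversion of the flat list into a database straightforward.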

You are correct that the Google Desktop Beta does not index PDF; it is my understanding that they are working on this feature. Microsoft will also release a similar search/indexer through MSN very soon.

I am aware of other solutions that are a tad more complex to set up but easy and free to use, and a few that are enterprise-robust and costly.

Regards,


0
 
LVL 1

Expert Comment

by:jodyglidden
ID: 12617965
Actually, ISYS has many versions and the one that I mentioned ISYS desktop is not enterprise.   ISYS desktop is perfect for what you're doing.
0
 

Expert Comment

by:mchorgh
ID: 12618113
I stand corrected, if it is so. Thanks.

However, if my recollection serves me well, ISYS Desktop 6.0 retails just under $600. I realize that the actual purchase price could be less. ISYS:Hindsite 6.0 for web cache cataloging is free-use.
0
 
LVL 1

Expert Comment

by:jodyglidden
ID: 12618174
Well, it may very well be for a corporation although they offer better pricing for different groups like educational, shareware developers, etc.  It is the best at it though.  

For a good basic PDF search tool, blinkx isn't bad. Desktop.google.com is also free and is going to support PDF very soon, in addition to the formats it handles now.
0
 
LVL 4

Author Comment

by:alikoank
ID: 12622515
Thank you very much, guys. I will give ISYS Desktop and docsearcher a try this weekend.
0
 
LVL 4

Expert Comment

by:rubiconx
ID: 12624894
There is an excellent utility that not only indexes PDF's, but many other types of files that you may have stored on your computer.

http://www.copernic.com/en/products/desktop-search/index.html

The Copernic Desktop Search tool is amazing.  Knocks spots off everything else out there - everything under $500 anyway!

Try it - it's FREE, and you won't be disappointed.

NOTE: The docs need to be available for the indexing to work.
0
 
LVL 11

Expert Comment

by:turn123
ID: 12639592
Hi alikoank :-),
Since we haven't heard from you for a couple of days could you please give us an update on the status of this question?
See:  http://www.experts-exchange.com/help.jsp#hi51 Thank you, turn123's friendly update request script.
Offtopic comments about this script to http://www.experts-exchange.com/Applications/Q_21188389.html please :-).
0
 
LVL 4

Author Comment

by:alikoank
ID: 12653539
Hi,

I haven't found time to try the programs yet, but I think I will use ISYS Desktop, at least until it expires.

thank you for your help.
0
