I have been using a product called dtSearch Desktop with Spider for more than 20 years: https://www.dtsearch.com
It is an extraordinarily good piece of software! In the rest of this article, I'll refer to it as just dtSearch, for simplicity, except where I mention editions other than Desktop with Spider.
Disclaimer: Before going any further, I want to emphasize that I have no affiliation with dtSearch Corp. and no financial interest in it whatsoever. I am simply a happy user/customer.
dtSearch has two main components: an Index Manager that builds indexes and a Search engine that searches those indexes — very quickly! Here's the main Index Manager dialog:
dtSearch has a large number of indexing options, which you can get a sense of from the Preferences dialog:
dtSearch has built-in viewers for most common file types (including PDF, discussed below), and it can also launch an external program automatically when the hit is in a file type for which it has no viewer. You control this per file type, so each file type can have its own action.
It has special handling for PDF files: you can view a PDF either in place (inside dtSearch) or in a separate instance of Adobe Reader, and hits are highlighted in both cases. To improve performance, there is also an option that tells dtSearch to open Adobe Reader automatically for PDF files. The point is that Adobe Reader runs embedded in dtSearch, and it opens PDF files much more quickly if Adobe Reader is already running separately when a PDF is opened in dtSearch.
Here's an example of search results. I made a test folder with some of my Experts Exchange articles in PDF format, indexed it, then did a search for "paperport":
As you can see, the top pane has header information on each document that matches the search criteria. The bottom pane shows the file that is selected in the top pane, with each search "hit" highlighted. The toolbar has user-friendly icons for navigation.
Speaking of PaperPort, I use dtSearch for indexing and searching rather than the All-in-One Search feature that is built into the latest versions of PaperPort. I even used dtSearch instead of the SimpleSearch feature built into earlier versions of PaperPort. In other words, I use PaperPort to create searchable PDF files, but use dtSearch, rather than PaperPort, to index and search those PDFs.
When dtSearch indexes documents that mix binary and text (such as PDF Searchable Image files created by scanning and OCR, as mentioned above with PaperPort), it has an option to filter out the binary. This makes the index much smaller than the indexes of other products, which also index the binary code (for no good reason). dtSearch has an interesting filtering algorithm that scans a binary file for anything that looks like text, using multiple encoding detection methods. The algorithm detects sequences of text in different encodings or formats and ignores the binary. This is perfect for PDF Searchable Image files created by OCR.
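To give a feel for what "scanning a binary file for anything that looks like text" can mean, here is a minimal Python sketch. It is not dtSearch's algorithm, which is not public as far as I know. It simply assumes two encodings (plain ASCII and UTF-16LE) and a minimum run length of four characters, and the function name `extract_text_runs` is made up for illustration.

```python
import re

# Assumption: a "text-like" run is 4 or more printable ASCII characters.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# The same characters stored as UTF-16LE (each one followed by a zero byte).
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")


def extract_text_runs(data):
    """Return the text-like runs found in data, in file order.

    Bytes that are not part of a run are simply ignored, which is the
    idea behind filtering the binary out before indexing.
    """
    found = []
    for match in ASCII_RUN.finditer(data):
        found.append((match.start(), match.group().decode("ascii")))
    for match in UTF16_RUN.finditer(data):
        found.append((match.start(), match.group().decode("utf-16-le")))
    found.sort()  # order by position in the file
    return [text for _, text in found]


if __name__ == "__main__":
    sample = (b"\x00\x01\xffPaperPort scan\x02\x03"
              + "index".encode("utf-16-le") + b"\xfe\xfd")
    print(extract_text_runs(sample))
```

Only the recovered runs would be passed to the indexer, so the binary bytes never reach the index.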
It has extensive search options, including stemming, phonic, fuzzy, wildcards (*, ?, and =), proximity (within 5 and within 25), synonym, any words, all words, Boolean, and exact/specific phrases. Here's the Search Request dialog:
As you can see in the Search dialog, it also has More Search Options:
Note, too, in the screenshot above that there is a Search History tab that retains your most recent 100 searches, making it extremely easy to re-run a previous, complex search.
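For readers curious about how an index makes wildcard and proximity searches fast, here is a rough Python sketch of the general technique: a positional inverted index. It is an illustration only, not dtSearch's implementation or its query syntax. The names `build_index`, `expand`, and `within` are hypothetical, and the sketch handles only the * and ? wildcards.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase


def build_index(docs):
    """Map each lowercased word to {document name: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for name, text in docs.items():
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word][name].append(pos)
    return index


def expand(index, pattern):
    """Return the indexed words that match a pattern with * and ? wildcards."""
    return [word for word in index if fnmatchcase(word, pattern.lower())]


def within(index, first, second, distance):
    """Return names of documents in which the two terms occur within
    `distance` words of each other, in either order."""
    hits = set()
    for w1 in expand(index, first):
        for w2 in expand(index, second):
            # Only documents that contain both words can match.
            for name in index[w1].keys() & index[w2].keys():
                if any(p1 != p2 and abs(p1 - p2) <= distance
                       for p1 in index[w1][name]
                       for p2 in index[w2][name]):
                    hits.add(name)
    return sorted(hits)


if __name__ == "__main__":
    docs = {
        "a.txt": "PaperPort creates searchable PDF files",
        "b.txt": "PaperPort is a scanning program and many other "
                 "words precede search",
    }
    idx = build_index(docs)
    print(within(idx, "paperport", "search*", 5))
    print(within(idx, "paperport", "search*", 25))
```

Because the index stores word positions, a "within 5" query only compares short lists of numbers and never has to reopen the documents. That is one reason indexed search can be so much faster than scanning files.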
Here's a link at the dtSearch site to a very interesting discussion of some features:
Indexing and Searching Features of Special Interest to Forensics Users
dtSearch uses the Windows Task Scheduler to update indexes. I currently have more than 50 indexes and have configured a subset of them to update every day in the wee hours. Of course, you may update the indexes as frequently or infrequently as you want, and you may specify which ones get updated. If some data is static, there's no need to update its index. You may have any number of indexes, each of which may index any number of folders/files, and a search may run against one or more of the indexes. I often build an index on the fly for a folder and its subfolders that I want to search, since indexing is very fast (as is searching).
The capabilities go on and on, but at $199 USD, it is not an inexpensive product. Whether it is worth that depends on how important search is to you. In my opinion, it is worth every penny: you get what you pay for. But if that's too much money, three good search tools for around $50 USD are Copernic, FileLocator Pro, and X1.
If you want a free product, Windows Search gets better with each release. It is built into Vista and all Windows releases after that. Also, the folks who make FileLocator Pro (Mythicsoft) offer a free "lite" version of it called Agent Ransack.
Note that the $199 price is specifically for dtSearch Desktop with Spider. Other editions, such as dtSearch Network with Spider and dtSearch Web with Spider, have different pricing, which is listed on the dtSearch site.
A final comment about the high initial cost of dtSearch. One positive point is their approach to technical support and product updates. Their online store page says, "Technical support and product updates are free for a minimum of one year with all purchases." The "minimum of one year" statement is vague and there is no fee mentioned, so I wrote to dtSearch Corp. asking for a clarification of the policy. Here's what they wrote back (with permission to share the answer publicly):
----- Begin dtSearch Corp. response -----
I appreciate your email, and sorry for the confusion!
Our setup licenses provide for a minimum of one year of support and upgrades on all licenses. That said, we have provided support and upgrades at no charge since Year 2000 for all end-user Desktop/Network licenses (!). Because of the higher average cost of developer support, we have been charging annually for developer (Web/Engine/Publish) upgrades and support, but again not Desktop/Network upgrades and support.
I can't always guarantee that this will be the case until the end of time, but that's why you don't find any "upgrade charge" indicators for Desktop/Network on our site currently.
----- End dtSearch Corp. response -----
Amortized over a large number of years for technical support and software upgrades/updates, the $199 USD license fee becomes much more reasonable. dtSearch Corp. was careful to say in the response that they "can't always guarantee" no upgrade charge, but during the 20+ years that I've been using dtSearch, I've received technical support and product upgrades on a continuous basis and have never paid anything beyond the initial license fee. So, it's a pretty good bet, if not a guarantee. By the way, as of this article's publication date, I'm running the latest release — Version 7.95, Build 8633, 64-bit, 24-Oct-2019 (although some of the screenshots in this article are from prior versions).
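To put the amortization point in numbers, here is a small Python calculation. It assumes the $199 fee is the only payment ever made, which matches my experience so far but is not something dtSearch Corp. guarantees. The helper name `cost_per_year` is just for illustration.

```python
LICENSE_FEE_USD = 199.00  # one-time Desktop with Spider price quoted above


def cost_per_year(fee, years):
    """Spread a one-time fee evenly over a number of years of use."""
    return round(fee / years, 2)


if __name__ == "__main__":
    for years in (1, 5, 10, 20):
        yearly = cost_per_year(LICENSE_FEE_USD, years)
        print(f"{years:>2} years of use: ${yearly:.2f} per year")
```

Over 20 years, the license works out to under $10 per year.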
If you find this article to be helpful, please click the thumbs-up icon below. This lets me know what is valuable for EE members and provides direction for future articles. Thanks very much! Regards, Joe