Indexing Service crawling slowly through PDFs

Last Modified: 2012-06-27
I have a problem with Indexing Service under Windows 2003 Standard running extremely slowly. Queries against the catalog are made through a browser interface using ASP-scripted pages.

The files being indexed are all Acrobat (PDF) files, and the Indexing Service has PDF iFilter 6.0 installed. For quite some time, the system worked fine. There were separate catalogs for each of a growing number of directories, each of which holds roughly 2,500 to 15,000 documents in various subdirectories. Each main directory usually stops receiving new documents after a fixed period of time and then remains static, so I didn't notice a speed problem until recently, with the most recently created catalog. Normally, the Indexing Service could read and process a few hundred documents in a minute or less.

I had posted a handful of new documents to the most recent catalog and ran a search later that day, and the document I was searching for, which I knew contained the matching term, was not listed. I checked Indexing Service and it showed a few thousand files left to index; watching it off and on for half an hour, I saw it process only about ten of those.

Research showed that having multiple catalogs can slow down the service, especially on older versions of Windows (though reportedly not on Server 2003). Nonetheless, I redesigned the indexing to use one catalog for the parent directory that holds the document directories (each of which previously had its own catalog). I stopped Indexing Service, deleted the individual catalogs, restarted the server, and made sure Indexing Service restarted. It did, but it is just as slow slogging through the files to index.

On a whim, I started over, deleting the catalog again, and tried creating a catalog on just one of the oldest subdirectories, one that I knew had originally been indexed without trouble and that, with about 10,000 documents, should have taken no more than an hour to catalog. It's been over an hour and it has processed only about 125 of those documents. So it's not a question of one document choking it.
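To put rough numbers on that slowdown, here is a quick back-of-the-envelope sketch in Python. The function names are just illustrative, and the 60-minute window is my rounding of "over an hour", so the real rate is, if anything, a little lower than what this computes.

```python
def docs_per_minute(docs_done: int, minutes: float) -> float:
    """Average indexing throughput over an observation window."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return docs_done / minutes


def hours_remaining(total_docs: int, docs_done: int, rate_per_minute: float) -> float:
    """Estimated hours to finish the catalog at a constant rate."""
    if rate_per_minute <= 0:
        raise ValueError("rate_per_minute must be positive")
    return (total_docs - docs_done) / rate_per_minute / 60


# Figures from the post; the 60-minute window is an assumption ("over an hour").
TOTAL_DOCS = 10_000
observed = docs_per_minute(125, 60)          # what the service is doing now
expected = docs_per_minute(TOTAL_DOCS, 60)   # "no more than an hour" for the whole directory
slowdown = expected / observed
eta_hours = hours_remaining(TOTAL_DOCS, 125, observed)

print(f"observed: {observed:.2f} docs/min")
print(f"expected: {expected:.2f} docs/min")
print(f"slowdown: about {slowdown:.0f}x")
print(f"time left at the current rate: about {eta_hours:.0f} hours")
```

Under those assumptions the service is running roughly 80 times slower than it used to, which would mean days rather than an hour to finish this one directory.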

I've even tried adjusting the "tuning" settings to "Instant Indexing" and "Low Load" querying, neither of which seems to affect the speed.

Can anyone suggest a process or other factor that might be causing the Indexing Service to slow down so dramatically? No other services on the server seem to have a problem.