I am working on a SharePoint farm with multiple app servers and multiple WFEs. One of my app servers is dedicated to search, and one of my WFEs is dedicated to just being crawled for content.
I have a number of web applications. One is a multi-tenant site. It isn't partitioned for search purposes, however, and has a single crawler that picks up all of the content.
It was set up so that the single content source for the crawl uses a start address of https://localhost. In the hosts file on the app server that the search service runs on, localhost was pointed to the IP of the WFE set aside for crawling, instead of 127.0.0.1. So when the crawl starts on the app server it sees localhost, but the hosts file modification actually directs it to the IP of the crawl WFE.
This recently caused problems with another service on that app server which needed the localhost entry to be the true 127.0.0.1 loopback.
I decided to change the address to resolve the other issue and to follow best practices, and figured a similar trick would work. I added an entry to the hosts file on the app server that pointed to the same WFE IP, under a made-up name: searchcrawlwfe. I then changed the content source start address to point to https://searchcrawlwfe instead of https://localhost. This seemed to me to be a simple enough substitution.
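To illustrate, the hosts file change on the search app server looks roughly like the sketch below. The IP address 192.0.2.10 is a placeholder I am using here, not the real WFE address, and I am assuming the localhost line has been put back to the normal loopback:

```text
# Before: localhost redirected to the crawl WFE
192.0.2.10    localhost

# After: localhost restored, new made-up name pointing at the same WFE
127.0.0.1     localhost
192.0.2.10    searchcrawlwfe
```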
Yet, the search crawler fails with error: "Access is denied. Verify that either the Default Content Access Account has access to this repository, or add a crawl rule to crawl this repository. If the repository being crawled is a SharePoint repository, verify that the account you are using has “Full Read” permissions on the SharePoint Web Application being crawled. (Error from SharePoint site: HttpStatusCode Unauthorized The request failed with HTTP status 401: Unauthorized.)"
Nothing else changed. It is still the same account that was crawling successfully before, with the same settings. I confirmed that the loopback check was still disabled (the disableLoopback setting), as it was before. The option to ignore SSL warnings was still set to Yes, as it was before. All I did was change the start address.
Why would I get that error by moving from a crawler start address of https://localhost to https://searchcrawlwfe?
The hosts file was doing the same thing for both, as both localhost and searchcrawlwfe point to the same WFE IP. I tried a few other variations, such as pointing to https://<COMPUTER NAME> and even https://<IP ADDRESS>. While all three of those addresses resolved in a web browser to the same location on the WFE that https://localhost did, none of the three worked for the search crawler.
How can I properly use a non-localhost start address for this content source?