We've been having an issue with our database for the past week that has been making our sharepoint site virtually unavailable around the same time every night. I've exhausted everything I can think of that would be causing this, so I turn to you for help.
We're running a small sharepoint server farm, 3 servers. One MOSS 2007 Front-end, one SQL Server (2005 enterprise), and a third server to handle crawls and service searches. These are all VM's run through VMWare. All three servers are configured pretty much the same, with a few minor changes tailored to each servers role. The basics are this: 4 vCPU's, 16GB of RAM (except the crawl/search server which has 6), and Windows Server 2003 sp2 enterprise.
First, the we receive the error:
l schedulers on Node 0 appear deadlocked due to a large number of worker threads waiting on LCK_M_S. Process Utilization 1%.
This goes on for anywhere between 3-7 hours, until we get this error:
he instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance<c/> or to check for long-running transactions.
at which time the servers are restarted by our night crew. When the server comes back up it goes through the CHECKDB for the databases and begins a recovery of our content database.
Attached is a trace log from Sql Server Profiler. I'm baffled and don't know where to go from here. We've completely turned off the crawl/search functions, ran Database Engine Tuning Analyzer, set a memory limit (10GB) for the SQL engine, and anything else we could think of or could find online to no avail. Any ideas?