"Service Unavailable" for all ASP pages, but all html pages OK (intermittent)

Greetings

We have quite a serious problem with our public web server. It is an SBS 2003 box without Exchange and with SQL 2000. The problem has recurred intermittently for 6 months, but only around once per month until this week, when it started to occur several times per day and therefore became serious.

The problem is that all browser clients get an HTTP 503 "Service Unavailable" error for all ASP pages. However, all HTML pages are served OK. Stopping and starting the WWW Publishing Service fixes the problem. There are no event log entries from W3WP.

Thanks in advance
Rob
longrobAsked:
 
GranModCommented:
PAQed with points refunded (500)

GranMod
Community Support Moderator
 
thefritterfatboyCommented:
This has to do with your ApplicationPool crashing out. Have you set up auto recycle on the application pool?

If your server hosts multiple sites, put each site in its own ApplicationPool and see which site seems to be causing the problem the most.
 
longrobAuthor Commented:
Thanks for the reply !

The server does host multiple sites, and I have put them into their own ApplicationPools. However, I don't know how to determine which one(s) are causing the problem.

The pools are recycling every 1740 minutes, which I believe is the default. And this is the only check box that is selected on the Recycling tab (Recycle worker process in minutes).
 
thefritterfatboyCommented:
>>However, I don't know how to determine which one(s) are causing the problem.<<

Hopefully, only the sites affected will receive the "Service Unavailable" message in future.

The Windows Event Viewer should show where it all goes wrong. There are usually a few "Warnings" about the application pool followed by an "Error" stating that Windows has shut down the application pool due to multiple errors.
 
longrobAuthor Commented:
>Hopefully, only the sites affected will receive the "Service Unavailable" message in future.

Ahh, right ! :)

As for the Event Viewer, there is nothing at all in either the System or Application logs concerning the application pools. There are no entries at all from the W3WP, and the only warnings and errors from ASP are EventID 5 and 9 (a function expected a parameter that was not provided; and an object of unknown datatype was encountered).

Thanks again !
 
longrobAuthor Commented:
OK, a new update. The problem occurred just now. There are no event log entries concerning it from before the problem occurred. I have identified which website is causing it. When I stopped the WWW Publishing Service I got this error in the System log:
EventID 1013
A process serving application pool 'APMainSite' exceeded time limits during shut down. The process id was '4668'.
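
Once each site has its own pool, events like the one above can be tallied per pool to see which pools misbehave most often. Below is a minimal Python sketch of that idea. It assumes the event IDs and descriptions have already been exported from the System log by some means; the sample list is hypothetical apart from its first entry, which is the event quoted above.

```python
import re
from collections import Counter

# Hypothetical sample of (event ID, description) pairs. Only the first entry
# is taken from this thread; the others are made up for illustration.
SAMPLE_EVENTS = [
    (1013, "A process serving application pool 'APMainSite' exceeded time "
           "limits during shut down. The process id was '4668'."),
    (1013, "A process serving application pool 'APOtherSite' exceeded time "
           "limits during shut down. The process id was '1234'."),
    (1013, "A process serving application pool 'APMainSite' exceeded time "
           "limits during shut down. The process id was '5120'."),
]

# The pool name appears in single quotes after the words "application pool".
POOL_PATTERN = re.compile(r"application pool '([^']+)'")


def count_events_by_pool(events, event_id=1013):
    """Count how often each application pool is named in events with event_id."""
    counts = Counter()
    for eid, description in events:
        if eid != event_id:
            continue
        match = POOL_PATTERN.search(description)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for pool, total in count_events_by_pool(SAMPLE_EVENTS).most_common():
        print(f"{pool}: {total}")
```

The same function can be pointed at EventID 1074 by passing event_id=1074, assuming those descriptions name the pool in the same quoted form.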

 
dnojcdCommented:
hope this article from MS can help you
http://support.microsoft.com/?id=821268
 
thefritterfatboyCommented:
Are you using ASP or ASP.Net? It sounds like a page has a problem in the code. (Infinite loop or too many redirects perhaps) When Windows tried to shut down the worker process, it waited for the thread executing to finish. When it didn't finish, it forced the process to shut down and wrote that entry to the event log.
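
The shutdown behaviour described above can be illustrated with a small Python sketch. It is only an analogy, not how IIS is implemented: a supervisor asks a worker to stop, waits up to a time limit, and reports a timeout if the worker is still busy. The function names and the 0.2 second limit are made up for the illustration, and unlike Windows the sketch only reports the stuck worker rather than killing it.

```python
import threading


def well_behaved(stop_event):
    """Worker that finishes as soon as it is asked to stop."""
    stop_event.wait()


def stuck(stop_event):
    """Worker that never looks at stop_event, standing in for an infinite loop."""
    threading.Event().wait()  # blocks forever


def shut_down_worker(worker, stop_event, time_limit_seconds):
    """Ask the worker to stop and wait up to time_limit_seconds for it to finish."""
    stop_event.set()
    worker.join(timeout=time_limit_seconds)
    if worker.is_alive():
        return "exceeded time limits during shut down"
    return "shut down cleanly"


def run_demo(target, time_limit_seconds=0.2):
    """Start one worker thread and immediately try to shut it down."""
    stop_event = threading.Event()
    # daemon=True so a stuck worker does not keep the interpreter alive
    worker = threading.Thread(target=target, args=(stop_event,), daemon=True)
    worker.start()
    return shut_down_worker(worker, stop_event, time_limit_seconds)


if __name__ == "__main__":
    print("well_behaved:", run_demo(well_behaved))
    print("stuck:", run_demo(stuck))
```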
 
longrobAuthor Commented:
It's legacy ASP. The code has remained unchanged for almost 4 years. This problem only occurred for the first time around 6 months ago, and until this week it had only occurred 5 times. This week it has occurred 4 times. I can't imagine that it would be a problem in the code, and if it is, I can't imagine how to track it down...?
 
thefritterfatboyCommented:
Are there any configuration differences between sites? If not, it's likely to be a code problem. (The problematic code may not have caused a problem before but Windows Updates have a habit of showing up problems that weren't problems before! Some people call this "breaking", but apparently that's not the case with MS! ;) )

I believe you can even split off your app pools on a folder-by-folder basis if not a file-by-file basis. (I'm not near an IIS 6.0 box to test this so I may be talking rubbish!) - If you can do it, try isolating scripts / folders to see which ones are causing problems.
 
longrobAuthor Commented:
No, there are no configuration differences between the sites, but the problem has just recurred and this time it's on a different site in a different application pool. Once again there is the 1013 event when the w3wp is stopped.

 
thefritterfatboyCommented:
And there's no "EventID 1074" in your event viewer at all?
 
longrobAuthor Commented:
Right, there is no EventID 1074 at all
 
longrobAuthor Commented:
Well, this is very bizarre, but an EventID 1074 has just appeared in the System log. There was one previously, around a week ago.

But currently all sites are working fine......including the one in the app pool referred to in the event.

I believe I know what caused this event: I loaded a page on our site (code has not changed for a long time) and it simply timed out in the browser, with no error message, and when I reloaded it, it worked OK. As far as I can tell, this occurred at the same time as the Event 1074.
 
thefritterfatboyCommented:
Most script timeouts are handled by the recycler. You may have something a little more sinister in the script. The event 1074 is more of a warning. After a few of these Windows shuts down the pool and the 1013 kicks in.

We set our recycling up to recycle more often and we haven't seen the error since. (We're still looking for the culprit script or DLL.)
 
longrobAuthor Commented:
Hmm, that's very interesting - how often do you have the auto-recycle run ? Is there a downside to reducing the periodicity ?
 
thefritterfatboyCommented:
Have a look at my setup:

http://www.docupro.co.uk/permafiles/5F9BF4AE-3622-4BF3-BBB6-790D6A88D43D.gif

This is a pretty large site (nothing to do with the URL posted) so the memory usage is high.
 
longrobAuthor Commented:
OK, we have set all the app pools to recycle every 6 hours and put 384MB/128MB limits on the memory usage for the processes.
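
For anyone scripting the same settings rather than using the Recycling tab, here is a hedged Python sketch that converts them into the units the IIS 6 metabase is generally documented to use (minutes for the recycle interval, kilobytes for the memory limits) and prints adsutil.vbs commands. The property names PeriodicRestartTime, PeriodicRestartMemory and PeriodicRestartPrivateMemory are given from memory and should be checked against the IIS 6 metabase reference before use. Treating 384MB as the virtual memory limit and 128MB as the used memory limit is an assumption, since the post does not say which is which.

```python
def to_metabase_values(recycle_hours, max_virtual_mb, max_used_mb):
    """Convert UI-style settings into the units assumed for the metabase."""
    return {
        "PeriodicRestartTime": recycle_hours * 60,           # minutes
        "PeriodicRestartMemory": max_virtual_mb * 1024,      # kilobytes (virtual)
        "PeriodicRestartPrivateMemory": max_used_mb * 1024,  # kilobytes (used)
    }


def adsutil_commands(pool_name, values):
    """Build one adsutil.vbs 'set' command per property for the given pool."""
    return [
        f"cscript adsutil.vbs set W3SVC/AppPools/{pool_name}/{name} {value}"
        for name, value in values.items()
    ]


if __name__ == "__main__":
    # 6 hours, 384MB and 128MB come from the post above; the pool name is
    # the one from the earlier 1013 event.
    settings = to_metabase_values(6, 384, 128)
    for command in adsutil_commands("APMainSite", settings):
        print(command)
```

The script only prints the commands; nothing is changed on the server until they are run by hand.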

Now, after 3 days of having all the sites in their own app pools, I have the following to report: several of the app pools always crash when they are recycled (with EventID 1013), and most of them do so occasionally. Most bizarre of all, we have 2 sites that contain HTML only (one of them has only 1 very small page) and the app pools for both of them crashed! (I realise that we don't need app pools for them.)

I have also noticed a lot of these events
The description for Event ID ( 54 ) in Source ( HTTP ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: \Device\Http\AppPool.

According to this
http://www.webservertalk.com/message240325.html
it is not something to worry about, but the fact that it seems to be to do with application pools does concern me

AND, this weekend we have begun having problems of "insufficient resources" when opening IIS manager and other snap ins ! There is plenty of memory and cpu.

AND we are getting these errors occasionally:
Microsoft OLE DB Provider for SQL Server error '80004005'
[DBNETLIB][ConnectionRead (recv()).]General network error. Check your network documentation.
/LM/W3SVC/406220273/Root/english/global.asa, line 84

That line of code is in session_onstart() and simply runs a very innocuous stored procedure. The database is on the same machine, so it's hardly likely to be a network error....

I realise that the last 2 items are not IIS issues, but I can't help thinking that it could be related

 
thefritterfatboyCommented:
>>General network error. Check your network documentation. <<

That can usually be due to an incorrect DSN setting. If your connection is a DSN connection, try making it DSNless. If your connection is DSNless - make it reference the server using "." rather than the server name / IP address.
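
To make the DSNless suggestion concrete, here is a small Python sketch that assembles an OLE DB connection string of the kind classic ASP passes to ADODB.Connection.Open, with "." as the server. SQLOLEDB is the provider name for the Microsoft OLE DB Provider for SQL Server; the database name MainSiteDb and the choice of integrated security are assumptions made for the example.

```python
def build_connection_string(database, server=".", trusted=True,
                            user=None, password=None):
    """Assemble a DSNless OLE DB connection string for SQL Server."""
    parts = [
        "Provider=SQLOLEDB",
        f"Data Source={server}",        # "." means the local machine
        f"Initial Catalog={database}",
    ]
    if trusted:
        parts.append("Integrated Security=SSPI")
    else:
        parts.append(f"User ID={user}")
        parts.append(f"Password={password}")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    # MainSiteDb is a hypothetical database name
    print(build_connection_string("MainSiteDb"))
```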

>>AND, this weekend we have begun having problems of "insufficient resources" when opening IIS manager and other snap ins ! There is plenty of memory and cpu<<

Definitely smells of bad code to me. Either objects not being destroyed or a few infinite loops. (missing recordset.movenexts / datareader.reads)

Again - the EventID 54 says that an HTTP thread is not responding when Windows wants it to shut down. It smells like an infinite loop again.
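
To show what a missing MoveNext does, here is a Python sketch using a made-up stand-in for an ADO recordset. FakeRecordset and its methods are hypothetical and only mimic the EOF / MoveNext / Close shape. The loop ends only because MoveNext advances the cursor on every pass, and the finally block releases the object even if processing fails.

```python
class FakeRecordset:
    """Hypothetical stand-in for an ADO recordset; not a real library class."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._position = 0
        self.closed = False

    @property
    def EOF(self):
        return self._position >= len(self._rows)

    def current(self):
        return self._rows[self._position]

    def MoveNext(self):
        self._position += 1

    def Close(self):
        self.closed = True


def read_all(recordset):
    """Collect every row, always advancing the cursor and always closing."""
    rows = []
    try:
        while not recordset.EOF:
            rows.append(recordset.current())
            recordset.MoveNext()  # leave this out and EOF never becomes True
    finally:
        recordset.Close()         # release the object whatever happens
    return rows


if __name__ == "__main__":
    rs = FakeRecordset(["first", "second", "third"])
    print(read_all(rs), "closed:", rs.closed)
```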
 
longrobAuthor Commented:
Can problematic code in one AppPool cause a different AppPool to crash? We have App Pools containing only HTML that are crashing, as well as one that contains only 1 ASP program; that program has almost no conditional statements and is run tens of thousands of times every day (but is crashing only once or twice per day). We have some big sites that have heaps of code, and to be honest I find it highly improbable that even those could have code that executes so rarely that it could cause these intermittent problems.
 
thefritterfatboyCommented:
>>We have App Pools containing only html that are crashing<<

I'm sorry, I was under the impression that your HTML pages were unaffected. Have things changed? Are your HTML pages now not serving?
 
longrobAuthor Commented:
Sorry for the confusion. Originally, when the problem was occurring, all the HTML pages were working and all the ASP pages were not; that was with all sites in 1 app pool. But we have now set up lots of app pools for different sites and also some app pools for particular parts of certain sites. By accident we made 2 app pools for sites that are HTML-only, and both of those app pools also crashed. Maybe crashed isn't the right word, but we get event 1013s when those app pools recycle, although the pages are still served OK.
 
thefritterfatboyCommented:
I'm afraid I'm all out of ideas. Not sure how the App Pool can crash on HTML-only sites. I wonder what would happen if recycling was turned off altogether?

(Maybe it'd run CPU + memory through the roof so may not be wise on a production server!)
 
longrobAuthor Commented:
Well, thanks for your advice anyway - I'll be sure to post a follow-up if I make some progress. At this point I'm considering reinstalling Windows if we can't get some improvement soon - I'd happily go back to the situation of this occurring only once a month!
 
longrobAuthor Commented:
Latest update - all our app pools go down occasionally, including the ones containing HTML only. It's occurring maybe once per day now.

Interestingly, the MSSharePointAppPool goes down as well, which it never used to do, and we have never touched - it contains 3 sites - <SPS Admin>, <Admin Home>/vti_bin and <websitename>/_vti_bin
 
longrobAuthor Commented:
We are also still getting these errors intermittently
Microsoft OLE DB Provider for SQL Server error '80004005'
[DBNETLIB][ConnectionRead (recv()).]General network error. Check your network documentation.
The connection is DSNless and we've tried referring to the server by IP address and by "."
 
longrobAuthor Commented:
Sorry for all the postings, but we are also now getting lots of these errors in the System log
EventID1120 Source W3SVC
The World Wide Web Publishing Service failed to obtain cache counters from HTTP.SYS.  The reported performance counters do not include performance counters from HTTP.SYS for this gathering.  The data field contains the error number.
 
longrobAuthor Commented:
and we are also now getting lots of these errors, despite there being lots of RAM and CPU
EventID 10000 source DCOM
Unable to start a DCOM Server: {73E709EA-5D93-4B2E-BBB0-99B7938DA9E4}. The error:
"Insufficient system resources exist to complete the requested service. "
Happened while starting this command:
C:\WINDOWS\system32\wbem\wmiprvse.exe -Embedding
 
longrobAuthor Commented:
Hello !

An update:

We installed SP1 for SBS on Friday, and when it rebooted we got a blue screen of death (for the first time ever). To cut a long story short, we had to rebuild the server and restore from backups.

The problem is no longer occurring.

A rather drastic solution, and one that was forced on us, but that's what happened.

Regards
Rob
 
thefritterfatboyCommented:
Ouch, painful!

Fingers crossed the problem doesn't rear up again!
Question has a verified solution.