Gunwant Saini

IIS Pool Crashing

Hello Experts,

We are facing a peculiar problem with IIS.

First let me explain the scenario.

We host an ASP.NET site built on .NET 4, running on IIS 8 on Windows Server 2012, with SQL Server 2012 as the database.

The website/portal is hosted on 4 servers behind a hardware load balancer.

It's a simple website: the user fills in a form and uploads some documents, including a photo/image, which are saved in the DB as byte arrays.

It normally works properly, but for the past few days the IIS app pool has been crashing on its own from time to time, and we don't know whether this is related to the code or to something else. We have verified the code and it works properly, but we are still unable to diagnose what or where exactly the problem is.

Restarting IIS with iisreset or recycling the pool resets the website, after which it works properly again.

We have tried DebugDiag and WinDbg, but have been unable to identify the root cause of the crash.

The crash occurs randomly.

On the local development server the website works properly, but on the production machines the problem occurs at random intervals.

Please help and suggest.

Thanks
Patrick Bogers

What's in Event Viewer when this occurs?

Did you create unique application pools, or is everything running in the default one?
Gunwant Saini

ASKER

Hello,

Actually, the website/portal is running in its own unique pool, in InProc mode by default.

I also checked the event logs at the time of the crash. The entries logged are:

"A worker process '3324' serving application pool 'DefaultAppPool' failed to stop a listener channel for protocol 'http' in the allotted time.  The data field contains the error number." (Event ID 5138)

And then this one:

"A process serving application pool 'DefaultAppPool' exceeded time limits during shut down. The process id was '3324'." (Event ID 5013)

The error number that the data field is supposed to contain is not present either.

I also tried DebugDiag and WinDbg, but was unable to extract meaningful information about the crash.

I know that the problem is in the code, but I am unable to find the code that causes the crash.

Please help.
Hi

The errors concern the DefaultAppPool; running a site in it is not best practice.
If this is by choice, please check that the DefaultAppPool is configured to use .NET 4
(in IIS 7 it is configured to use .NET 2.0 by default).

A better practice is to create a new application pool named after the website, which makes recognition and troubleshooting easier.
For each application under this website you would create another app pool.

Example:
www.website.com would use app pool www.website.com created for .NET4
www.website.com\portal would use app pool www.website.com.portal created for .NET4
www.website.com\portal\V2 would use app pool www.website.com.portal.V2 created for .NET4

If you keep to this principle, you will see it becomes a lot easier to troubleshoot code failures.
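
The naming convention in the example above can be written down as a small helper. This is only an illustrative Python sketch: the function name `app_pool_name` is hypothetical and is not part of IIS or any Microsoft tool. It simply turns the path separators into dots, as the three example lines do.

```python
def app_pool_name(site_path: str) -> str:
    """Derive an app pool name from a site/application path.

    Mirrors the convention in the example: path separators become dots,
    so every application gets a pool named after where it lives.
    """
    # Accept both backslash and forward-slash separators and drop empty parts.
    parts = [p for p in site_path.replace("/", "\\").split("\\") if p]
    return ".".join(parts)


for path in (r"www.website.com", r"www.website.com\portal", r"www.website.com\portal\V2"):
    print(path, "->", app_pool_name(path))
```

With names built this way, an event log entry that mentions a pool points directly at one application instead of at a shared pool.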

BTW: how did you configure DebugDiag? Crash only, or did you create rules?
Hi,

Correct, we have verified that: it is set to .NET 4. As I understand it, after IIS is installed on Windows Server 2012 the pool is configured for .NET 4 by default.

We have created a separate application pool per website, but the problem is that this "DefaultAppPool" pool is crashing frequently.

We have tried going through the logs, but are unable to identify the root cause.

Sometimes the pool gives a 503 error; sometimes it hangs, i.e. the site becomes unresponsive.

Thanks.
Hi again,

OK, in IIS 8 the standard pools are created for .NET 4, lesson learned.

So DefaultAppPool is bound to a website. Under IIS -> Application Pools you can see which website it is bound to, correct?

How did you configure DebugDiag? Crash only, or did you create rules?
Hi,

Only 1 website is configured in the DefaultAppPool.

In DebugDiag, I created rules to capture the crashes. As a result, DebugDiag created lots of files in the configured folder, but they led nowhere: they are small files of around 250 MB each, which I don't think are the crash dumps for IIS.

Please let us know how to use DebugDiag to track an IIS app pool crash or hang (the site becoming unresponsive) so that the problem can be analyzed properly.
Hi,

Here are also some of the Event Viewer entries logged on the production servers.

1.      A process serving application pool 'DefaultAppPool' suffered a fatal communication error with the Windows Process Activation Service. The process id was '1064'. The data field contains the error number. : Event ID 5011
2.      Application pool 'DefaultAppPool' is being automatically disabled due to a series of failures in the process(es) serving that application pool. : Event ID 5002
3.      A process serving application pool 'DefaultAppPool' failed to respond to a ping. The process id was '9120'. : Event ID 5010
4.      A process serving application pool 'DefaultAppPool' exceeded time limits during shut down. The process id was '1880'. : Event ID 5013
5.      A worker process '1880' serving application pool 'DefaultAppPool' failed to stop a listener channel for protocol 'http' in the allotted time.  The data field contains the error number. : Event ID 5138
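
To see which pool, process, and failure type dominate, entries like the five above can be tallied. The following is a minimal Python sketch, assuming the event descriptions have been exported from Event Viewer as plain text lines in the same "description : Event ID N" shape quoted in this thread. The function name `tally_events` and that export shape are assumptions for illustration, not an IIS or WAS API.

```python
import re
from collections import Counter

# Patterns follow the wording of the entries quoted in this thread.
EVENT_ID = re.compile(r"Event\s*ID\s*:?\s*(\d+)", re.IGNORECASE)
POOL = re.compile(r"application pool '([^']+)'", re.IGNORECASE)
PID = re.compile(r"process(?: id was)? '(\d+)'", re.IGNORECASE)


def tally_events(lines):
    """Count (pool, event id) pairs and collect the process ids seen per event id."""
    counts = Counter()
    pids = {}
    for line in lines:
        event = EVENT_ID.search(line)
        pool = POOL.search(line)
        if not event or not pool:
            continue  # not an entry in the expected shape
        event_id = int(event.group(1))
        counts[(pool.group(1), event_id)] += 1
        pid = PID.search(line)
        if pid:
            pids.setdefault(event_id, set()).add(int(pid.group(1)))
    return counts, pids


# The five entries listed above, copied as plain text.
sample = [
    "A process serving application pool 'DefaultAppPool' suffered a fatal communication error "
    "with the Windows Process Activation Service. The process id was '1064'. : Event ID 5011",
    "Application pool 'DefaultAppPool' is being automatically disabled due to a series of "
    "failures in the process(es) serving that application pool. : Event ID 5002",
    "A process serving application pool 'DefaultAppPool' failed to respond to a ping. "
    "The process id was '9120'. : Event ID 5010",
    "A process serving application pool 'DefaultAppPool' exceeded time limits during shut down. "
    "The process id was '1880'. : Event ID 5013",
    "A worker process '1880' serving application pool 'DefaultAppPool' failed to stop a listener "
    "channel for protocol 'http' in the allotted time. : Event ID 5138",
]

counts, pids = tally_events(sample)
for (pool_name, event_id), n in sorted(counts.items()):
    print(pool_name, event_id, n, sorted(pids.get(event_id, [])))
```

Run over a longer export, the same tally shows whether the ping failures (5010) or the shutdown timeouts (5013, 5138) come first, and whether the same process id appears in both.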
Hi

The steps Microsoft provides for creating rules and for creating a dump file are documented in this Microsoft link.
Sir,

I have done all of that and gone through all the links, but I still don't know what exactly I am missing. I am unable to accurately identify the root cause of the pool crash and hang.

Please suggest.
Sir,

Please let me know why identifying and resolving issues with IIS is so complex. Nowadays it is crashing frequently, and I am unable to make progress on resolving the issue.

Please help.
Hi again,

I know troubleshooting application pools can be hard. Therefore it is best practice not to use the DefaultAppPool, but to create a unique application pool for each website and for the applications below it. This way you can examine the logs more accurately.

So find out which site/application is using the DefaultAppPool and give it a unique pool.
Sir,

Please believe me, I have done that too. A screenshot is attached for your reference.
AppPool.png
Hi

Under DefaultAppPool it says the number is 1. Which application is it bound to?
Sir,

The one that is crashing: the application where the problem is, and the one I am trying to debug for the pool crash.

Thanks for your time.
Hi

I had to re-read this thread, and now I wonder: does this happen on all 4 servers or just one?
Sir,

All of them, randomly and all of a sudden: on Server1, Server2, Server3, or Server4.

Then we either recycle the app pool or run iisreset, which gets the site running normally again.
Hi again,

OK, so it is an application issue for sure. Can you export some errors from Event Viewer from around the last time this happened and post them here?

If you are able, I would also look at the IIS website logs around that time to check whether it might be one particular call that is failing.

Of course you could obfuscate private info in the logs.
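
One way to check the IIS website logs for a single failing call is to group the entries by URL. The sketch below is a rough Python example, assuming the site logs in the W3C extended format with the `cs-uri-stem`, `sc-status`, and `time-taken` fields enabled (`time-taken` is in milliseconds). The function name `summarize_iis_log` and the sample rows, including the page names, are invented for illustration.

```python
from collections import defaultdict

REQUIRED = {"cs-uri-stem", "sc-status", "time-taken"}


def summarize_iis_log(lines):
    """Group W3C-format entries by URL: hits, 5xx count, and mean time-taken in ms."""
    fields = []
    stats = defaultdict(lambda: {"hits": 0, "errors_5xx": 0, "total_ms": 0})
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # This directive names the columns of the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed rows
        row = dict(zip(fields, values))
        if not REQUIRED.issubset(row) or not row["time-taken"].isdigit():
            continue  # needed fields are not being logged
        entry = stats[row["cs-uri-stem"]]
        entry["hits"] += 1
        if row["sc-status"].startswith("5"):
            entry["errors_5xx"] += 1
        entry["total_ms"] += int(row["time-taken"])
    return {
        url: {
            "hits": s["hits"],
            "errors_5xx": s["errors_5xx"],
            "mean_ms": s["total_ms"] / s["hits"],
        }
        for url, s in stats.items()
    }


# Invented sample rows; real logs carry more fields.
sample_log = [
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2014-01-01 10:00:00 GET /Default.aspx 200 120",
    "2014-01-01 10:00:01 POST /Upload.aspx 500 30000",
    "2014-01-01 10:00:02 POST /Upload.aspx 200 9000",
]

summary = summarize_iis_log(sample_log)
for url, s in sorted(summary.items(), key=lambda kv: kv[1]["mean_ms"], reverse=True):
    print(url, s)
```

Sorting by mean time or by 5xx count around the time of a crash narrows the search to a handful of pages before any dump analysis is needed.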
Sir,

The logs from Event Viewer are already posted above. There's no private info there. I am desperate to resolve this issue, which may or may not originate in the code.

Here are some of the logs again:

1.      A process serving application pool 'DefaultAppPool' suffered a fatal communication error with the Windows Process Activation Service. The process id was '1064'. The data field contains the error number. : Event ID 5011
2.      Application pool 'DefaultAppPool' is being automatically disabled due to a series of failures in the process(es) serving that application pool. : Event ID 5002
3.      A process serving application pool 'DefaultAppPool' failed to respond to a ping. The process id was '9120'. : Event ID 5010
4.      A process serving application pool 'DefaultAppPool' exceeded time limits during shut down. The process id was '1880'. : Event ID 5013
5.      A worker process '1880' serving application pool 'DefaultAppPool' failed to stop a listener channel for protocol 'http' in the allotted time.  The data field contains the error number. : Event ID 5138
Ok,

One thing you can try is ADPlus. How to get it and how to work with it is described here.

Also, are you able to show the output from WinDbg?
Sir,

I tried that too: I installed everything and followed all the steps. What actually happens, as I have posted here, is that sometimes the whole of IIS goes down: the pool is running but the site does not respond, i.e. it becomes unresponsive, and then ADPlus yields nothing. It did create dumps, but those do not provide information about the CLR stack etc.
So if you feed those dumps to WinDbg, isn't there any lead?

Next, are there signs of memory leaks in the error logs (besides what you posted here)?
Hi,
Currently I don't have that: I deleted the file where I had written down the call stack from the memory dumps, so I have to regenerate it.

BTW, how do I identify memory leaks through the error logs? I have searched Google, but have not come across a way to identify memory leaks.

Please suggest.
Hi

You would filter on the service, e.g. W3SVC, and look for entries like this:

Event Type: Information
Event Source: W3SVC
Event Category: None
Event ID: 1077
Date: Date
Time: Time
User: N/A
Computer: ComputerName
Description:
 A worker process with process id of '1234' serving application pool 'DefaultAppPool' has requested a recycle because it reached its virtual memory limit.

Did you have your website validated? Perhaps this link to validator.w3.org can help if there is an obvious mistake in the code.
Sir,

I have checked for Event ID 1077 in Event Viewer on one of the servers. There is no such entry, and the text "its virtual memory limit" is not logged either.

What does that signify?

We have not validated the site, but the issues do not crop up on the development and test servers.

Thanks
Hi

No memory leak is good. It points me in the direction of wicked SQL queries. Test is not showing any signs of bad coding, but it is also not heavily tested. Or did you actually perform a realistic load test? (I guess not, so you haven't seen a DoS due to bad SQL queries.)
In other words, if one query takes 2 seconds to execute and it is kicked off 3x per second, what will eventually happen?
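
The arithmetic behind that question: with queries starting 3 times per second and each taking 2 seconds, about 3 x 2 = 6 queries are running at any moment (Little's law). If the database can only run fewer than that at once, the waiting queue grows for as long as the load lasts. A rough Python sketch of this, using a simple steady-rate approximation; the capacity of 4 concurrent queries is a made-up figure for illustration, not a measurement from this site.

```python
def backlog_after(seconds, arrivals_per_sec, query_secs, max_concurrent):
    """Approximate number of queries still waiting after `seconds`, for a server
    that can run `max_concurrent` queries at once, each taking `query_secs`."""
    completions_per_sec = max_concurrent / query_secs  # throughput ceiling
    growth = max(0.0, arrivals_per_sec - completions_per_sec)
    return growth * seconds


in_flight = 3 * 2  # Little's law: arrival rate x time per query
print(in_flight)  # 6 queries running at once in steady state

# Assumed capacity of 4 concurrent queries: 2 finish per second, 3 arrive.
print(backlog_after(60, arrivals_per_sec=3, query_secs=2, max_concurrent=4))  # 60.0
```

Under that assumed capacity the queue grows by one query every second, so after a minute 60 requests are waiting, each holding a worker thread in IIS.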

Two things you can do. First, have a tool like WAPT simulate, in the test environment, the same number of users you serve in production (on one server, that is) and monitor the SQL behaviour (run a live trace with Activity Monitor). *Pretty sure it will break.*

Second, monitor your production database and look for execution plans that are way too expensive, and also watch SQL Server's Activity Monitor.

For me the conclusion (for now) is that you have a website which, under load, effectively mounts a DoS attack on your own database, under which it cracks.
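
The kind of load test suggested above can be approximated with a few lines of code before reaching for a dedicated tool. This is a minimal Python sketch using only the standard library. `run_load` and `fake_request` are hypothetical names, and `fake_request` is a stand-in that just sleeps; in a real test it would be replaced by a call to a page on the test server.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(request_fn, users, requests_per_user):
    """Call request_fn from `users` concurrent threads; return sorted latencies in seconds."""
    def one_user(_user_index):
        latencies = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            request_fn()
            latencies.append(time.perf_counter() - start)
        return latencies

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    return sorted(t for user in per_user for t in user)


def fake_request():
    # Stand-in for a real request, e.g. urllib.request.urlopen(test_url).read(),
    # where test_url would be a page on the test server.
    time.sleep(0.01)


latencies = run_load(fake_request, users=5, requests_per_user=4)
print(len(latencies), "requests, slowest", round(latencies[-1], 3), "s")
```

Raising `users` step by step while watching the slowest latencies and the database's Activity Monitor shows the load level at which response times start to climb.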
Hello Sir,

Agreed. For the past 2-3 days we have been spending time optimizing the SQL queries, and we have improved the performance. Since then, though we don't know exactly why, the crashes have suddenly become less frequent than before.

Still, I have tried certain steps using cdb.exe to generate IIS hang dumps. The dumps are created, but how do I then identify the block of code where the problem is?

I have opened the dumps in WinDbg, and all of them point to ntdll.dll as the place where the crash/issue occurs. Now I am confused as to whether the problem is in IIS or in ntdll.dll, i.e. the Windows OS.

Thanks for your time.
Hi again,

So after profiling, performance has become quicker and steadier; you are on the right trail.
About ntdll.dll: I tend not to worry, as it is a user-mode DLL that tends to generate a lot of false positives. In the cases where it was correctly flagged, I saw out-of-memory errors.

Summing up: if IIS is generating too much noise on the database, there is only one way: profile, profile, profile until you have sanitised all rogue queries.

If you now monitor SQL usage in Activity Monitor, it will show you the most expensive queries, from which you can determine which application is issuing them.
Hello Sir,

BTW, please let me know the steps we can follow to identify the exact cause of the problem, even if we think the queries executing on the DB server have been optimized.

How do we identify what is slowing down the page responses on IIS, even after applying all the configurations?

I mean, how do we identify the function and call stack, why the page is slow, etc.?

I am done with DebugDiag, WinDbg, and ADPlus, as each requires its own expertise.

Please suggest.
mwenenko

My gut feeling is that the issue is not on the IIS side; your SQL transactions are taking a long time. Please check the SQL logs, locks, etc.
ASKER CERTIFIED SOLUTION
Gunwant Saini

We used various methods, such as reducing code complexity and rectifying the logic, to resolve the pool crash. It was also somewhat related to the DB.