matrix0511

asked on

Having Major Problems Starting WebSphere Services

We have two web servers: CTH-OWHS4 and CTH-OWHS1. Each runs a single production web server instance. CTH-OWHS4 has only that one production instance.

CTH-OWHS1, however, has a production instance and two development instances on it. Yesterday we stopped and started the two development instances. But when we went to log in to the app via the web page, it gave a bunch of WebSphere errors. See attachments.

We use the JD Edwards application for our web access, with WebSphere 6.0.2.21 as the web server engine.

We run them on Windows 2003 servers.

We have IBM HTTP Server installed on both servers.

See the attached logs as well. Any help would be greatly appreciated. Right now the plan is to reboot CTH-OWHS1. But we are all worried that after the reboot the PD instance might not come back up, just like the other two. If it's just a memory issue, or something is hung, then a reboot should fix it. But if a file really is missing or corrupted, we will have serious problems. This is a 24/7 shop, so any kind of downtime is bad.
JAS-ISSUE.zip
9-3-2010-9-32-43-AM-screen-print.jpg
9-3-2010-9-35-18-AM-WAS-error.jpg
native-stderr.log
allen-davis

Has anything changed recently on the development instances? Just from your stack traces, it looks like maybe a) the java process has not shut down cleanly, since the server start logs say that an instance may already be running on the specified port, and/or b) the startup arguments or classpath of the node have been altered, or c) the deployment files for the application in your dev instance have been altered or deleted.
I would try this:
1) compare the startup settings and classpath for the development nodes to each other and production and make sure they all match if they're expected to match.
2) do a netstat on the box and see if the server is listening on the port for the nodes that you thought were in a down state.  Maybe the java process is just still running but in a 'zombie' state.
3) confirm that the deployment files are in the folders where you expect them to be and have the right file permissions.
My *opinion* is that I agree it would not be a good idea to restart anything else until you can determine and resolve what is wrong with the dev instances.
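For step 2, on Windows `netstat -ano` lists each socket together with the PID of the process that owns it, and that PID can be matched against the PID column in Task Manager. Here is a minimal Python sketch of reading that output, offered as an illustration rather than a definitive tool. The sample text, PIDs, and port list are invented for the example. 8880 and 9080 are WebSphere's usual defaults for the SOAP connector and HTTP transport, but with three instances on one box the real ports will differ, so substitute the values from each server's configuration.

```python
# Hypothetical sample of `netstat -ano` output; addresses and PIDs are invented.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING       1480
  TCP    0.0.0.0:8880           0.0.0.0:0              LISTENING       2316
  TCP    0.0.0.0:9080           0.0.0.0:0              LISTENING       2316
  TCP    10.0.0.5:9080          10.0.0.9:51234         ESTABLISHED     2316
  UDP    0.0.0.0:445            *:*                                    4
"""


def parse_listening(netstat_text):
    """Return {port: pid} for every TCP socket in the LISTENING state."""
    listeners = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State, PID
        if len(fields) != 5:
            continue
        proto, local, _foreign, state, pid = fields
        if proto.upper() != "TCP" or state.upper() != "LISTENING":
            continue
        # The port is whatever follows the last colon (also works for [::]:9080)
        port = local.rsplit(":", 1)[-1]
        if port.isdigit() and pid.isdigit():
            listeners[int(port)] = int(pid)
    return listeners


def who_owns(ports, netstat_text):
    """Map each port of interest to its owning PID, or None if nothing listens."""
    listeners = parse_listening(netstat_text)
    return {port: listeners.get(port) for port in ports}


# Assumed ports for illustration only; use the ones configured for each instance.
for port, pid in sorted(who_owns([8880, 9080, 9081], SAMPLE).items()):
    if pid is None:
        print("port {0}: nothing listening".format(port))
    else:
        print("port {0}: held by PID {1}".format(port, pid))
```

To use it against the real server, save the output of `netstat -ano` to a text file and pass its contents to `who_owns` in place of `SAMPLE`. A port that is still held by a PID after the instance was supposedly stopped would point to a java process that never exited.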
Has the Java runtime been upgraded automatically? It appears that the upgrade (or version change) has caused the application to give a classdef error.
The servlet class below, or a class it depends on, is missing from the path. It seems a jar this class depends on is missing, removed, or corrupted.

com.jdedwards.runtime.virtual.servlet.loginservlet
matrix0511

ASKER

Question for you guys. You know how each WAS service has its own "java.exe" process that runs in Task Manager? Well, I have 5 or so because I have 3 different instances running: two services for development and one for production.


Well, I agree. I suspect that one or two of those java processes are zombies or hung. But how can I tell which of those processes is which? For example, how do I tell which one is for development and which is for production? Is there a command that I can run to tell?

If so, I could then just kill that specific process, right?
9-7-2010-8-11-16-AM.jpg
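One way to match a java.exe PID to an instance: WebSphere Application Server normally writes a `<serverName>.pid` file into each server's log directory, at `<profile_root>/logs/<serverName>/<serverName>.pid`. The Python sketch below collects those files so the PIDs can be compared with Task Manager. It assumes the standard profile layout, and the function name and the example path are illustrative, not taken from this environment. A PID file can also be left behind by a process that crashed, so treat the result as a hint and confirm that the PID is actually running before killing anything.

```python
import glob
import os


def read_server_pids(profiles_root):
    """Return {(profile, server): pid} from <profile>/logs/<server>/<server>.pid files."""
    pids = {}
    pattern = os.path.join(profiles_root, "*", "logs", "*", "*.pid")
    for path in glob.glob(pattern):
        # Relative path looks like: <profile>/logs/<server>/<server>.pid
        parts = os.path.relpath(path, profiles_root).split(os.sep)
        profile = parts[0]
        server = os.path.splitext(parts[-1])[0]
        with open(path) as handle:
            text = handle.read().strip()
        # Skip empty or malformed files rather than guessing
        if text.isdigit():
            pids[(profile, server)] = int(text)
    return pids


def describe(pids):
    """Format the mapping as printable lines, one per server."""
    return [
        "{0}/{1}: PID {2}".format(profile, server, pid)
        for (profile, server), pid in sorted(pids.items())
    ]
```

For example, `print("\n".join(describe(read_server_pids(r"C:\Program Files\IBM\WebSphere\AppServer\profiles"))))` would list one line per server, assuming the profiles live under that default install path.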
ASKER CERTIFIED SOLUTION
calboronster
This solution is only available to members.
Oh. I didn't know that the "serverstatus" command shows PID info. Great. Thanks!
allen-davis, you ask some really good questions. But since I'm not a WAS expert, I'm not 100% sure how to check some of the things you suggest.

See my questions for each of your suggestions below.


1) compare the startup settings and classpath for the development nodes to each other and production and make sure they all match if they're expected to match.
Where do I go to check the startup and classpath settings?

2) do a netstat on the box and see if the server is listening on the port for the nodes that you thought were in a down state.  Maybe the java process is just still running but in a 'zombie' state.
Again, what specific netstat command should I use, and where can I verify that the port it shows matches the correct port for the node? And what kind of port is that? Is it the SOAP port defined in the WAS XML file?

3) confirm that the deployment files are in the folders where you expect them to be and have the right file permissions.
What deployment files?


btw, this issue was resolved when I removed the entire DEV instance from the Windows Services app, reinstalled the service from the command line, and then started the services from the command line. Once that works for the users, I go back and add the service back to the Windows Services app. However, yesterday it broke again. Users got the exact same errors when they tried to pull up the application in the web browser. I removed the service again as I described, and it started working again.

What is going on here? The only consistent thing is that it eventually gets resolved by removing the service and adding it back. But I have to remove it and add it back from the command line first. Then, once I confirm it works again, I add it back to the Services app. That seems to be the only sequence that works. If I remove it and then add the service back to the Windows Services app BEFORE starting the app services from the command line, the service will still start, but users get errors. If I start from the command line first, it seems to do better.

Something is out of sync here.