Server losing network connection randomly - reboot required to reactivate.
Posted on 2009-05-13
Ok, we are having an issue with one of our new servers.
It is a DL380 G5 - 1 quad-core CPU with 4GB memory running Windows Server 2003 SP2. We built this machine specifically for one application within our company, which required SQL Server 2005. The server has 2 NICs, which we teamed and connected to 1 subnet.
After I had installed Windows Server 2003 + SP2 we left the machine sitting there for a month or so and we had no problems - then SQL Server 2005 SP2 was installed and again - 1 month with no problems.
Since then, we have installed the application it was designed to support, which involves an amount of data being imported into the SQL database. Either during the actual import of data, the SQL verification or the SQL DB backup (as far as we can tell) we lose network connection to the machine completely, and I have to connect via iLO 2 and reboot to fix it. This generally occurs once every day or so.
When I connect to the machine via iLO, the NIC appears connected, but the server cannot contact anything on the domain other than its own IP (10.22.20.x and 127.0.0.1). I tried restarting the Network service, but this fails.
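To pin down how far the connection still reaches the next time it drops, something like the rough Python sketch below could be run from the iLO console (assuming a Python 3 interpreter is available on the server). The target labels, the 192.0.2.x placeholder addresses and the function names are all made up for illustration and are not our real configuration.

```python
import subprocess
from typing import Callable, Dict

# Probe order, from nearest to farthest. These labels are illustrative only.
ORDER = ["loopback", "own_ip", "gateway", "remote"]


def ping_once(address: str) -> bool:
    """Send one echo request using the Windows ping flags (-n count, -w timeout in ms).

    Caveat: Windows ping can exit with code 0 on a "Destination host
    unreachable" reply, so treat a True result as a hint rather than proof.
    """
    result = subprocess.run(
        ["ping", "-n", "1", "-w", "1000", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def classify_scope(targets: Dict[str, str], probe: Callable[[str], bool]) -> str:
    """Return the farthest label reached before the first failed probe, or "none"."""
    reached = "none"
    for name in ORDER:
        if name not in targets or not probe(targets[name]):
            break
        reached = name
    return reached


# Placeholder addresses from the documentation range, not real hosts.
EXAMPLE_TARGETS = {
    "loopback": "127.0.0.1",
    "own_ip": "192.0.2.10",
    "gateway": "192.0.2.1",
    "remote": "192.0.2.200",
}
```

Calling classify_scope(EXAMPLE_TARGETS, ping_once) with the real addresses filled in would return "own_ip" if the server can still reach only itself, which is what we see by hand today.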
To diagnose this, we have changed the IP, the port on the router and the network cables. We then broke the team, disabled one NIC and ran off the other - the same occurred. We reversed that situation and the issue still occurred. We had HP replace the motherboard, as the NICs are onboard, and the issue still occurs. There is nothing at all in the event log, and nothing in the SQL Server log beyond entries showing that the network connection dropped a number of users' connections. We have also upgraded the NIC drivers and firmware with no results as yet, and we are really starting to run out of ideas.
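Since we can only guess which phase (import, verification or backup) triggers the drop, one option would be to log reachability samples and match the start of each outage against the job schedule. Below is a minimal Python sketch of that matching step. The phase names, the times in the usage note and the function names are assumptions for illustration, not our actual schedule.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Sample = Tuple[datetime, bool]            # (time of check, was the server reachable?)
Window = Tuple[str, datetime, datetime]   # (phase name, start, end)


def outage_starts(samples: List[Sample]) -> List[datetime]:
    """Return the timestamps where reachability flips from up to down."""
    starts = []
    previous_ok = True  # assume the server was reachable before the first sample
    for timestamp, ok in samples:
        if previous_ok and not ok:
            starts.append(timestamp)
        previous_ok = ok
    return starts


def phase_at(timestamp: datetime, windows: List[Window]) -> Optional[str]:
    """Return the first phase whose [start, end) window contains the timestamp."""
    for name, begin, end in windows:
        if begin <= timestamp < end:
            return name
    return None
```

For example, with a hypothetical backup window of 03:00 to 04:00 and samples that go from reachable at 03:10 to unreachable at 03:15, outage_starts would report 03:15 and phase_at would return "backup" for it. A few days of such matches should show whether the drops really cluster around one phase.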