hongedit

Network issue - very very strange

Hi all

Today all of a sudden, there are about 4 servers out of 20+ that are refusing inbound connections.

From the problem machines, you can ping anything and everything. Internet works.

Apart from that, nothing can communicate with these servers! Can't ping, can't browse, can't RDP. It is acting very much like a firewall issue, except there is no firewall on these servers. I even uninstalled the AV to be sure.

Have done scans with Malwarebytes and Trend Online.

Looked through NIC properties, all looks ok. Changed switch ports, etc.

They are connected to a L3 managed switch and I can see the port that they are plugged into has RX/TX traffic as expected. TX is a bit on the high side in my opinion.

Feel free to request any specific info.

Any ideas? I'm out of them.
hongedit (ASKER)

Have updated NIC drivers also
What about the IP configuration of these servers compared to the other servers? And what about the configuration of the network itself?
cavp76

Event Viewer, anything strange there? Already rebooted the servers? What do they host/share?
Might have a look at the destination on that outbound traffic also.

Are you able to take those four boxes offline temporarily?

What OS?

What server roles?

What antivirus?

Did the firewall mysteriously start up?
Steven Carnahan
Don't forget to check the L3 switch for possible config changes. Look at it for traffic to/from those servers.
Have any new Windows updates or patches been installed recently?
Some more background:

The servers with this issue are in Liverpool. They are connected via MPLS to the other sites, one of which is London.

We did an office move in London at the weekend (merging 2 offices) so some re-routing was done on the switches but nothing on the Liverpool side - everything there is the same as it always has been.

The servers are running Windows Server 2008. All host different things - File, Print, Citrix, Exchange, DFS, etc.

L3 switch config has not been changed in some time. AV is FortiClient (was McAfee).

No updates recently.

Netstat shows not a lot of incoming traffic. Is there a tool to see where outbound connections are going to (IP, port)?
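
One way to answer the "where is outbound traffic going" question without extra tools is to summarise the output of `netstat -ano` by remote IP and port. The sketch below is only an illustration: the sample rows and addresses are made up, and it assumes the usual column layout that `netstat -ano` prints on Windows.

```python
import re
from collections import Counter

# Made-up sample in the layout `netstat -ano` prints on Windows (assumption).
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    10.0.1.20:49211        10.0.1.5:445           ESTABLISHED     4
  TCP    10.0.1.20:49230        10.0.2.9:135           ESTABLISHED     812
  TCP    10.0.1.20:49231        10.0.1.5:445           TIME_WAIT       0
  TCP    0.0.0.0:3389           0.0.0.0:0              LISTENING       1204
"""

# Proto, local endpoint, remote ip, remote port, state, pid.
LINE = re.compile(r"^\s*TCP\s+(\S+)\s+(\S+):(\d+)\s+(\S+)\s+(\d+)\s*$")


def remote_endpoints(netstat_text):
    """Count (remote_ip, remote_port) pairs for non-listening TCP rows."""
    counts = Counter()
    for line in netstat_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # header, UDP rows, blank lines
        _local, remote_ip, remote_port, state, _pid = match.groups()
        if state == "LISTENING":
            continue  # listening sockets have no real remote endpoint
        counts[(remote_ip, int(remote_port))] += 1
    return counts


if __name__ == "__main__":
    for (ip, port), n in remote_endpoints(SAMPLE).most_common():
        print(f"{ip}:{port}  connections={n}")
```

On a real server the text would come from something like `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout` instead of `SAMPLE`.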
We are also experiencing some very strange issues at the London site which, although unrelated at face value, may somehow be contributing to or affecting Liverpool.

There is a big question mark over the integrity of the core switches in London (HP ProCurves), there was/is a virus outbreak in London, and certain servers and Group Policy are acting up as well...

First I would really check out the equipment in London.

Second, did your outside IP address change for the servers in London after the move? It could be that the routes in the other locations need to be updated with the new IP.
No IP changes in London.

The office that they moved out of is still hosting the internet connection for the essential stuff (VPNs etc). The two are connected via a LES, so as far as the network is concerned it's just Layer 2, no different to before.

Using TCPView (Sysinternals), this server is just sending out local traffic. Nothing external. Weird.

Are the servers unreachable by name, IP address, or both?
If name only, check the DNS records for possible corruption and/or bad information.

hth!
:)
Unreachable by name or IP.

Not even pingable from the switch it is connected to.

When I ping the server by its own hostname I get replies from ::1 even though IPv6 is off. Does this have any significance?
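
On the ::1 question: a reply from ::1 generally just means the name resolved to the IPv6 loopback address, which is separate from the IPv6 binding on the NIC. A quick way to see what a name actually resolves to is to ask the resolver directly. This is a minimal sketch; `resolved_addresses` is a hypothetical helper name, and it only uses the standard `socket.getaddrinfo` call.

```python
import socket


def resolved_addresses(hostname):
    """Return sorted (family, address) pairs the local resolver gives for hostname."""
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    results = set()
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(hostname, None):
        # sockaddr[0] is the address for both IPv4 and IPv6 tuples.
        results.add((labels.get(family, str(family)), sockaddr[0]))
    return sorted(results)


if __name__ == "__main__":
    name = socket.gethostname()
    try:
        for family, address in resolved_addresses(name):
            print(name, family, address)
    except socket.gaierror as exc:
        print(f"{name} did not resolve: {exc}")
```

If the server's own name comes back with both an IPv4 address and ::1 (or a link-local IPv6 address), ping answering from ::1 is a matter of address preference rather than a sign of a fault.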
Can the 4 servers see each other or a workstation in the same physical location see those servers?

It sounds like the switch and the NIC are not communicating or they are using DHCP and are not getting an IP address.

Try running ipconfig /all on one of the servers and confirm it has an IP address.
Yep, all have static IP addresses.

Problem exists on local physical network.

I have asked someone to cross cable into the server NIC to see if it is pingable that way, to rule out the rest of the network.

They cannot ping each other through 3 different dumb L2 switches... I think it's safe to say this is not the network, it's something on the servers!
It does sound that way.

You mentioned updating NIC drivers. Did you do that to all the affected servers during the move?

paulms53 mentioned Windows updates. Were any performed during the move?

Was there any other significant change in hardware/software that took place at the same time as the move?
Hi

All NIC drivers have been updated on the problem servers. The servers in Liverpool which have this issue aren't really directly related to the move - to them, nothing has changed.

I checked Windows updates and the only thing to have happened on the Friday was an install of Windows Defender definition updates.

No changes of hardware of any kind.
Have someone plug a screen and a keyboard to the affected systems... what does the screen show?
When you ping one of those servers from itself, what kind of response times are you getting?

This may seem an odd question, so let me back up and provide some context.  I've been trying to Google your "::1:" results (not an easy task; Google doesn't much like those symbols!), and as you suspected almost everything I've found is pointing to IPv6.  But several of the pages I found were message board posts where someone was getting wildly erratic or even negative ping responses, which symptom was fixed by adding "/usepmtimer" to the boot.ini file.  I don't think that's likely to be your problem, but I figured it was worth mentioning.  ;)

Assuming the above tangent doesn't prove to be a solution for you, I would try this:
1.  Enable IPv6 on one or more of the affected servers, see if that fixes the problem.
2.  If that doesn't work, turn off IPv6 again, and consult this article to ensure that IPv6 is disabled.

I have also seen some rumors that Exchange 2007 needs IPv6, so that's something to consider as well.

Hope this dissertation yields some useful information ...
;)
<1ms is all the replies.

Tried enabling IPv6, and also tried disabling it via the registry (and rebooted). It now replies using v4, but still no luck with anything external connecting in.

I have just been told that when this first happened, the tech on site found the Authentication box ticked under NIC Properties (802.1x, where you can drop down and choose Smart Card or EAP).

This was obviously disabled but still no luck.

Thank you for all your suggestions, keep them coming... there has to be something missed.
No chance you have duplicate IP addresses running around the network?
Or hostnames....
Don't see why we would. These servers are on a different subnet to the "move" office so I doubt that anyone used an IP address by mistake.

Host names are pretty hard to duplicate by mistake too.
Well, just tried changing its IP to one verified as being free and it still doesn't work :(
So:

Affected server(s) can ping itself but nothing else. Nothing can ping the server(s).

Using an L2 switch can a workstation in the same subnet ping the server?
Can the server ping the workstation?
Are you pinging by IP or Host Name?

@pony10usL  beat me to it...
Using an L2 switch can a workstation in the same subnet ping the server?
Can the server ping the workstation?
Are you pinging by IP or Host Name?

No
No
Both

The server acts the same whether it's on the real network or on a dumb switch.
Okay, everything has been eliminated except the server itself. Since the only update was to Windows Defender, have you tried turning that off?

Also, make sure the gateway and mask are correct.

In your original question you said: "From the problem machines, you can ping anything and everything. Internet works". But then you say you can't ping another machine even on an L2 switch. This part confuses me.

I have tried disabling Windows Defender.

IP details are 100% correct.

To clarify:

On the problematic machines, they can seemingly operate as normal in terms of outbound network connections. They can ping the rest of the network, access the internet, etc.

But nothing can access them at all. It's exactly as if there is a firewall rejecting all inbound connections, but we know this isn't the case.

There appears to be a contradiction here.

In post 35188136 you answered "No" about the server being able to ping a workstation on the same subnet. Now you say in post 35190470 that it CAN ping the rest of the network.

Apologies, got myself in a muddle. Last response is the correct one!
So let me try to put this together again.

1. Servers in London were moved
2. Servers in Liverpool are the ones that have an issue (4 of them)
3. Servers in Liverpool can ping all devices and access everything EXCEPT the servers in London
4. Nothing can access the servers in Liverpool
5. Have tried isolating the network in Liverpool by connecting them and some workstations to an L2 switch
6. The only updates were to Windows Defender signature files
7. Have updated the NIC drivers

Now my questions
1. Are there only 4 servers in Liverpool and ALL of them have the issue?
2. Were there any changes to IP addresses in London?
3. Where is the DNS server located?
4. Have you tried clearing the ARP table on the servers in Liverpool?
5. Can the servers in Liverpool ping or tracert the servers in London? (what is the result)
6. Can the servers in London ping or tracert the servers in Liverpool? (what is the result)
I like your style!

1. Yes
2. Today a server in London also has the same issue
3. Incorrect. Problem servers can even ping to London (and other sites they have). Just nothing can access them.
4. Nothing can access the problem servers. There are other servers in Liverpool with no issue, and they can be accessed fine.
5. Have patched all problem servers into L2 switch, all on same subnet, and the issue remains - they can ping out, but nothing can ping in.
6. Yes
7. Yes

1. No
2. No - additional IPs added, but no existing IPs changed.
3. Each site has their own DC(s) which also hosts the DNS
4. Not on the servers - I have cleared the ARP on the switches though.
5+6. Yes and No, depending on which servers...

Problematic servers can ping/trace everything, on any site as expected
No device can ping the problematic servers from anywhere.

Pings and Tracerts FROM problem servers TO anything work fine
Pings and Tracerts FROM anything TO problem servers time out.
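
Since ICMP alone can mislead, the "nothing gets in" symptom can also be checked per TCP port from a working machine. The sketch below is an illustration, not something run in this thread: `probe` is a hypothetical helper, and the ports (445, 3389, 80) are just the SMB, RDP and HTTP services mentioned above. A timeout suggests packets are being silently dropped, while a refusal means the host answered but nothing was listening.

```python
import socket
import sys


def probe(host, port, timeout=2.0):
    """Try a TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: typical of a filter dropping inbound packets.
        return "filtered-or-unreachable"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but no listener on this port.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Usage: python probe.py <server-ip>   (the target is supplied by the caller)
    if len(sys.argv) > 1:
        for port in (445, 3389, 80):
            print(sys.argv[1], port, probe(sys.argv[1], port))
```

Run against one of the problem servers, every port reporting "filtered-or-unreachable" would match the timeouts described above and point at something dropping inbound traffic on the server itself.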
This screams software firewall on the problem servers.

Can we uninstall defender on server and check services to make sure the windows firewall is not operational?
I have to agree with wwakefield that this is software related. Something on the servers has to have changed.

One thing that I don't seem to recall being covered here is what OS version is on the problematic servers? We keep trying to ping them but if they have ICMP turned off they will not respond.
I agree, this is exactly what it feels like, but there are no visible firewalls on!

Windows Firewall is off, and disabled from starting in Services. Defender is disabled.

OS is Windows Server 2008 R2 x64. Using ping is just a way to test connectivity here... if the rest of the inbound connections worked then it wouldn't be as much of an issue.

Can't ping, trace, browse to (\\server or http), etc.

But yes, I agree, it's almost as if it is just blanket denying all inbound packets.

One of the newly affected machines in London today was fresh out of the box. Literally had been on the network for 2-3 hours... then the same thing happened to it.

The only 2 things I think it could be are:

1. A very intelligent virus/malware program that is blocking all inbound traffic like a firewall, but is undetectable by Malwarebytes/Trend Online/McAfee scans
2. Something is flying round the network and damaging the servers to act like this...but what?

Going to email techs to try repairing the winsock and tcp stacks now.
I don't know if this will help but is the RPC service running on the servers?  

http://askbobrankin.com/rpc_server_unavailable.html
Do you have teaming configured on the servers?
Do you have secondary IP addresses configured on the servers?
Do you have VLAN trunking configured on the servers?
Fixed....

Re-enabled the Windows Firewall service, but kept Windows Firewall off.

Don't know how or why. We disabled the service after some servers kept mysteriously re-enabling their firewalls by themselves.
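
For anyone checking their own servers for this state, the service status and the per-profile firewall state are two separate things, and Microsoft's guidance is to turn the firewall profiles off rather than stop the service. The sketch below compares the two. It is only an illustration: the sample texts are hand-written in the layout that `sc query MpsSvc` and `netsh advfirewall show allprofiles state` print (an assumption), and `diagnose` is a hypothetical helper.

```python
import re

# Hand-written samples in the usual layout of the two commands (assumption).
SC_SAMPLE = """\
SERVICE_NAME: MpsSvc
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
"""

NETSH_SAMPLE = """\
Domain Profile Settings:
----------------------------------------------------------------------
State                                 OFF

Private Profile Settings:
----------------------------------------------------------------------
State                                 OFF

Public Profile Settings:
----------------------------------------------------------------------
State                                 OFF
Ok.
"""


def service_state(sc_text):
    """Extract the state word (RUNNING, STOPPED, ...) from `sc query` output."""
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_text, re.MULTILINE)
    return match.group(1) if match else None


def profile_states(netsh_text):
    """Map each profile name to ON or OFF from `netsh advfirewall` output."""
    states, profile = {}, None
    for line in netsh_text.splitlines():
        header = re.match(r"^(\w+) Profile Settings:", line)
        if header:
            profile = header.group(1)
            continue
        state = re.match(r"^State\s+(ON|OFF)\s*$", line)
        if state and profile:
            states[profile] = state.group(1)
    return states


def diagnose(sc_text, netsh_text):
    """Summarise the combination of service state and profile states."""
    if service_state(sc_text) != "RUNNING":
        return "service not running: inbound filtering may still apply"
    profiles = profile_states(netsh_text)
    if profiles and all(value == "OFF" for value in profiles.values()):
        return "service running, all profiles off"
    return "service running, at least one profile on or unknown"


if __name__ == "__main__":
    print(diagnose(SC_SAMPLE, NETSH_SAMPLE))
```

The first result ("service not running") corresponds to the broken state described in this post; the fix reported here moves the servers to "service running, all profiles off".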
Glad to hear that. So it did all come back to the firewall. It had to be on the server(s) based on everything else that was tried.
I don't get why the Windows Firewall service has to be enabled though?

Isn't that a bit backwards - disable the firewall and it firewalls the connection?!
ASKER CERTIFIED SOLUTION
Steven Carnahan
I've awarded points for your contribution on the matter and the explanation of why this happened.

Thanks to all.