knightee
asked on
SMB Issues affecting Windows 7 machines at Gigabit speed
Hi Experts,
This is my first post, so many many thanks in advance for any help with this :)
A little background... We have a customer running a network made up of 2 older Windows XP machines (both with Realtek 100Mbps cards) and 3 Windows 7 machines (1 x86 Home Premium, 2 x64 Pro, each with a Realtek PCIe Gigabit card). Each machine is connected to a Cisco SG300-10 switch, which is connected to a Cisco RV220W router. One of the XP machines acts as a server, hosting a SQL 2005 Express (SP4) database and the server application for a property management app installed as a client on each of the other four machines, along with a number of network shares (set up by another company some time ago to use authenticated users access rather than workgroup accounts, so anyone can access them from the network).
Around 8 weeks ago, they started to get application crashes and timeouts affecting each workstation, accompanied by very poor network performance affecting the Windows 7 boxes only. After investigating the database and app, we solved the actual crashes, which were due to the app being outdated patch-wise and the AUTO_CLOSE option being active on the database itself. We're now left with the performance issue, which leaves each of the 7 machines unable to use anything beyond 1-2% of their Gigabit links (though all the while, the XP machines at 100Mbps are running perfectly).
When the problem first occurred, I noticed that the Windows 7 boxes could no longer see the Windows XP machines, and the first couple of times a restart of the router restored everything. However, the network has degraded from performing well to almost useless more quickly each time, until these machines are now no longer able to operate at Gigabit speeds at all. Strangely, this occurs even though all machines are now visible (I DHCP reserved each machine and set each to NetBIOS enabled, instead of the NetBIOS/DHCP setting).
They're up and running with acceptable performance now as I've forced the ports to 100Mbps on the switch, which instantly gives them 10-25Mbps bandwidth when copying files from the server.
I've watched both the disk queue/% utilisation on the server and the network traffic on the workstation, and what traffic there is between them seems to come in small bursts, with huge delays in between. Back at 100Mbps, though, traffic and disk are operating normally...
What I've tried so far (have undone each change after it failed to help)...
- Update drivers from Realtek site on each of the Gb cards.
- Disable auto-negotiation both at the card and the switch.
- Ran the port/cable copper test from the switch, which picks up nothing.
- Full AV scan (each machine is using MSE, fully updated).
- Disable AV.
- Disable FW (Windows FW on each machine).
- Disable NIC flow control.
- Disable checksum and large send offloading (found an MS engineer's article relating to this exact problem, on the same network cards, but unfortunately, no change).
- Max values for send/receive buffers.
- Disable all power management features both on the NICs and the switch (left this change intact).
- Monitored server disk performance, checked Windows event logs for issues. Server machine seems in great health and performs nicely at the terminal or remotely.
- Disabled TCP autotuning on the 7 boxes. Tried this yesterday, and this instantly brought the network back to 25% of Gb bandwidth, but for the last time; haven't got above 1-2% since, even with this change undone.
EDIT - Forgot, I'd also tried disabling remote differential compression, also no improvement.
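For anyone retracing the autotuning step in the list above: on Windows 7 the setting is normally viewed and changed with `netsh interface tcp` from an elevated prompt. The sketch below only builds the command lines in Python and does not run them; the function name `autotuning_commands` is made up for illustration, and the thread does not say exactly how the change was applied.

```python
def autotuning_commands(level="disabled"):
    """Return netsh command lines to show, then set, TCP receive window autotuning.

    Hypothetical helper: it only builds argument lists, nothing is executed.
    """
    # Levels accepted by "netsh interface tcp set global autotuninglevel=..."
    allowed = {"disabled", "highlyrestricted", "restricted", "normal", "experimental"}
    if level not in allowed:
        raise ValueError("unknown autotuning level: " + level)
    return [
        ["netsh", "interface", "tcp", "show", "global"],
        ["netsh", "interface", "tcp", "set", "global", "autotuninglevel=" + level],
    ]


# Print the commands for disabling autotuning and for restoring the default.
for wanted in ("disabled", "normal"):
    for args in autotuning_commands(wanted):
        print(" ".join(args))
```

Passing `"normal"` gives the command that restores the Windows 7 default, which is the "undo" mentioned above.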
Please excuse the long drawn out descriptions, but I'm keen to not miss anything.
I've done a capture using netmon 3.4, filtered down to errors only, and found the exchange below (the test was navigating through a share and attempting to download an 86MB SQL .bak file to the 7 Home Premium machine). I'm coming to the end of my experience level, I'm afraid, and googling some of the entries hasn't produced anything that seems applicable. Has anyone come across this issue before? Additionally, if anyone knows of any good background documentation describing how this exchange should look, I'd be very keen to read it :)
Many thanks, Andy.
424 23:28:07 10/01/2013 4.9357572 213.120.234.2 7WORKSTATION DNS DNS:QueryId = 0x49E8, QUERY (Standard query), Response - Name Error {DNS:24, UDP:23, IPv4:22}
453 23:28:08 10/01/2013 5.2416769 System XPSERVER 7WORKSTATION SMB SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED {NbtSS:33, TCP:32, IPv4:30}
753 23:28:12 10/01/2013 9.6661761 System XPSERVER 7WORKSTATION SMB SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED {NbtSS:33, TCP:32, IPv4:30}
773 23:28:12 10/01/2013 9.6738393 System XPSERVER 7WORKSTATION SRVS SRVS:NetrShareGetInfo Response, Status=ERROR_SUCCESS {MSRPC:49, SMB:48, NbtSS:33, TCP:32, IPv4:30}
785 23:28:12 10/01/2013 9.6948848 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND {SMB:51, NbtSS:33, TCP:32, IPv4:30}
807 23:28:12 10/01/2013 9.8176568 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED {SMB:54, NbtSS:33, TCP:32, IPv4:30}
813 23:28:12 10/01/2013 9.8326293 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED {SMB:57, NbtSS:33, TCP:32, IPv4:30}
820 23:28:12 10/01/2013 9.8441916 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED {SMB:60, NbtSS:33, TCP:32, IPv4:30}
822 23:28:12 10/01/2013 9.8450788 System XPSERVER 7WORKSTATION SMB SMB:R; Transact2, Query File Info - NT Status: System - Error, Code = (8) STATUS_INVALID_HANDLE {SMB:61, NbtSS:33, TCP:32, IPv4:30}
824 23:28:12 10/01/2013 9.8450788 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED {SMB:62, NbtSS:33, TCP:32, IPv4:30}
828 23:28:12 10/01/2013 9.8458057 System XPSERVER 7WORKSTATION SMB SMB:R; Transact2, Query File Info - NT Status: System - Error, Code = (8) STATUS_INVALID_HANDLE {SMB:61, NbtSS:33, TCP:32, IPv4:30}
832 23:28:12 10/01/2013 9.8460398 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED {SMB:63, NbtSS:33, TCP:32, IPv4:30}
833 23:28:12 10/01/2013 9.8463740 System XPSERVER 7WORKSTATION SMB SMB:R; Transaction - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND {NbtSS:33, TCP:32, IPv4:30}
1436 23:28:20 10/01/2013 17.5308203 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND {SMB:69, NbtSS:33, TCP:32, IPv4:30}
1809 23:28:25 10/01/2013 22.3319204 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Transact, NT_TRANSACT_NOTIFY_CHANGE, FID = 0xC00B - NT Status: System - Error, Code = (288) STATUS_CANCELLED {SMB:72, NbtSS:33, TCP:32, IPv4:30}
2392 23:28:30 10/01/2013 27.7335113 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Transact, NT_TRANSACT_IOCTL, FID = 0xC006 - NT Status: System - Error, Code = (16) STATUS_INVALID_DEVICE_REQUEST {SMB:77, NbtSS:33, TCP:32, IPv4:30}
14014 23:30:27 10/01/2013 144.3685728 System XPSERVER 7WORKSTATION SMB SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED {NbtSS:110, TCP:109, IPv4:30}
14028 23:30:27 10/01/2013 144.3931205 System XPSERVER 7WORKSTATION SMB SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED {NbtSS:113, TCP:112, IPv4:30}
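A capture filtered to "errors only" is easier to read once the entries are grouped by NT status. As a rough sketch, the Python below tallies the status names in lines shaped like the ones above. The regular expression and the name `tally_statuses` are assumptions made for illustration, not part of netmon, and the sample is just four lines from the capture with the status names written in full.

```python
import re
from collections import Counter

# Matches e.g. "Code = (34) STATUS_ACCESS_DENIED" in a netmon summary line.
STATUS_RE = re.compile(r"Code = \((\d+)\)\s+(STATUS_[A-Z_]+)")


def tally_statuses(lines):
    """Count how often each NT status name appears in the given summary lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Four lines taken from the capture (frames 453, 785, 807 and 813).
sample = [
    "453 23:28:08 10/01/2013 5.2416769 System XPSERVER 7WORKSTATION SMB SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED {NbtSS:33, TCP:32, IPv4:30}",
    "785 23:28:12 10/01/2013 9.6948848 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND {SMB:51, NbtSS:33, TCP:32, IPv4:30}",
    "807 23:28:12 10/01/2013 9.8176568 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED {SMB:54, NbtSS:33, TCP:32, IPv4:30}",
    "813 23:28:12 10/01/2013 9.8326293 System XPSERVER 7WORKSTATION SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED {SMB:57, NbtSS:33, TCP:32, IPv4:30}",
]

for name, count in tally_statuses(sample).most_common():
    print(count, name)
```

Worth knowing when reading the tally: STATUS_MORE_PROCESSING_REQUIRED on a Session Setup response is a normal step in the NTLM handshake rather than a failure, so not every line in an errors-only filter points at a fault.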
Hi knightee,
Yes, it is recommended to disable SMB 2.0 in mixed OS environments if you are experiencing issues - I don't have any specific documentation to hand, but I have seen Sage KBs recommending this, for example.
So as long as you have Win 7 clients connecting to an XP machine then I would disable SMB 2.0 on any Win7 boxes.
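For reference, Microsoft's documented way to switch off the SMB 2.0 client on Windows 7 is a pair of `sc.exe` commands run from an elevated prompt, followed by a restart. The thread does not say which method was actually used, so treat this as a sketch under that assumption. The Python below only collects the command lines (the helper name `smb2_client_commands` is hypothetical) and includes the matching commands to turn SMB 2.0 back on.

```python
def smb2_client_commands(enable):
    """Return the sc.exe command lines that enable or disable the SMB 2.0 client.

    Hypothetical helper: it only returns strings, nothing is executed.
    A restart of the workstation is needed after running either pair.
    """
    if enable:
        return [
            "sc.exe config lanmanworkstation depend= bowser/mrxsmb10/mrxsmb20/nsi",
            "sc.exe config mrxsmb20 start= auto",
        ]
    return [
        # Drop the workstation service's dependency on the SMB 2.0 mini-redirector...
        "sc.exe config lanmanworkstation depend= bowser/mrxsmb10/nsi",
        # ...then stop that driver from loading at boot.
        "sc.exe config mrxsmb20 start= disabled",
    ]


for command in smb2_client_commands(enable=False):
    print(command)
```

The space after `depend=` and `start=` is required by `sc.exe`. Keeping the enable pair to hand makes the change easy to undo if it turns out not to help.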
Also, I wouldn't concern yourself too much with the percentage network utilisation figure - troubleshoot based on your application performance.
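To illustrate why the percentage figure can mislead, it helps to convert it back to an absolute rate. The quick Python sketch below uses the numbers from the question, taken at face value; the function name `utilisation_to_mbps` is made up for this example.

```python
def utilisation_to_mbps(link_mbps, percent):
    """Convert a link utilisation percentage into an absolute rate in Mbps."""
    return link_mbps * percent / 100.0


# 1-2% of a Gigabit (1000 Mbps) link, as reported for the Windows 7 machines.
gigabit_range = (utilisation_to_mbps(1000, 1), utilisation_to_mbps(1000, 2))
print(gigabit_range)  # (10.0, 20.0)
```

So 1-2% of a Gigabit link is 10-20 Mbps, which overlaps the 10-25 Mbps reported once the ports were forced to 100Mbps. The same absolute rate reads as a much smaller percentage on the faster link, which is one reason to judge by application performance instead.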
ASKER
Great, thanks a lot for your help. I'm going to review with the customer later, and potentially roll out to the other two machines (currently still at 100Mb to preserve some stability). I'll keep you posted.
Why did you disable flow control? In such a configuration, I would enable it. It would prevent the 1Gbps NICs from flooding the network at 1Gbps while the 100Mbps NICs cannot consume this flow...
Remember that you have to enable it on both the NICs and the switches.
Also, upgrading the switch's firmware is certainly a good idea.
And yes, I have seen such issues that were always resolved by... enabling flow control!
ASKER
Hi Vivigatt,
Flow control is enabled, both at the switch and the NICs, and the router/switch firmware are both current.
I only tested with flow control disabled as I'd seen it suggested elsewhere as a possible fix; but I undid the change after testing.
So far so good anyway; customer confirms the test box has been operating nicely with SMB2 disabled. Assuming the other two do, it's looking good.
Andy.
ASKER
Looking good so far. Thanks a lot for your help all!
ASKER
I haven't re-disabled autotuning or checksum/large send offloading yet as I don't want to muddy the waters while we're testing. If the SMB2 disabling provides a long lasting solution, would you recommend disabling the above as a matter of course for this environment?