knightee asked:

SMB Issues affecting Windows 7 machines at Gigabit speed

Hi Experts,
This is my first post, so many many thanks in advance for any help with this :)
A little background: we have a customer running a network made up of 2 older Windows XP machines (both with Realtek 100Mbps cards) and 3 Windows 7 machines (1 x86 Home Premium and 2 x64 Pro, each with a Realtek PCIe Gigabit card). Each machine is connected to a Cisco SG300-10 switch, which is connected to a Cisco RV220W router. One of the XP machines acts as a server, hosting a SQL 2005 Express (SP4) database and the server application for a property management app that is installed as a client on each of the other four machines, along with a number of network shares (set up by another company some time ago to use authenticated users access rather than workgroup accounts, so anyone can access them from the network).
Around 8 weeks ago, they started to get application crashes and timeouts on every workstation, accompanied by very poor network performance on the Windows 7 boxes only. After investigating the database and app, we solved the crashes themselves, which were due to the app being behind on patches and the AUTO_CLOSE option being active on the database. We're now left with the performance issue, which leaves each of the 7 machines unable to use more than 1-2% of its Gigabit link (while the XP machines at 100Mbps run perfectly).
When the problem first occurred, I noticed that the Windows 7 boxes could no longer see the Windows XP machines, and the first couple of times a restart of the router restored everything. Each time, though, the network has degraded from performing well to almost useless more quickly, until these machines can no longer operate at Gigabit speeds at all. Strangely, this happens even though all machines are now visible (I gave each machine a DHCP reservation and set each to NetBIOS enabled, rather than the setting that takes the NetBIOS option from DHCP).
They're up and running with acceptable performance now as I've forced the ports to 100Mbps on the switch, which instantly gives them 10-25Mbps bandwidth when copying files from the server.
I've watched the disk queue and % utilisation on the server against the network traffic on the workstation, and what traffic there is between them seems to come in small bursts with huge delays in between. Back at 100Mbps, though, both traffic and disk behave normally.
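For readers who want to put the percentages above into concrete numbers, here is a minimal sketch of the arithmetic. It assumes a nominal 1 Gbps link, treats the 86MB test file as 86 x 10^6 bytes, and ignores protocol overhead; the function names are made up for illustration.

```python
GIGABIT_BPS = 1_000_000_000  # nominal Gigabit link rate in bits per second


def link_share_to_mbps(percent, link_bps=GIGABIT_BPS):
    """Convert a utilisation percentage of a link into megabits per second."""
    return link_bps * percent / 100 / 1_000_000


def transfer_seconds(size_mb, rate_mbps):
    """Ideal time to move size_mb megabytes (10**6 bytes each) at rate_mbps."""
    return size_mb * 8 / rate_mbps


# 1-2% utilisation of the Gigabit link, as reported for the Windows 7 boxes.
for pct in (1, 2):
    rate = link_share_to_mbps(pct)
    secs = transfer_seconds(86, rate)
    print(f"{pct}% of 1 Gbps = {rate:.0f} Mbps; an 86 MB file needs about {secs:.1f} s")
```

On these assumptions, 1-2% of a Gigabit link is roughly 10-20 Mbps, which is in the same range as the 10-25Mbps seen with the ports forced to 100Mbps.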
What I've tried so far (have undone each change after it failed to help)...
- Update drivers from Realtek site on each of the Gb cards.
- Disable auto-negotiation both at the card and the switch.
- Ran the port/cable copper test from the switch, which picks up nothing.
- Full AV scan (each machine is using MSE, fully updated).
- Disable AV.
- Disable FW (Windows FW on each machine).
- Disable NIC flow control.
- Disable checksum and large send offloading (found an MS engineer's article relating to this exact problem, on the same network cards, but unfortunately, no change).
- Max values for send/receive buffers.
- Disable all power management features both on the NICs and the switch (left this change intact).
- Monitored server disk performance, checked Windows event logs for issues. Server machine seems in great health and performs nicely at the terminal or remotely.
- Disabled TCP autotuning on the 7 boxes. Tried this yesterday, and it instantly brought the network back to 25% of Gb bandwidth, but only that once; I haven't got above 1-2% since, even with this change undone.
EDIT - Forgot, I'd also tried disabling remote differential compression, also no improvement.

Please excuse the long drawn out descriptions, but I'm keen to not miss anything.
I've done a capture using netmon 3.4, filtered down to errors only, and found the exchange below (the test was navigating through a share and attempting to download an 86MB SQL .bak file to the 7 Home Premium machine). I'm coming to the end of my experience level, I'm afraid, and googling some of the entries hasn't produced anything that seems applicable. Has anyone come across this issue before? Additionally, if anyone knows of any good background documentation describing how this exchange should look, I'd be very keen to read it :)
Many thanks, Andy.

424      23:28:07 10/01/2013      4.9357572            213.120.234.2      7WORKSTATION       DNS      DNS:QueryId = 0x49E8, QUERY (Standard query), Response - Name Error       {DNS:24, UDP:23, IPv4:22}
453      23:28:08 10/01/2013      5.2416769      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED      {NbtSS:33, TCP:32, IPv4:30}
753      23:28:12 10/01/2013      9.6661761      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED      {NbtSS:33, TCP:32, IPv4:30}
773      23:28:12 10/01/2013      9.6738393      System      XPSERVER        7WORKSTATION       SRVS      SRVS:NetrShareGetInfo Response, Status=ERROR_SUCCESS      {MSRPC:49, SMB:48, NbtSS:33, TCP:32, IPv4:30}
785      23:28:12 10/01/2013      9.6948848      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND      {SMB:51, NbtSS:33, TCP:32, IPv4:30}
807      23:28:12 10/01/2013      9.8176568      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED      {SMB:54, NbtSS:33, TCP:32, IPv4:30}
813      23:28:12 10/01/2013      9.8326293      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED      {SMB:57, NbtSS:33, TCP:32, IPv4:30}
820      23:28:12 10/01/2013      9.8441916      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED      {SMB:60, NbtSS:33, TCP:32, IPv4:30}
822      23:28:12 10/01/2013      9.8450788      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Transact2, Query File Info - NT Status: System - Error, Code = (8) STATUS_INVALID_HANDLE      {SMB:61, NbtSS:33, TCP:32, IPv4:30}
824      23:28:12 10/01/2013      9.8450788      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED      {SMB:62, NbtSS:33, TCP:32, IPv4:30}
828      23:28:12 10/01/2013      9.8458057      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Transact2, Query File Info - NT Status: System - Error, Code = (8) STATUS_INVALID_HANDLE      {SMB:61, NbtSS:33, TCP:32, IPv4:30}
832      23:28:12 10/01/2013      9.8460398      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED      {SMB:63, NbtSS:33, TCP:32, IPv4:30}
833      23:28:12 10/01/2013      9.8463740      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Transaction - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND      {NbtSS:33, TCP:32, IPv4:30}
1436      23:28:20 10/01/2013      17.5308203      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND      {SMB:69, NbtSS:33, TCP:32, IPv4:30}
1809      23:28:25 10/01/2013      22.3319204      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Transact, NT_TRANSACT_NOTIFY_CHANGE, FID = 0xC00B - NT Status: System - Error, Code = (288) STATUS_CANCELLED      {SMB:72, NbtSS:33, TCP:32, IPv4:30}
2392      23:28:30 10/01/2013      27.7335113      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Nt Transact, NT_TRANSACT_IOCTL, FID = 0xC006 - NT Status: System - Error, Code = (16) STATUS_INVALID_DEVICE_REQUEST      {SMB:77, NbtSS:33, TCP:32, IPv4:30}
14014      23:30:27 10/01/2013      144.3685728      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED      {NbtSS:110, TCP:109, IPv4:30}
14028      23:30:27 10/01/2013      144.3931205      System      XPSERVER        7WORKSTATION       SMB      SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED      {NbtSS:113, TCP:112, IPv4:30}
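As an aid to reading a filtered capture like the one above, the following is a rough sketch that tallies the NT status names in the description column. It assumes the capture has been exported as plain text, one frame per line; the sample lines are abbreviated copies of rows above and the function name is hypothetical. Note that STATUS_MORE_PROCESSING_REQUIRED on a Session Setup response is the normal intermediate step of NTLM authentication, not a failure in itself.

```python
import re
from collections import Counter

# Matches NT status names such as STATUS_ACCESS_DENIED in a description field.
STATUS_RE = re.compile(r"\bSTATUS_[A-Z_]+\b")


def tally_statuses(lines):
    """Count how often each NT status name appears across the given lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


# Abbreviated rows from the capture above (frame number and description only).
sample = [
    "453 SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED",
    "785 SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND",
    "807 SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED",
    "813 SMB:R; Nt Create Andx - NT Status: System - Error, Code = (34) STATUS_ACCESS_DENIED",
    "773 SRVS:NetrShareGetInfo Response, Status=ERROR_SUCCESS",
]

for status, count in tally_statuses(sample).most_common():
    print(count, status)
```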
ASKER CERTIFIED SOLUTION
Member_2_6515809


ASKER

Thanks BlueCompute, promising progress. I've disabled SMB2 on one of the 7 boxes and after the restart, the same 86MB bak file transferred in a few seconds, with the utilization peaking at around 12-14% at Gb connection speed.
I haven't re-disabled autotuning or checksum/large send offloading yet as I don't want to muddy the waters while we're testing. If the SMB2 disabling provides a long lasting solution, would you recommend disabling the above as a matter of course for this environment?
Member_2_6515809

Hi knightee,

Yes, it is recommended to disable SMB 2.0 in mixed OS environments if you are experiencing issues - I don't have any specific documentation to hand, but I have seen Sage KBs recommending this, for example.

So as long as you have Win 7 clients connecting to an XP machine then I would disable SMB 2.0 on any Win7 boxes.

Also, I wouldn't concern yourself too much with the percentage network utilisation figure - troubleshoot based on your application performance.
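The thread does not show how SMB2 was disabled on the Windows 7 box. One commonly documented way, assumed here rather than taken from this discussion, is the pair of sc.exe commands from Microsoft's guidance on disabling the SMBv2 client. The sketch below only prints them by default; the function name and the dry-run wrapper are illustrative, and actually running the commands would need an elevated prompt and a restart.

```python
import subprocess

# Assumption: these are the sc.exe commands from Microsoft's published guidance
# for disabling the SMBv2 client (workstation side); they are not quoted in this thread.
DISABLE_SMB2_CLIENT = [
    ["sc.exe", "config", "lanmanworkstation", "depend=", "bowser/mrxsmb10/nsi"],
    ["sc.exe", "config", "mrxsmb20", "start=", "disabled"],
]

# The matching commands to turn the SMBv2 client back on.
ENABLE_SMB2_CLIENT = [
    ["sc.exe", "config", "lanmanworkstation", "depend=", "bowser/mrxsmb10/mrxsmb20/nsi"],
    ["sc.exe", "config", "mrxsmb20", "start=", "auto"],
]


def apply_commands(commands, dry_run=True):
    """Print each command (dry run) or execute it; return the commands handled."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Raises CalledProcessError if sc.exe reports a failure.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: shows what would be executed on a Windows 7 client.
apply_commands(DISABLE_SMB2_CLIENT)
```

Keeping the re-enable commands next to the disable ones makes it easy to undo the change if it does not help, in line with the asker's approach of reverting each test.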
Great, thanks a lot for your help. I'm going to review with the customer later, and potentially roll out to the other two machines (currently still at 100Mb to preserve some stability). I'll keep you posted.
Why did you disable flow control? In such a configuration, I would enable it. It would prevent the 1Gbps NICs from flooding the network at 1Gbps while the 100Mbps NICs cannot keep up with that flow...
Remember that you have to enable it on both the NICs and the switches.
Also, upgrading the switch firmware is certainly a good idea.

And yes, I have seen such issues that were always resolved by... enabling flow control!
Hi Vivigatt,
Flow control is enabled, both at the switch and the NICs, and the router/switch firmware are both current.
I only tested with flow control disabled as I'd seen it suggested elsewhere as a possible fix; but I undid the change after testing.
So far so good anyway; customer confirms the test box has been operating nicely with SMB2 disabled. Assuming the other two do, it's looking good.
Andy.
Looking good so far. Thanks a lot for your help all!