• Status: Solved

Windows 2003 server stops serving file shares - Cannot Remote Desktop - Out of resources

Windows 2003 server file service problems. Large file transfers are not possible after a few hours of
heavy use; clients get an out-of-memory error. Cannot Remote Desktop to the machine. Must reboot the machine.
2003 Server Std. Only happens after heavy load. Serving up 4TB of data. Most data is housed on a SAN.
No errors reported from the server. Hardware is new. Network has over 1000 nodes on one flat switched LAN. Antivirus has been removed. The only other 3rd-party app is Backup Exec.

This follows a previous problem where the previous server (different hardware) just stopped
serving network shares. After a reboot it would serve files for a few minutes, then stop. Packet traces
showed the server would not respond to clients' handshake requests. PSS was contacted and could not
identify the problem. It was traced to the Server service stopping. Now we have the above problem with a new server with 4GB RAM. At peak we have ~400 open file sessions. Is 2003 Server not up to the task?

Basic file I/O should not be this difficult, yes? We had a Windows 2000 Server box serve this same
data reliably for 5 years. We are running Windows 2003 R2 SP2.

Thanks for any insight

 
Asked by: welly192
1 Solution

Erik Pitti commented:
This sounds like a memory leak.  A few questions:

Are you running Windows Server with the /3GB switch?  

Which Windows Edition, and processor architecture x86 or x64?

Have you tried running perfmon and watching the following counters?:

Memory/Free System Page Table Entries
Memory/System Driver Resident Bytes (would help find a memory leak in a driver)
Memory/Available MBytes


(These would be useful in troubleshooting a leak in the server service, but I doubt you'd find anything esp. after dealing with PSS.)
Server/Blocking Requests Rejected (used with the Work Item Shortages to track work Items)
Server/Errors System
Server/Files Open
Server/Sessions
Server/Work Item Shortages
Server/Pool Nonpaged Bytes
Server/Pool Nonpaged Failures
Server/Pool Paged Bytes
Server/Pool Paged Failures
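
The counters above can also be logged unattended with the built-in typeperf tool rather than watched live in perfmon. The following is a minimal sketch, not a tested procedure: it only builds the command line. The counter names are copied from the list above, while the 15-second interval and the output file name are assumptions chosen for illustration.

```python
# Counter names as listed in the comment above, grouped by perfmon object.
COUNTERS = {
    "Memory": [
        "Free System Page Table Entries",
        "System Driver Resident Bytes",
        "Available MBytes",
    ],
    "Server": [
        "Blocking Requests Rejected",
        "Errors System",
        "Files Open",
        "Sessions",
        "Work Item Shortages",
        "Pool Nonpaged Bytes",
        "Pool Nonpaged Failures",
        "Pool Paged Bytes",
        "Pool Paged Failures",
    ],
}


def counter_paths(counters):
    """Turn {object: [counter, ...]} into perfmon paths like \\Memory\\Available MBytes."""
    return [f"\\{obj}\\{name}" for obj, names in counters.items() for name in names]


def typeperf_command(paths, interval_s=15, out_file="server_counters.csv"):
    """Build an argument list for typeperf.

    -si sets the sample interval in seconds, -f the log format, -o the output file.
    interval_s and out_file are illustrative defaults, not recommendations.
    """
    return ["typeperf"] + list(paths) + ["-si", str(interval_s), "-f", "CSV", "-o", out_file]


def as_command_line(args):
    """Quote arguments that contain spaces so the line can be pasted into cmd.exe."""
    return " ".join(f'"{a}"' if " " in a else a for a in args)


if __name__ == "__main__":
    print(as_command_line(typeperf_command(counter_paths(COUNTERS))))
```

Logging to a CSV file means the data survives even when the console and RDP are unusable during an incident.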




 
Netman66 commented:
It also sounds like either a bad HBA or bad drivers for it.

You can try enabling the Firewall Service on the server but disabling the Firewall itself.

 
welly192 (Author) commented:
Hi, thanks for responding.

Are you running Windows Server with the /3GB switch?
No /3GB switch.
Which Windows Edition, and processor architecture x86 or x64?
x86

The HBA drivers are current, we are showing no errors on the SAN, and
local I/O between LUNs is fine. We are also serving up 256 file shares,
and the current system load never passes 15-20%. Full backups ran over the weekend
without any issues. The HBA is common between both servers, though, and just prior
to the original failure we did have a failure of the SAN switch that this HBA was attached to.
It is single-path attached. I will investigate this in addition to the performance monitors.

Thanks!



grenade commented:
Try these steps also:

1) Create/edit these values in the registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management
"PoolUsageMaximum"=dword:00000030
"PagedPoolSize"=dword:ffffffff
2) Restart.
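
Note that the values above are hexadecimal, so dword:00000030 is 48 decimal, not 30. As far as I understand Microsoft's guidance, PoolUsageMaximum is a percentage of paged pool at which trimming starts, and a PagedPoolSize of ffffffff asks Windows to size the paged pool as large as it can; treat both readings as assumptions to verify before applying. The sketch below only decodes the values and renders them as a .reg fragment for review. It does not touch the registry, and the function names are made up for illustration.

```python
KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Memory Management"
)

# Values exactly as suggested in the comment above.
SETTINGS = {
    "PoolUsageMaximum": "dword:00000030",
    "PagedPoolSize": "dword:ffffffff",
}


def parse_reg_dword(text):
    """Parse a .reg-style 'dword:xxxxxxxx' value (hexadecimal) into an int."""
    prefix = "dword:"
    if not text.lower().startswith(prefix):
        raise ValueError(f"not a dword value: {text!r}")
    return int(text[len(prefix):], 16)


def decode(settings):
    """Return {name: decimal value} so the hex values can be sanity-checked."""
    return {name: parse_reg_dword(value) for name, value in settings.items()}


def render_reg_file(key, settings):
    """Render a .reg file body that can be reviewed before importing it."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    lines += [f'"{name}"={value}' for name, value in settings.items()]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    for name, value in decode(SETTINGS).items():
        print(f"{name} = {value} (decimal)")
    print(render_reg_file(KEY, SETTINGS))
```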

welly192 (Author) commented:
Here is the message returned to clients. This only happens on
larger files. There are plenty of references to this error on Google
that appear to have been fixed by previous service packs.
Whenever we get this error we cannot RDP to the box.

Copying c:\NIGHTLY\Wed to e:\Wed at  7:27:25.10
File creation error - Not enough server storage is available to process this command.

AndrewCink commented:
On that local PC, does it have sufficient free drive space? It almost sounds like the server doesn't have enough storage space to cache all the data. Maybe the swap file isn't big enough? Sounds like it's related to drive storage on that PC somehow...

Erik Pitti commented:
I refer back to the following perfmon counters
(These would be useful in troubleshooting a leak in the server service, but I doubt you'd find anything esp. after dealing with PSS.)
Server/Blocking Requests Rejected (used with the Work Item Shortages to track work Items)
Server/Errors System
Server/Files Open
Server/Sessions
Server/Work Item Shortages

Erik Pitti commented:
Specifically this item:

Work Item Shortages
      
Shows the number of times that no work item was available or could be allocated to service the incoming request. A work item is the location where the server stores an SMB. The number available fluctuates between a minimum and a maximum value, based on how the server is configured and the amount of memory on the computer. If work item shortages are occurring, the server may be overloaded. If the Work Item Shortages counter value is increasing, consider changing the registry value HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\MaxWorkItems. Allowing this value to be achieved consistently initiates flow control, which hurts performance. This value is always 0 in the Blocking Queue instance.

http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/counters/counters2_tdaf.mspx?mfr=true
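
Whether Work Item Shortages is "increasing" can be checked from a perfmon or typeperf CSV log instead of by eye. The sketch below assumes the usual PDH-CSV layout (a timestamp column followed by one quoted column per counter). The server name FILESRV, the timestamps, and the sample values are made up for illustration and are not taken from the poster's log.

```python
import csv
import io

# Hypothetical log excerpt; real logs come from perfmon or typeperf.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0) (Pacific Standard Time)(480)","\\FILESRV\Server\Work Item Shortages"',
    r'"01/01/2008 10:00:00.000","0"',
    r'"01/01/2008 10:00:15.000","0"',
    r'"01/01/2008 10:00:30.000","3"',
    r'"01/01/2008 10:00:45.000","7"',
]) + "\n"


def counter_series(csv_text, counter_suffix):
    """Extract one counter column as floats, skipping blank or malformed samples."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    col = next(i for i, name in enumerate(header) if name.endswith(counter_suffix))
    series = []
    for row in reader:
        try:
            series.append(float(row[col]))
        except (ValueError, IndexError):
            continue  # perfmon writes a blank when a sample could not be collected
    return series


def total_increase(series):
    """Sum of all upward steps; 0.0 means the counter never rose during the log."""
    return sum(max(b - a, 0.0) for a, b in zip(series, series[1:]))


if __name__ == "__main__":
    shortages = counter_series(SAMPLE, r"\Server\Work Item Shortages")
    print("samples:", shortages)
    print("total increase:", total_increase(shortages))
```

A total increase of 0 across an incident would suggest that work items are not the bottleneck and that MaxWorkItems is unlikely to help.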

welly192 (Author) commented:
Thanks again for your response. You have provided valuable insight. We are monitoring the
perfmon counters but have not had the problem reproduce yet. We have checked the HBAs
and sent test results to the vendor, who certifies that the drivers are the latest and
fully operational with no problems. I will report back when we know more. This is a bizarre
problem.



Erik Pitti commented:
Truly bizarre, let us know what you find.

welly192 (Author) commented:
The problem just occurred again. We had been running OK since Monday PM.
We cannot locate anything unusual in the log.

I know this is a lot to ask, but could you take a look and see if anything looks
suspicious? http://www.sdfishing.com/bite/serverperfmon.zip
Specifically, Work Item Shortages shows 0. Currently we cannot RDP into the
box and large file transfers fail. This behavior may disappear on its own, and shortly
afterwards we may be able to access the box via RDP again. This is worth way more than
500 points!

Erik Pitti commented:
Reviewing now.

Erik Pitti commented:
Not seeing anything out of the ordinary in the log. It all looks okay.

Are there any SVCHOST.EXE related errors in the System or Application Log?
Have you run Windows Update recently? If not, could you? There are certain circumstances where the Automatic Updates or Windows Installer service dies while installing a hotfix and crashes the SVCHOST.EXE process, which brings down the RDP services (Remote Desktop and Remote Assistance) as well as any number of other services that run in the SVCHOST process (like Computer Browser or Server).

You could also try running Microsoft's Server Performance Advisor:
http://www.microsoft.com/downloads/details.aspx?FamilyID=09115420-8c9d-46b9-a9a5-9bffcd237da2&DisplayLang=en

welly192 (Author) commented:
No svchost.exe errors.
Windows Update probably has not been run since the server got all the updates 2-3 weeks ago when it was set up. RDP services are still running and are responsive at times but not at others. We also see other issues besides the RDP denial during these “down times”, such as the server not having enough resources to load the local admin’s profile when logging in at the console.

Erik Pitti commented:
Truly odd behavior.  The only other thing that I can think of would be that the network interface is overwhelmed at the time that RDP is unavailable, especially if RDP comes back without a reboot.

You could try the server performance advisor which may be able to help by collecting and analyzing a collection of performance counters.  Otherwise I'm at a loss, but I'll keep digging.

Diagnosing Server Performance Problems With Server Performance Advisor
http://www.windowsnetworking.com/articles_tutorials/Diagnosing-Server-Performance-Problems-Server-Performance-Advisor.html

welly192 (Author) commented:
This makes total sense to me, but it is hard to get objective data from the network.
They take 5-minute MRTG readings, but we need peak and variance. We have a flat
switched LAN, and packet captures show all kinds of different protocols and
broadcasts. We ran Ethereal on both the client and the server at the time. We saw nothing
from the client, but we saw some packets being reassembled on the server side
even though the packet size was less than the MTU. So that was curious.
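
To illustrate why a 5-minute average is not enough here, the sketch below summarizes one hypothetical 5-minute window of per-second throughput samples. The numbers are invented: a link at 50 Mbit/s for 285 seconds with a 15-second burst at 950 Mbit/s. It assumes per-second samples are available from some source, which the MRTG setup described above does not provide.

```python
import statistics


def summarize(samples_mbps):
    """Return the mean, peak, and population variance of throughput samples."""
    return {
        "mean": statistics.fmean(samples_mbps),
        "peak": max(samples_mbps),
        "variance": statistics.pvariance(samples_mbps),
    }


# Hypothetical 5-minute window sampled once per second (300 samples).
WINDOW = [50.0] * 285 + [950.0] * 15

if __name__ == "__main__":
    stats = summarize(WINDOW)
    print(f"mean     = {stats['mean']:.1f} Mbit/s")
    print(f"peak     = {stats['peak']:.1f} Mbit/s")
    print(f"variance = {stats['variance']:.1f}")
```

In this made-up window the average is 95 Mbit/s, which looks harmless, while the peak shows the link was close to saturated for 15 seconds. That is the kind of event a 5-minute reading cannot show.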

Erik Pitti commented:
You could use perfmon for monitoring utilization and error rate on the network interface, although you cannot get the exact detail like you could with ethereal/wireshark.
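
As a rough sketch of that calculation: perfmon's Network Interface object exposes Bytes Total/sec and Current Bandwidth (in bits per second), and utilization is commonly estimated from the two. The function names and sample figures below are made up for illustration. Note that Bytes Total/sec counts both directions, so on a full-duplex link this estimate can exceed 100%.

```python
def utilization_percent(bytes_total_per_sec, current_bandwidth_bps):
    """Estimate link utilization from two Network Interface counters.

    bytes_total_per_sec: value of "Bytes Total/sec" (sent + received).
    current_bandwidth_bps: value of "Current Bandwidth", in bits per second.
    """
    if current_bandwidth_bps <= 0:
        raise ValueError("current_bandwidth_bps must be positive")
    return bytes_total_per_sec * 8 / current_bandwidth_bps * 100


def error_rate_percent(error_packets, total_packets):
    """Share of packets with errors; returns 0.0 when no packets were seen."""
    if total_packets == 0:
        return 0.0
    return error_packets / total_packets * 100


if __name__ == "__main__":
    # Hypothetical sample: 62.5 MB/s on a 1 Gbit/s interface.
    print(utilization_percent(62_500_000, 1_000_000_000), "% utilization")
    print(error_rate_percent(5, 1000), "% packet errors")
```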

welly192 (Author) commented:
Although the problem still exists, I wanted to ensure you got credit for your excellent advice.
The performance log was inconclusive, but it is a valuable tool. We are still having the issue
and have decided to bail on this server. We have ordered the MS software storage server
NAS appliance, which is supposed to be optimized for file I/O and services.

Thanks again
Ray

Erik Pitti commented:
Thanks, Ray!