mrmut asked:

Network applications slow on Microsoft Windows Network, XP, SBS 2003

Hi,

   I have a problem that I can't fully identify. I've spent a lot of time testing and debugging, and nothing has helped.

   Computer network description:

   A recently installed (6 months ago) computer network with new wiring, a managed 3Com switch and a new IBM x3200 series server; all clients run XP. All service packs and updates are installed, and all network card drivers are the latest versions.

   The problem is with two accounting applications that communicate with the server over a shared folder mapped as a drive in My Computer. Both programs are based on flat-file databases.

   The problem is that the programs slow down. Something happens and both of them become unbelievably slow, and then clients start calling me.

   ESET NOD32 is installed on all of the computers. On the server, everything in the NOD32 options is disabled except file checking; on the client computers the options are left at their defaults.

   There is an SBS domain in the network, and all computers log in to that domain. There is no folder redirection.

Event Viewer:

Applications: No problems, except that yesterday there were some problems with Exchange. I installed some updates and restarted, and everything is OK now.

Directory service: All OK.

DNS Server: No problems.

File replication services: OK

Internet Explorer and Security: All OK.

System: All OK, except some printer problems caused by a driver that is not installed on the server. These are mainly connected with me connecting to the terminal administration console.

netcps gives a median transfer rate of about 32 megabytes per second on all gigabit clients.
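To put the netcps figure in perspective, here is a quick back-of-the-envelope calculation in Python. It is only a rough sketch: it assumes decimal units (1 MB = 10^6 bytes, gigabit = 1000 Mbit/s) and ignores protocol overhead, and the function name is made up for illustration.

```python
def link_utilisation(measured_mb_per_s, link_mbit_per_s=1000.0):
    """Return the fraction of the nominal link rate used by a measured transfer rate."""
    measured_mbit_per_s = measured_mb_per_s * 8  # 8 bits per byte
    return measured_mbit_per_s / link_mbit_per_s

measured = 32.0  # MB/s, the netcps median quoted above
print(f"{measured * 8:.0f} Mbit/s, {link_utilisation(measured):.1%} of a gigabit link")
# prints "256 Mbit/s, 25.6% of a gigabit link"
```

Under these assumptions the clients reach about a quarter of the nominal gigabit rate. That is still well above the 12.5 MB/s ceiling of a 100 Mbit/s link, so the cards do appear to be running at gigabit speed.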

There is no firewall application installed, except the one built into SBS 2003.

The computer is defragmented, and users work with their files on the share normally, with no problems or slowdowns.

I have even gone so far as to move all users from the normal wired network to completely new CAT5 cables made in an electronics shop. Everything worked fine after I installed those cables, but now people are again reporting slow performance.

"SLOW" means that a user clicks on some operation in a program, and then everything halts for a period of time. That is extremely annoying and occurs on all computers in the network.

When I thought it was the switch, I replaced it with another one, and the problem persisted.

The server has a Broadcom NetXtreme Gigabit Ethernet LAN card with the newest IBM drivers; a screenshot of how the card is configured is attached.

The other card is a D-Link used for access to a ZyXEL ADSL router that connects the LAN to the Internet. Internet access goes through the SBS server.

The server is sufficiently powered, with two mirrored arrays, a dual-core Xeon and 2 GB of RAM.

Every computer and all network equipment is connected to an APC UPS; the server, switch and router are on a Smart-UPS that delivers a pure sine wave and corrects the current.


I don't know what more to add, except - please help!

This problem is driving me nuts!


screen1.jpg
PeteJThomas

Two quick things I'd like to know first - Do ALL the clients start running slow at about the same time, and continue doing so until you take some sort of action (server reboot or whatever)? Or does it happen to single random clients whilst others are still running fine?

And secondly, when all is running incredibly slow, what is the status of the server performance wise? i.e. have you checked mem and cpu utilisation during these periods, to ensure there is plenty of memory and cpu resources available to the server?

I know it seems obvious, but I didn't notice it stated above, so thought I'd ask before doing anything else!

Thanks,

Pete
mrmut (ASKER)

1. Yes.

2. Practically idle. No problems whatsoever.


I know it sounds stupid, but I am completely out of ideas.

The newest thing I am trying (starting today) is to have one client work directly on the server over a terminal session, to see whether everything works OK.


Thank you!
mrmut (ASKER)

NEW INFO!

The client logged in to the server over remote desktop. He started the application and everything worked great, snappy and fast, but at one moment clicking on one command caused the slowdown.

The slowdown showed up as the System process rising to 50% (one whole CPU), while the application kept its normal load of about 0% to 4%.

Any ideas?

Thank you,

Borislav
PeteJThomas

Did the system process drop back to normal on its own or require intervention to bring it back down again?

And was this command anything particularly intensive, a large DB query etc?
mrmut (ASKER)

Yes, it did drop back to normal without any intervention. The rise in CPU load is even hard to catch, as it passes and everything continues to work until the next slowdown.

The command was certainly not intensive - the problem happens from time to time. The command was exiting one tab of the program.
PeteJThomas

You could try using a tool like Sysinternals Process Monitor (downloadable from the MS website) to get a better idea of what's happening with the processes when the problem occurs?

Not really sure where to go though, if something the application is doing is causing it to drain all CPU resources for 1 of the 2 cores, it would suggest some kind of problem with the app itself, but I'm afraid I don't have much experience with DB apps...

Might be an idea to try proc mon though, unless someone else comes along with a better idea?

Pete
mrmut (ASKER)

OK - new info:

   It seems that something in the system is periodically eating one CPU. Unfortunately, I cannot say what. Is there any way to detect what the problem is?

   *Note that users are using the file shares during my observations. However, with normal file share use, a dedicated RAID and a server-class LAN card, no file transfer should take more than a few percent of CPU, even with the antivirus running all the time.

   Also, the AV program is using practically no CPU time, a few percent at most.
mrmut (ASKER)

OK - there is even newer info:

   For some reason SQL Server occasionally uses 50+% CPU. This incident is different from the earlier one, but it raises the load of both the System and SQL processes.

   Screenshot attached.
mrmut (ASKER)

Sorry, here is the screenshot.
highCPU.jpg
mrmut (ASKER)

Here is what perfmon caught.

What the heck is happening? It seems that some process, presumably SQL or something similar, is taking CPU time with high priority.

Any idea how I can find out what is happening?


Thanks.
problem.jpg
PeteJThomas

Hi, is that the chart for just one of the cores? Or can I take it that the 50% jumps are basically 1 core at 100%?

I've seen this on servers before, and was told at the time that some apps do this periodically when carrying out periodic tasks such as indexing - an index running on a large DB will often cause CPU spikes like that, and if what I was told is true then it's nothing to worry about and perfectly normal etc...

The only periods I'm concerned about are the ones where it lasts a little longer, as the spikes I refer to above should be no more than a few seconds(ish). And if it's eating an entire core for several minutes, and that happens to be the only core that SQL is using (again apologies but I don't know much about SQL so not sure if it happily multi-threads without any extra config etc) then it would obviously be a large detriment to performance for those few mins, then go back to normal...

Is any of what I've said making any sense?

Pete
mrmut (ASKER)

Well, yes - but I had concluded as much even before.

The chart is for both cores, 50% is one entire core, and the pattern repeats itself, causing BIG problems.

Now - I really don't have a clue what is behind it, as the server shouldn't be doing anything at all in the meantime.

Do you have any suggestion?


PeteJThomas

I'm afraid you'll probably need someone with more knowledge of SQL to figure out what it's doing... As I say I think it's normal for it to continuously spike (it may not happen on everyone's SQL box but I don't think that it's anything to worry about) but the longer periods that last several minutes are where the problem is...

You probably need an SQL guru to make some suggestions, as I simply don't have enough experience with it...

Take a look at this too, another person with a similar problem - https://www.experts-exchange.com/questions/22868929/What's-the-cause-of-my-frequent-50-CPU-usage-in-1-of-3-instances-of-SQL-2000.html
Just as a warning, it's a LONG thread, and it's not answered yet, but it contains a lot of info on troubleshooting this sort of problem that I think may be useful to you!
mrmut (ASKER)

Thank you a lot!
mrmut (ASKER)

OK, I have some new info:

I've caught one executable periodically consuming 100% of CPU: w3wp.exe.

It seems that some system service, SQL and w3wp work in conjunction in some way, but I don't know what I should do now that I have located the potential problem.

I would shut the bastards off this instant, but that probably wouldn't be such a smart thing to do.

Any hints?

PeteJThomas

Sorry to be brief, but take a look through these -

https://www.experts-exchange.com/questions/23108115/problem-with-ASp-net-ISS-w3wp-exe-using-100-CPU.html

http://forums.asp.net/p/917713/1113492.aspx

http://www.webhostingtalk.com/showthread.php?t=567795

That'll get you started, I'll keep searching for more relevant information, but unfortunately I now also have work to do! :)

Pete
PeteJThomas

Unfortunately though, if you google "w3wp.exe consuming 100% cpu" there are MANY links... All the same sort of problems, in different environments etc... You may need to trawl through these links to find the one that's closest to your environment before you find anything you can use!

However that first link I posted (from the EE website) could be useful?? Depends how involved you are with SQL... :)

Pete
mrmut (ASKER)

Hi Pete, and thank you for your help! It is much appreciated. I suppose that you are also aware of how impossible it is to explain that some problems need time.

However, it seems that I have tracked down the bad processes!

I used Process Explorer and constantly watched the CPU utilization graph. Then I suspended the most likely candidates - and VOILA!

CPU utilization is back to normal, no periodic spikes.

The screenshot is included; the suspended processes are grayed out, namely:

1. svchost.exe with two w3wp.exe instances
2. sqlservr.exe (one of three)


Is there any way to learn what a service does?

Regarding all the links you gave - I've read them, but I didn't get it. I know that Google returns myriads of w3wp.exe results, but it would be much nicer if there were only a few, and one of them a possible solution.

Thank you a lot!
proces-explorer.jpg
mrmut (ASKER)

OK - now I've resumed all the processes to see what will happen.

Still waiting for a burst, feeling like a fisherman.
mrmut (ASKER)

Got this from: http://groups.google.com/group/microsoft.public.windows.server.sbs/browse_thread/thread/968c88da75b4480d#

Here's how I would start with this:

- make sure that your AV exclusions are configured correctly for SQL and
Exchange (those w3wp's are IIS application pools, which is the reason I
mention Exchange).  I tend to blame AV for a lot of otherwise unexplained
problems, and sorry to say I'm often right.

- open Task Manager, go to the Processes tab, and add PID to the view if
necessary.  Get the PID for the SQL instance that's acting up.  Open a cmd
prompt and type "tasklist /svc."  Find the PID in question, and it'll tell
you which SQL app it is.  At least that will narrow down your
troubleshooting somewhat.

- get Process Explorer and run it on the server.  Look under System to see
specifically what is using the CPU cycles.  Again that'll help narrow things
down, and you might get some additional information about the SQL thing
there as well.

Process Explorer is free, and there's no installation - you can just run the
executable.  I run it directly from the USB key that contains all my server
tools.
http://technet.microsoft.com/en-us/sysinternals/cb56073f-62a3-4ed8-9d...
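For anyone who would rather script the "tasklist /svc" step than read the table by eye, here is a minimal Python sketch. It assumes the default table layout (a header row, a row of "=" characters marking the column widths, then one row per process, with long service lists wrapped onto continuation rows). The function names and the sample rows are made up for illustration; only the "tasklist /svc" command itself comes from the advice above.

```python
import subprocess

def parse_tasklist_svc(text):
    """Map PID -> (image name, [service names]) from `tasklist /svc` table output."""
    lines = text.splitlines()
    # The row of '=' characters under the header defines the column widths.
    sep = next(i for i, line in enumerate(lines) if line.startswith("==="))
    name_width, pid_width = (len(part) for part in lines[sep].split()[:2])
    name_end = name_width
    pid_end = name_end + 1 + pid_width

    result = {}
    last_pid = None
    for line in lines[sep + 1:]:
        if not line.strip():
            continue
        name = line[:name_end].strip()
        pid_text = line[name_end:pid_end].strip()
        services = [s.strip() for s in line[pid_end:].split(",") if s.strip()]
        if pid_text.isdigit():
            last_pid = int(pid_text)
            result[last_pid] = (name, [s for s in services if s != "N/A"])
        elif last_pid is not None:
            # Continuation row: a long service list wrapped onto the next line.
            result[last_pid][1].extend(services)
    return result

def live_services():
    """Run `tasklist /svc` on the local Windows machine and parse its output."""
    out = subprocess.run(["tasklist", "/svc"], capture_output=True,
                         text=True, check=True).stdout
    return parse_tasklist_svc(out)

# Hand-built sample in the same column layout (names and PIDs are made up).
rows = [("System Idle Process", 0, "N/A"),
        ("sqlservr.exe", 1234, "MSSQL$SBSMONITORING"),
        ("svchost.exe", 2345, "DcomLaunch, TermService")]
sample = "\n".join(
    ["Image Name".ljust(25) + " " + "PID".rjust(8) + " Services",
     "=" * 25 + " " + "=" * 8 + " " + "=" * 44]
    + [f"{name:<25} {pid:>8} {svc}" for name, pid, svc in rows])

print(parse_tasklist_svc(sample)[1234])
```

On the server itself, something like live_services().get(pid) with the PID read from Task Manager should show which SQL instance or service group is behind the busy process.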
PeteJThomas

Well I'm glad we're getting there! Process Monitor is a very useful tool, I've used it (and its predecessors, Regmon and Filemon) many many times! Always helps to be narrowing down where the problem is coming from...

With regards to "Is there any way to learn what a service does?" - Have you tried PROCESS MONITOR? I'm not sure if it's the same as Process Explorer, as it looks quite different... (another sysinternals tool).

This is a little more complicated to use I think, but tells you exactly what each process is doing and whether it succeeded or failed... See the screenshot for an example.

Bear in mind that this program is constantly capturing events (until you stop it) and within seconds of starting the program it has well over 30,000 events captured, so the filters are crucial. The best way to use it is to run it during the problem for just 5 seconds or so, then stop it capturing, and then work out how to use the filters to only see the things you want to see.
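If the capture is too large to filter comfortably in the GUI, one option is to save it as CSV and count the events per process offline. The Python sketch below is only an illustration: it assumes the export carries "Process Name" and "PID" columns (the default column set, as far as I know), and the sample rows and the function name are invented.

```python
import csv
import io
from collections import Counter

def busiest_processes(csv_text, top=5):
    """Count captured events per process (name and PID) and return the most active."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(f'{row["Process Name"]} ({row["PID"]})' for row in reader)
    return counts.most_common(top)

# Invented sample rows in the assumed export layout.
SAMPLE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:01.1","w3wp.exe","2345","ReadFile","C:\example\a.dat","SUCCESS",""
"10:00:01.2","w3wp.exe","2345","ReadFile","C:\example\b.dat","SUCCESS",""
"10:00:01.3","sqlservr.exe","1234","WriteFile","C:\example\c.mdf","SUCCESS",""
'''

for name, events in busiest_processes(SAMPLE):
    print(f"{events:6d}  {name}")
```

For a real export, read the file with open(path, encoding="utf-8-sig", newline="").read() first; the "utf-8-sig" part is a guess that the file starts with a byte-order mark, so adjust it if the header does not parse.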

Give that a go and let me know... :)

Pete



But if you're asking whether you can find out what a process is actually doing
ProcMon.jpg
PeteJThomas

Apologies, I got a little confused above in the first few comments etc... Process Explorer is what you've been using, Process Monitor is what I'm suggesting you try now... :) Ignore me babbling on about proc mon as if you'd been using it all along at the beginning of the post! :)
mrmut (ASKER)

Thank you,

   will go testing right now!

   Here is the link: http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx?PHPSESSID=d926


mrmut (ASKER)

Here is the screenshot.

I have tracked the problem down to a Scheduled Task: the Update Services Synchronization task.

The spike that can be seen every five minutes is caused by this task. I have disabled the task and it is peaceful now.

Will see what will happen now.
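A regular interval like this is easier to confirm from a logged CPU counter than from watching the graph. The Python sketch below shows one way to do it, assuming the samples are already available as (seconds, CPU percent) pairs. The 45% threshold is my own choice, based on 50% being one full core on this server, and the sample data is synthetic.

```python
from statistics import median

def spike_starts(samples, threshold=45.0):
    """Return the timestamps at which CPU load first crosses above the threshold."""
    starts = []
    above = False
    for t, cpu in samples:
        if cpu >= threshold and not above:
            starts.append(t)
        above = cpu >= threshold
    return starts

def typical_interval(starts):
    """Median gap in seconds between consecutive spike starts (None if fewer than two)."""
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return median(gaps) if gaps else None

# Synthetic data: one sample every 10 s for 20 minutes, a 30 s spike every 300 s.
samples = [(t, 50.0 if t % 300 < 30 else 3.0) for t in range(0, 1200, 10)]

starts = spike_starts(samples)
print(starts, typical_interval(starts))
# prints "[0, 300, 600, 900] 300"
```

A median gap that matches a scheduled task's repeat interval is a strong hint, though not proof, that the task is behind the spikes.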
mrmut (ASKER)

Sry, screenshot.
PID.JPG
mrmut (ASKER)

What I've just found:

http://blogs.mcbsys.com/mark/post/Setting-Up-ESET-NOD32-Antivirus-30-Business-Edition-on-SBS-2003.aspx

Apparently, NOD32 can cause a lot of problems, so I have excluded my server's equivalents of the files listed there from NOD32 scanning, to see what will happen now.

The exclude list:

D:\Mail Server\Exchsrvr\MDBDATA\*.*
C:\Program Files\Exchsrvr\Mtadata\*.*
C:\Program Files\Exchsrvr\MCB03.log\*.*
C:\Program Files\Exchsrvr\Mailroot\*.*
C:\Program Files\Exchsrvr\Mdbdata\*.*
C:\Program Files\Exchsrvr\Conndata\*.*
C:\Program Files\Exchsrvr\srsdata\*.*
C:\WINNT\system32\inetsrv\*.*
C:\WINNT\IIS Temporary Compressed Files\*.*
C:\WINNT\NTDS\*.*
C:\WINNT\sysvol\*.*
C:\WINNT\ntfrs\*.*
C:\WINNT\security\edb*.log
C:\WINNT\security\tmp.edb
C:\WINNT\Security\Database\secedit.sdb
C:\WINNT\system32\CertLog\*.* - added 3/13/2008 (Certificate Authority files)
C:\WINNT\system32\dhcp\*.*
C:\WINNT\system32\wins\*.*
C:\Program Files\Microsoft SQL Server\MSSQL$BKUPEXEC\Data\*.*
C:\Program Files\Microsoft SQL Server\MSSQL$SBSMONITORING\Data\*.*
C:\Program Files\Microsoft SQL Server\MSSQL$SHAREPOINT\Data\*.*
F:\MSSQL2000\MSSQL\Data\*.*
C:\WINNT\System32\ntmsdata\*.*
C:\Program Files\Microsoft Windows Small Business Server\Networking\POP3\Failed Mail\*.*
C:\Program Files\Microsoft Windows Small Business Server\Networking\POP3\Incoming Mail\*.*
C:\WINNT\SoftwareDistribution\DataStore\*.*
C:\pagefile.sys
C:\WINNT\system32\licstr.cpa
C:\WINNT\system32\lls\*.* - corrected 3/10/2008
G:\*.*
H:\*.*
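To sanity-check which files a list like this would actually cover, the patterns can be tested against real paths with a few lines of Python. This is only a sketch under stated assumptions: it treats each entry as a case-insensitive wildcard where "*" also matches across subfolders, which may not be how NOD32 itself interprets its exclusions, and the function name and the D:\Accounting path are made up.

```python
from fnmatch import fnmatchcase

def is_excluded(path, patterns):
    """True if the path matches any exclusion pattern (compared case-insensitively)."""
    lowered = path.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)

# A few entries copied from the list above.
patterns = [
    r"C:\WINNT\NTDS\*.*",
    r"C:\Program Files\Microsoft SQL Server\MSSQL$SHAREPOINT\Data\*.*",
    r"G:\*.*",
]

print(is_excluded(r"C:\WINNT\NTDS\ntds.dit", patterns))    # True
print(is_excluded(r"D:\Accounting\ledger.dat", patterns))  # False (hypothetical path)
```

Running the real database and share paths through a check like this would show quickly whether anything important is still being scanned.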
PeteJThomas

Interesting... Let me know how it turns out, it certainly sounds like we're on the verge of having this sorted now anyways! :)

Pete
mrmut (ASKER)

OK - news:

Now I still have bursts of CPU activity, but they appear irregularly, and users can work in their program.

Also, I don't get any w3wp.exe spikes anymore, or at least I can't detect them causing any inconvenience. I have noted that the update services cumulatively create the most CPU activity, mainly in one of the SQL instances, but that doesn't seem to create problems.

I use WSUS 3.0 SP1 and I have purged unneeded updates (it took all of last night).

Today I have also restarted so we'll see what will happen now. I will leave it unattended for a day or two.

Will post any new finding.
mrmut (ASKER)

OK, a few days have passed with the new settings, and it seems that the set of actions I took has fixed the problem.

The CPU log is attached; the server is back to almost constant idling, with just occasional system activity.


cpu-normalised.jpg
mrmut (ASKER)

Will wait for a few days more, and if all works, that's it. :)
PeteJThomas

Great news! But don't get too excited just yet... :) We all know what a let down it is when you really think you've fixed a very long-winded and annoying problem, only to find that it rears its ugly head again a week later!

Glad things SEEM to be working out now though!

Pete
mrmut (ASKER)

Thank you a lot Pete! Although you haven't brought me the Holy Grail of solutions, I don't think I could have done it without your help! :)

We'll see what will happen.

I will keep this post alive for a while with "PING!".
mrmut (ASKER)

PING!
mrmut (ASKER)

PING! - The problem is back.

The slowdown occurs now and then - not as often as before, but it is here.

Just as a test, I have suspended all three sqlservr.exe instances and both w3wp.exe instances under the svchost.exe instance. Now it seems everything works; I will try to debug further and will report what I find.


suspended.jpg
mrmut (ASKER)

PING!
mrmut (ASKER)

Unfortunately, I must conclude that, although the help was useful overall, I have not arrived at a solution and will have to reinstall the server.
PeteJThomas

Apologies mrmut, but I'm utterly out of ideas anyway...

I hope the reinstall resolves the issue! :)

Pete
mrmut (ASKER)

Thanks Pete! :)
ASKER CERTIFIED SOLUTION
mrmut

(This solution is only available to members of Experts Exchange.)
mrmut (ASKER)

Solution found, no need for deletion of the question.
PeteJThomas

Excellent news! This question should definitely be left here in case anyone else faces the same problem. It's the utterly annoying ones that you're dying to find already answered on these sites! :)

Well done!

Pete
mrmut (ASKER)

Solution has been found.

There are probably other solutions to this problem, though time did not allow me to research all the possibilities.