Exchange Server 2003 performance issues

Last Modified: 2008-08-06
Symptoms: Users have been getting lots of "Outlook is trying to receive data from the Exchange server" pop-ups. In Performance Monitor, Pages/sec and % Processor Time shoot up to between 80 and 100. I ran the Exchange Troubleshooting Assistant and it reports:

Overview  
Items of severity Errors  
 
  Disk bottleneck found :  
 A potential performance issue was observed from the disk performance counters. One or more disks is exhibiting a performance bottleneck.
 
  Potential issue with RPC activity found :  
 A potential issue with the RPC activity for some MAPI operations was identified.
  Tell me more about this issue and how to resolve it.  
 
  Network interface performance issue found :  
 A performance issue was found with the network interface performance counters.
 
  Unusually high user activity detected :  
 RPC Operations per second rates indicate a user or users on this server are unusually active.
  Tell me more about this issue and how to resolve it.  
 
  Slow Local Security Authority Subsystem Service calls Area: Function Call log
 The Function Call log (FCL) shows some slow calls to LSASS.
  Tell me more about this issue and how to resolve it.  
 
Items of severity Warnings  
 
  The top 6 users account for 100% of the MAPI CPU usage on the server :  
 The top 6 users account for 100% of the MAPI CPU usage on the server.
  Tell me more about this issue and how to resolve it.  
 
Informational Messages  
 
  No issues found with location of Exchange data files :  
 No issues were found with the location of the Exchange data files, page file, TEMP directory or TMP directory, or with the amount of disk space on one or more drives.
 
  No Lightweight Directory Access Protocol (LDAP) performance issues identified :  
 No issue was found with the 'MSExchangeDSAccess Domain Controllers LDAP' performance counters.
 
  No processor or memory bottleneck found :  
 No processor or memory bottleneck was found on the server exchange.
 
  RPC activity distributed across many users :  
 The usage sample shows RPC activity is distributed across many users, rather than being caused by a single user.

I also ran the ExMon utility all day today (9am-4pm) and here are the results (sorry, I know they are hard to read; CSV is the only output format):
,Packets,Operations,CPU Time (ms),CPU (%),Avg. Server Latency (ms),Max. Server Latency (ms),Bytes In,Bytes Out,Client Versions,Client IP Addresses,Read Pages,PreRead Pages,Dirtied Pages,Log Bytes
,"29374","35044","132780","64.69%","10","11998","783962","634315600","11.8000.0 ","192.168.20.125 ","0","0","0","0"
,"34377","56840","13710","6.68%","0","1171","15866558","3547876","11.8000.0 ","192.168.20.103 ","0","0","0","0"
,"10965","32144","11775","5.74%","4","1046","2630751","24644847","11.8000.0 ","192.168.20.112 ","0","0","0","0"
,"14952","26379","9705","4.73%","2","984","743223","23591735","11.8000.0 ","192.168.20.133 ","0","0","0","0"
,"4148","7821","9345","4.55%","8","437","1660393","913717","11.8000.0 ","192.168.60.101 ","0","0","0","0"
,"2565","7333","6030","2.94%","4","859","6121983","7818026","11.8000.0 ","192.168.20.142 192.168.20.141 ","0","0","0","0"
,"8248","25737","3120","1.52%","0","406","1799508","10656224","11.8000.0 ","192.168.20.151 ","0","0","0","0"
,"2347","6400","2625","1.28%","5","843","273036","4015804","11.8000.0 ","192.168.20.111 ","0","0","0","0"
,"1910","7016","2085","1.02%","1","359","253654","519605","11.8000.0 ","192.168.20.108 ","0","0","0","0"
,"1256","3367","1530","0.75%","3","1796","6221415","2110106","11.8000.0 ","192.168.20.113 ","0","0","0","0"
,"2152","7867","1155","0.56%","1","828","211240","314517","11.8000.0 ","192.168.20.107 ","0","0","0","0"
,"3099","10745","1140","0.56%","0","406","208404","677557","11.8000.0 ","192.168.20.146 ","0","0","0","0"
,"2043","7222","1005","0.49%","1","218","304173","676437","11.8000.0 ","192.168.20.140 ","0","0","0","0"
,"961","3274","945","0.46%","1","171","159077","6521269","11.8000.0 ","192.168.20.149 ","0","0","0","0"
,"2074","7329","900","0.44%","0","78","956318","219452","11.8000.0 ","192.168.20.104 ","0","0","0","0"
,"922","2862","855","0.42%","2","234","3223659","1741194","11.8000.0 ","192.168.20.130 ","0","0","0","0"
,"831","3095","780","0.38%","2","203","92368","578719","11.8000.0 ","192.168.20.118 ","0","0","0","0"
,"623","2468","765","0.37%","3","109","172321","445318","11.8000.0 ","192.168.20.132 ","0","0","0","0"
,"623","2171","630","0.31%","4","437","316297","204120","11.8000.0 ","192.168.20.102 ","0","0","0","0"
,"900","3097","630","0.31%","2","124","898662","1283687","11.8000.0 ","192.168.20.145 ","0","0","0","0"
,"777","2884","435","0.21%","1","171","124192","834408","11.8000.0 ","192.168.20.106 ","0","0","0","0"
,"531","1640","360","0.18%","2","124","79113","3995442","11.8000.0 ","192.168.20.115 ","0","0","0","0"
,"388","1473","345","0.17%","2","140","24790","93269","11.8000.0 ","192.168.10.108 ","0","0","0","0"
,"229","898","315","0.15%","4","218","65791","60039","11.8000.0 ","192.168.60.108 ","0","0","0","0"
,"656","1720","300","0.15%","1","124","5620438","150226","11.8000.0 ","192.168.20.110 ","0","0","0","0"
,"270","928","285","0.14%","4","234","24654","510105","11.8000.0 ","192.168.20.124 ","0","0","0","0"
,"237","748","240","0.12%","3","140","137758","1208601","11.8000.0 ","192.168.20.119 ","0","0","0","0"
,"207","653","240","0.12%","3","109","114572","76407","11.8000.0 ","192.168.40.107 ","0","0","0","0"
,"212","953","225","0.11%","2","109","89427","607307","11.8000.0 ","192.168.20.109 ","0","0","0","0"
,"452","1082","210","0.10%","1","124","4899066","65925","11.8000.0 ","192.168.20.137 ","0","0","0","0"
,"829","1707","165","0.08%","0","124","43710","282035","11.8000.0 ","192.168.20.14 192.168.20.103 ","0","0","0","0"
,"210","765","135","0.07%","4","109","58878","30463","11.8000.0 ","192.168.20.135 ","0","0","0","0"
,"124","459","120","0.06%","2","46","47298","378671","11.8000.0 ","192.168.40.100 ","0","0","0","0"
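Since the raw CSV is hard to read, here is a minimal Python sketch for ranking the rows. It assumes the export keeps the header row shown above and that the empty first field is presumably the user-name column, which is blank in the posted data. The `summarize` function and its output field names are hypothetical, not part of ExMon.

```python
import csv
import io

# Header plus three rows copied from the ExMon export above.
SAMPLE = '''\
,Packets,Operations,CPU Time (ms),CPU (%),Avg. Server Latency (ms),Max. Server Latency (ms),Bytes In,Bytes Out,Client Versions,Client IP Addresses,Read Pages,PreRead Pages,Dirtied Pages,Log Bytes
,"34377","56840","13710","6.68%","0","1171","15866558","3547876","11.8000.0 ","192.168.20.103 ","0","0","0","0"
,"29374","35044","132780","64.69%","10","11998","783962","634315600","11.8000.0 ","192.168.20.125 ","0","0","0","0"
,"10965","32144","11775","5.74%","4","1046","2630751","24644847","11.8000.0 ","192.168.20.112 ","0","0","0","0"
'''

def summarize(csv_text):
    """Parse ExMon-style CSV text and return rows sorted by CPU share, highest first."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "ip": rec["Client IP Addresses"].strip(),
            "cpu_pct": float(rec["CPU (%)"].rstrip("%")),
            "ops": int(rec["Operations"]),
            "max_latency_ms": int(rec["Max. Server Latency (ms)"]),
            "mb_out": int(rec["Bytes Out"]) / 1_000_000,
        })
    return sorted(rows, key=lambda r: r["cpu_pct"], reverse=True)

if __name__ == "__main__":
    for r in summarize(SAMPLE):
        print(f'{r["ip"]:<16} cpu={r["cpu_pct"]:>6.2f}%  ops={r["ops"]:>6}  '
              f'max_latency={r["max_latency_ms"]:>6} ms  out={r["mb_out"]:.1f} MB')
```

Sorted this way, the client at 192.168.20.125 stands out: roughly 65% of the MAPI CPU time, about 634 MB sent out, and a maximum server latency near 12 seconds.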

Background of the server:
Dell PowerEdge, dual Xeon 3 GHz processors, 2 GB RAM, RAID 5, 50 mailboxes; 8 users are using Exchange ActiveSync. 6 mailboxes are over 1 GB. I also use the Exchange journaling feature (we keep a year of email, so that mailbox is huge, about 17 or 18 GB). The disks have two partitions, C and D. C is the OS; all of Exchange is on D (I know, bad). Shared calendars are being used.

 
Ok what I have done so far:

>Implemented a 1 GB mailbox size limit, giving the users who are over it a week to clean up.

>followed all recommendations in here: http://support.microsoft.com/kb/815372
boot.ini changes (base video, /3gb, /userva), changed to the standard VGA driver. All the registry entries were already correct. (I haven't rebooted with the boot.ini changes yet; that's for tonight.)

>Increased the page file size to 1.5 times physical RAM (it is on D).

>the disks are defragged weekly.  C is 15gb and has 10 free.  D is 120gb and has 35gb free

That's all I can remember that I have done. I have been working on this all week, so I am fried.

Here are other things I am considering doing, but I want feedback:

>adding 2gb more ram

>adding another disk and putting the temp directory and pagefile on it.

I know RAID 5 is not great for disk performance. I know the logs and mailboxes should be on different drives. Can I move them?

I know the hardware config is not optimal for Exchange, but 50 mailboxes doesn't seem(?) like a lot for the hardware.




  Have you tried an offline defrag of the databases? If not, please try it. Adding a drive to the RAID 5 and moving the page file will also help system performance. Exchange is a heavy memory user, so adding RAM will help up to a point. First check how much RAM is being used via Task Manager. Also check your antivirus: I had an issue where Symantec corp was using a lot of processing power on the message in and out directories, so I excluded those directories from scanning. Also, what else are you running on the server? Or is this a standalone Exchange server?
David Scott, MCSE, Network Administrator

Author

Commented:
redseat: How would I move the transaction logs to the new RAID 1 array?

They are accessing via RPC over HTTP (mostly cached mode, excluding the Exchange ActiveSync users, as I understand that can be problematic).

vtob: I use McAfee VirusScan Enterprise and McAfee GroupShield for Exchange. I have the mailroot and mdbdata directories excluded from scanning.

I was running Backup Exec Continuous Protection Server on this server, but have stopped the jobs to see if performance improves. This server is also one of two domain controllers (I know, not recommended for Exchange).

David Scott, MCSE, Network Administrator

Author

Commented:
I just reran the performance analyzer (with the boot.ini changes /3gb /userva /basevideo now in effect).
I only received the "network interface outbound packets beyond threshold" error and
the slow LSASS calls.

For the network interface error, MS says it's a faulty NIC. I've got dual NICs on the server, and I already changed to the other NIC and still receive this error, so I doubt both NICs on the server are faulty?

For the slow LSASS calls, one thing it recommends is setting the "never ping" registry key to 1 on domain controllers, which I have already done.

I don't know if I didn't get the bottleneck issues because of the boot.ini changes or because all the sales people were in a meeting while it was running ;)
David Scott, MCSE, Network Administrator

Author

Commented:
I just ran another Troubleshooting Assistant performance test:

I am getting this error about unusually high RPC activity. I only have 50 mailboxes; that doesn't seem like it should create "unusually high RPC activity". Is it possible it's Exchange ActiveSync?

The 15 GB journal mailbox: could that be contributing to performance issues? I could look into a third-party app for email archiving, or set up the journal mailbox in Outlook, enable AutoArchive, and save the archive on a file server.

David Scott, MCSE, Network Administrator

Author

Commented:
i forgot to post this from the report:

RPC Performance Counter Data (Performance_RPCPerfCounters1.1.1.1.1.1.1.1.1.1.1.1.3.1.1.7)
   RPC Performance Counters
   10/05/2007 12:34:50 - 10/05/2007 12:39:44
  The ratio of active logons per mailbox on the server is 3.75.
  The user activity level is 0.353 operations per sec per user.  
   Since the RPC operations per second per user is greater than 0.25, the RPC operations per second rates indicate a user or users on this server are unusually active. The measured RPC operations per second per user rate is 0.353.  
  RPC health during the time range: 10/05/2007 12:34:50 - 10/05/2007 12:39:44
  Summary of RPC results
   If the users accessing the Exchange server are highly active, and you are unable to reduce the load on your server, and your server is exhibiting bottlenecks, you should consider moving some users to another server.
   RPC Operations per second rates indicate a user or users on this server are unusually active.
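
To put the per-user figure in context, here is a rough Python sketch of the arithmetic. It assumes the analyzer's "per user" rate is averaged over the 50 mailboxes mentioned earlier, which the report does not confirm. The 0.25 threshold comes from this report and the 0.15 "moderately active" threshold from a later report in this thread; the `classify_rpc_activity` name and the "below reported thresholds" label are hypothetical.

```python
MAILBOXES = 50  # assumption: the server's mailbox count from the background section

def classify_rpc_activity(ops_per_sec_per_user):
    """Label a per-user RPC rate using the two thresholds quoted in the reports."""
    if ops_per_sec_per_user > 0.25:
        return "unusually active"
    if ops_per_sec_per_user > 0.15:
        return "moderately active"
    return "below reported thresholds"  # hypothetical label, not tool wording

def server_wide(per_user_value, mailboxes=MAILBOXES):
    """Scale a per-mailbox figure up to the whole server (assumes a simple average)."""
    return per_user_value * mailboxes

if __name__ == "__main__":
    rate = 0.353             # operations per second per user, from the report
    logons_per_mailbox = 3.75
    print(classify_rpc_activity(rate))
    print(f"approx. total RPC ops/sec: {server_wide(rate):.2f}")
    print(f"approx. active logons: {server_wide(logons_per_mailbox):.1f}")
```

Under that assumption the server is handling only about 18 RPC operations per second in total, so the "unusually active" label reflects the per-user ratio rather than a large absolute load.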
David Scott, MCSE, Network Administrator

Author

Commented:
Actually, I found the link for moving the logs. I will try another NIC.

I have had very few Outlook errors today. Maybe my efforts so far have helped, or maybe it's just the Friday before a 3-day weekend!! Columbus Day!!!

I suppose I could try the ActiveSync thing, but it would be a tough sell, as the sales people are hooked on getting their email on their phones now.

thanks for the input.  
David Scott, MCSE, Network Administrator

Author

Commented:
No, I don't have a new RAID 1 array. I haven't decided to do that yet. I wanted to eliminate the possibility of a bad NIC first.

I have run ExMon and posted the results above. While there are a few "power users" with high usage, I am not sure any one of them is responsible for the slowdowns. Take a look above if you could and tell me what you think.

I will look into NIC teaming. Thanks.
David Scott, MCSE, Network Administrator

Author

Commented:
Ran diags on both NICs; both passed. Enabled NIC teaming. Ran the Exchange troubleshooter again. Here is the report; it looks like disk issues and a lot of synchronizations (from Exchange ActiveSync and/or Outlook cached mode?):

_____________________________________________________________

Performance Issues  
Area: Disk Drive and Exchange Data File Information  
 
Time Range: All  
 
  The transaction log files for storage group 'First Storage Group' do not have a dedicated drive Time Range: All
 The transaction log files for storage group 'First Storage Group' share drive D: with d:\program files\exchsrvr\mailroot\vsi 1\queue, d:\program files\exchsrvr\mdbdata\priv1.edb, d:\program files\exchsrvr\mdbdata\priv1.stm, d:\program files\exchsrvr\mdbdata\pub1.edb, d:\program files\exchsrvr\mdbdata\pub1.stm.
  Tell me more about this issue and how to resolve it.  
 
  SMTP server does not have a dedicated drive Time Range: All
 The queues for SMTP server 'Default SMTP Virtual Server' share drive D: with the following Exchange data files: D:\PROGRAM FILES\EXCHSRVR\MDBDATA\PRIV1.EDB, D:\PROGRAM FILES\EXCHSRVR\MDBDATA\PRIV1.STM, D:\PROGRAM FILES\EXCHSRVR\MDBDATA\PUB1.EDB, D:\PROGRAM FILES\EXCHSRVR\MDBDATA\PUB1.STM.
  Tell me more about this setting.  
 
Area: Disk Drive Health  
 
Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53  
 
  Logical disk performance issue on drive hosting SMTP server: Average write latency Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 SMTP drive: Average '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 10 (0.01 ms). The measured value is 0.016 (16 ms).
  Tell me more about this issue and how to resolve it.  
 
  Logical disk performance issue on drive hosting SMTP server: Maximum write latency Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 SMTP drive: Maximum '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 50 (0.05 ms). The measured maximum value is 0.064 (64 ms).
  Tell me more about this issue and how to resolve it.  
 
  Performance issue found on logical disk containing system page file Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 Page file drive: The average value for '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 0.01 (10 ms). The measured value is 0.016 (16 ms).
  Tell me more about this issue and how to resolve it.  
 
  TEMP drive write latencies are high Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 TEMP drive: The average value for '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 0.01 (10 ms). The measured value is 0.016 (16 ms).
  Tell me more about this issue and how to resolve it.  
 
  TEMP drive write latencies are high Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 TEMP drive: The maximum value for '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 0.05 (50 ms). The measured value is 0.064 (64 ms).
  Tell me more about this issue and how to resolve it.  
 
  TMP drive write latencies are high Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 TMP drive: The average value of '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 0.01 (10 ms). The measured value is 0.016 (16 ms).
  Tell me more about this issue and how to resolve it.  
 
  TMP drive write latencies are high Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 TMP drive: The maximum '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 0.05 (50 ms) for the TMP drive. The measured value is 0.064 (64 ms).
  Tell me more about this issue and how to resolve it.  
 
  Transaction log write latencies are high Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 Transaction log disk: The average value for '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 0.01 (10 ms). The measured value is 0.016 (16 ms).
  Tell me more about this issue and how to resolve it.  
 
  Transaction log write latencies are high Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 Transaction log disk: The maximum value for '\LogicalDisk(D:)\Avg. Disk sec/Write' should be less than 0.05 (50 ms). The measured value is 0.064 (64 ms).
  Tell me more about this issue and how to resolve it.  
 
Area: Network Usage  
 
Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53  
 
  Average 'Network Interface(Intel[R] Advanced Network Services Virtual Adapter)\Packets Outbound Errors' beyond error threshold Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 The average 'Network Interface(Intel[R] Advanced Network Services Virtual Adapter)\Packets Outbound Errors' is greater than 0 packets. The measured value is 1 packets.
  Tell me more about this issue and how to resolve it.  
 
Area: RPC Performance Counters  
 
Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53  
 
  Active RPC user activity Time Range: 10/12/2007 14:59:59 - 10/12/2007 15:04:53
 Since the RPC operations per second per user is greater than 0.15, the users are considered as 'moderately active'. The measured RPC operations per second/per user rate is 0.247.
  Tell me more about this issue and how to resolve it.  
 
Unclassified Items:  
 
MAPI Operation: FXSrcGetBuffer  
 
  High RPC synchronization operations MAPI Operation: FXSrcGetBuffer
 The Exchange Server User Monitor (ExMon) RPC data indicates that clients are performing synchronization operations. Synchronization operations for MAPI operation "FXSrcGetBuffer" account for 66.67% of the processor usage devoted to processing RPC requests.
  Tell me more about this issue and how to resolve it.  
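
The disk findings in this report all come down to the same counter on D: compared against two limits. A small Python sketch of that comparison follows; the limits and measured values are taken from the report (expressed in seconds, as the counter reports them), while the function name and return shape are hypothetical.

```python
# Thresholds as quoted in the report above, in seconds.
AVG_WRITE_LIMIT_S = 0.010  # average Avg. Disk sec/Write should stay below 10 ms
MAX_WRITE_LIMIT_S = 0.050  # maximum Avg. Disk sec/Write should stay below 50 ms

def check_write_latency(avg_s, max_s):
    """Return (label, measured_ms, limit_ms, ratio) for each limit that is exceeded."""
    findings = []
    for label, measured, limit in (
        ("average", avg_s, AVG_WRITE_LIMIT_S),
        ("maximum", max_s, MAX_WRITE_LIMIT_S),
    ):
        if measured >= limit:
            findings.append(
                (label, measured * 1000, limit * 1000, round(measured / limit, 2))
            )
    return findings

if __name__ == "__main__":
    # Values measured for \LogicalDisk(D:)\Avg. Disk sec/Write in the report.
    for label, ms, limit_ms, ratio in check_write_latency(0.016, 0.064):
        print(f"{label} write latency {ms:.0f} ms exceeds {limit_ms:.0f} ms ({ratio}x the limit)")
```

Because the SMTP queue, page file, TEMP/TMP directories, and transaction logs all live on D:, the one measurement (16 ms average, 64 ms maximum) is reported once per role, which is why the list looks longer than the underlying problem.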
 
David Scott, MCSE, Network Administrator

Author

Commented:
Are the high sync operations normal, considering Exchange ActiveSync is in use?

I believe my next step is to get the two disks, set up a RAID 1, and then move the transaction logs to that drive?
David Scott, MCSE, Network Administrator

Author

Commented:
I did the above (disabled scalable networking) and I followed this MS article:

http://support.microsoft.com/?kbid=818484

I did the first one and set it to 5.

I still have a disk bottleneck, Network Outbound Packets beyond threshold, and high user RPC activity.

I have run ExMon for a period of time, and there are a couple of heavy email users who also have Windows Mobile devices syncing with Exchange.

I still have about 4 mailboxes over a gig (barely).

And I have the journal mailbox (with envelope journaling), which is a monster at 17.5 GB.

We also have a staff calendar which gets a lot of usage.

I don't know; it's probably just all those things combined causing the issues. But users aren't getting the RPC errors in their Outlook clients anymore, even though the performance analyzer is still giving those three errors.

I think I might get 2 GB more RAM, add the two disks in a RAID 1, and put the TMP/TEMP directories and the logs on them. See what happens.

David Scott, MCSE, Network Administrator

Author

Commented:
One thing TechNet suggests for the outbound packet issue is this:

"Segment inter-server and global catalog traffic
When there is much traffic, and therefore overhead due to packet collision, you can improve network performance by separating inter-server and global catalog traffic from client traffic. You can do this by having servers and global catalogs with dual network adapters, and by building a separate network for the communication required by servers and global catalogs."


I'm not sure how to go about doing this.
David Scott, MCSE, Network Administrator

Author

Commented:
I'm just getting exhausted with this. Maybe I will stop journaling for a minute, run the perf analyzer, and see if that makes any difference.
David Scott, MCSE, Network Administrator

Author

Commented:
I followed the MS article already, and I'm still getting warnings in the event log that the server memory settings are not optimal for Exchange:

Event Type:      Warning
Event Source:      MSExchangeIS
Event Category:      General
Event ID:      9665
Date:            10/15/2007
Time:            4:06:04 PM
User:            N/A
Computer:      EXCHANGE
Description:
The memory settings for this server are not optimal for Exchange.

 For more information, click http://support.microsoft.com?kbid=815372

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 03 00 00 00               ....    
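
One thing worth double-checking for this warning is whether the boot.ini switches mentioned earlier in the thread actually ended up on the active boot entry. The sketch below, in Python, only parses a boot entry line and reports which of the two switches are missing. The sample line is hypothetical (including the USERVA value, which should be confirmed against KB 815372), and the warning can also relate to the registry settings covered by the same KB, which this sketch does not inspect.

```python
import re

# Hypothetical boot.ini entry for illustration only; compare with the real file.
SAMPLE_ENTRY = (
    r'multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" '
    r'/fastdetect /3GB /USERVA=3030 /basevideo'
)

def parse_switches(boot_line):
    """Return a dict of switch name -> value (None if the switch has no value)."""
    # Switches follow the quoted OS description, so look only after the last quote.
    _, _, tail = boot_line.rpartition('"')
    switches = {}
    for match in re.finditer(r'/([A-Za-z0-9]+)(?:=(\S+))?', tail):
        switches[match.group(1).upper()] = match.group(2)
    return switches

def missing_memory_switches(boot_line, required=("3GB", "USERVA")):
    """List the required switches that do not appear on the boot entry."""
    found = parse_switches(boot_line)
    return [name for name in required if name not in found]

if __name__ == "__main__":
    print(parse_switches(SAMPLE_ENTRY))
    print("missing:", missing_memory_switches(SAMPLE_ENTRY))
```

If both switches are present on the entry the server actually boots from, the remaining candidates would be the registry values described in the KB article.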
David Scott, MCSE, Network Administrator

Author

Commented:
Sorry for taking so long to post back. Crazy time of year around here. I put the two SCSI disks in and created a RAID 1 array, moved the log files there, and also put the temp folders on it. I ran the perf analyzer and got no disk bottlenecks, but I still get the network interface issue described in this link:

http://technet.microsoft.com/en-us/library/aa997363.aspx

Maybe I will switch out both NICs?

David Scott, MCSE, Network Administrator

Author

Commented:
Ok, UNCLE.  

I ran the Troubleshooting Assistant again during production hours and it still shows a disk bottleneck.

So I tried moving the database file (not the streaming file) to the RAID 1 as well, and now the bottleneck is on the RAID 1 array. I am going to move the database file back to the RAID 5 and leave the RAID 1 with the log files.

No one is complaining about issues, and the RPC pop-ups are no longer happening.

I also added 2 GB more RAM.

I am also still getting the network interface issue. I called Dell, and they stated that the diagnostics built into the NIC driver are accurate. I ran those before and the cards passed all tests.

I suppose since email performance seems good, I am not going to worry about the disk bottlenecks.