Question

Interpreting performance counters in Windows (XP/2000)

Asked by: TTCore

Hi everyone :)

I am responsible for developing a real-time software application (running on Windows, primarily XP) that appears to be having performance issues, and I am trying to find out how to interpret certain MS performance counters. I've searched all over the internet, and here, to no avail, and I was hoping someone could point me in the right direction.  I apologise if this question has been answered before or if the answers are obvious; I did make sure I spent a fair amount of time searching first :)

What I would like to know is how to fully interpret the following counters on both single-core and dual/multi-core boxes.  I'd like to know what acceptable and unacceptable values are.  Do I need to combine other counters to interpret them properly?  Are there certain patterns that indicate a problem rather than just a single value? etc.

Processor Queue Length
Context switches/sec
Avg.Disk Queue Length

For context, I am already looking at and understanding the following counters ok:

% processor time for each core, including privileged and user, Available MBytes, and Pages/sec

If anyone has any suggestions for additional counters I should be looking at for measuring performance, that would also be great :)

Thank you,

Matt :)

This question has been solved and verified by the asker.

Asked On: 2007-06-18 at 04:05:29 (ID 22640198)
Tags: windows, performance, queue
Topics: Measurement Industry, Windows XP Operating System, Miscellaneous Hardware
Participating Experts: 1
Points: 500
Comments: 11


Answers

 

by: GATOR420, posted on 2007-06-18 at 04:43:00 (ID: 19306125)

Processor queue length is how many tasks the processor has lined up waiting to be processed. Generally, if this is a high number, your CPU is being heavily tasked and can't process all the tasks being thrown at it. We typically use >5 as the benchmark, depending on the size of the box and what it does; we do have an Oracle cluster server where our threshold is set to >25. A high value would be an indicator that the workload is too high and you need to spread out the load or upgrade.

Avg. disk queue length is the same as above, but for reads/writes to the disk. It is a possible indicator of disk I/O problems. For example, if you have one large-capacity disk and you are seeing a high disk queue length, then you should probably consider spreading the I/O across several smaller disks.

Context switches/sec: I had to go look this one up; we don't monitor it and I had no idea what it did. Pasted from: http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/core/fned_ana_pftl.mspx?mfr=true

A context switch occurs when the kernel switches the processor from one thread to another, for example, when a thread with a higher priority than the running thread becomes ready. Context switching activity is important for several reasons. A program that monopolizes the processor lowers the rate of context switches because it does not allow much processor time for the other processes' threads. A high rate of context switching means that the processor is being shared repeatedly, for example, by many threads of equal priority. A high context-switch rate often indicates that there are too many threads competing for the processors on the system.


Note

The rate of context switches can also affect performance of multiprocessor computers. For information about how to monitor and tune context-switch activity on multiprocessor systems, see "Measuring Multiprocessor System Activity" in this book.

You can view context switch data in two ways:

  • The System\Context Switches/sec counter in System Monitor reports systemwide context switches.
  • The Thread(_Total)\Context Switches/sec counter reports the total number of context switches generated per second by all threads.


Although these counters might vary slightly due to sampling, generally they will be nearly equal.

Figure 7.5 plots System\Context Switches/sec during a transient bottleneck.

[Figure 7.5: Systemwide Context Switches During a Processor Bottleneck]

In Figure 7.5, Processor(_Total)\% Processor Time jumps to about 60 percent during the sample interval. System\Processor Queue Length (scaled by a factor of 10), shows that the queue varies from 2 to 6, with a mean near 4. System\Context Switches (shown scaled by a factor of 10), reveals an average of about 750 switches per second. A rate of context switches from 500 to 2,000 per second might indicate a problem with a network adapter or a device driver or that you are using an inefficient server-based application that spawns too many threads.

The Pviewer utility on the Windows 2000 operating system CD reports context switch data. For information about installing and using the Windows 2000 Support Tools and Support Tools Help, see the file Sreadme.doc in the \Support\Tools folder of the Windows 2000 operating system CD.

 

by: GATOR420, posted on 2007-06-18 at 04:51:23 (ID: 19306166)

Going a bit further, you should monitor your systems with these counters and set your own benchmarks/thresholds for acceptable performance after you have monitored them for a while. Try to capture instances where you have noticeable performance degradation; that should give you a good idea of what to look for. Because each system performs differently, it's hard to take a cookie-cutter approach to monitoring.
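
As a rough sketch of what "set your own benchmarks" could look like in practice, the Python below summarises a list of counter samples taken while the system was behaving normally. The function name `baseline`, the nearest-rank percentile method, and the sample values are illustrative assumptions, not anything from the Windows tooling.

```python
import statistics


def baseline(samples, pct=95):
    """Summarise counter samples logged during known-good operation.

    `samples` is a list of numeric readings (for example, Processor Queue
    Length sampled once per second). All names here are hypothetical.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest value with at least pct% of
    # the samples at or below it (ceiling division gives the rank).
    rank = max(1, -(-len(ordered) * pct // 100))
    return {
        "mean": statistics.mean(ordered),
        "max": ordered[-1],
        "percentile": ordered[rank - 1],
    }


# Made-up readings from a quiet period, for illustration only.
quiet_period = [0, 0, 1, 0, 2, 0, 0, 1, 0, 6]
summary = baseline(quiet_period)
print(summary)
```

Readings captured during an incident can then be compared against the summary from the quiet period, rather than against a generic rule of thumb.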

 

by: TTCore, posted on 2007-06-18 at 06:13:25 (ID: 19306649)

Thanks for the quick response Gator :)

What I now realise I should have said is that I understand the meaning of the counters I'm interested in; I just don't know what acceptable values are and therefore how to interpret the statistics I have got.  The link you gave has some good reading linked from it though, so I'll go through it and see if I can learn anything there, thank you.

GATOR420:
[Going a bit further, you should monitor your systems with these counters and set your own benchmarks/thresholds for acceptable performance after you have monitored them for a while. Try to capture instances where you have noticeable performance degradation; that should give you a good idea of what to look for. Because each system performs differently, it's hard to take a cookie-cutter approach to monitoring.]

This is very good advice and ultimately may be the only true way to know the answers to my questions, although it's not an easy process to perform.  I was hoping/expecting there would also be some generalised ranges of values for each counter, particularly the queue lengths (disk and CPU).

My app is a real-time trading application, so there is (theoretically) no tolerance for delays. I'm really looking for examples of values expected during the "normal" functioning of the PC, where neither CPU usage nor HD access is a bottleneck within the system.  As soon as they start being a bottleneck I need to know, and I then need some kind of scale running from a resource being a little overworked, and therefore contributing slightly to a slowing of the system, to a resource being majorly overworked and so contributing greatly.

 

by: GATOR420, posted on 2007-06-18 at 06:51:02 (ID: 19306998)

Ideally if you have sustained values over 0 for CPU and disk queue that is a definite indicator. Sustained is the key word. If you have peaks where it goes above 0 and drops back down that just indicates that it received a burst of requests, processed them, and is back to "normal." It also varies on multi-processor systems where if the application is not multi-threaded, it could be tying up one CPU where the other is completely idle.  So you really have to understand your system and what you are running on it to determine a set of values to benchmark by.
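
To make the difference between a peak and a sustained value concrete, here is a minimal Python sketch that finds runs of consecutive samples above a threshold. The function name, the default threshold of 0, and the `min_length` cut-off are assumptions for illustration; what counts as "sustained" depends on your sampling interval and your system.

```python
def sustained_runs(samples, threshold=0, min_length=5):
    """Return (start_index, length) for every run of consecutive samples
    strictly above `threshold` that lasts at least `min_length` samples.

    Shorter runs are treated as bursts and ignored.
    """
    runs = []
    start = None  # index where the current above-threshold run began
    for i, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                runs.append((start, i - start))
            start = None
    # A run may still be open when the samples end.
    if start is not None and len(samples) - start >= min_length:
        runs.append((start, len(samples) - start))
    return runs


# Made-up queue-length readings: one short burst, then a longer backlog.
readings = [0, 3, 0, 0, 4, 5, 10, 4, 16, 0]
print(sustained_runs(readings, threshold=0, min_length=3))
```

With these made-up readings, the single spike at index 1 is ignored and only the five-sample backlog starting at index 4 is reported.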

 

by: TTCore, posted on 2007-06-18 at 08:51:13 (ID: 19308148)

OK cool.  So, getting more specific, in the case of the stats I am looking at: my processor queue length sat at a minimum of 4 for the 15 minutes in the run-up to my suspected incident, hitting 10 a number of times and 16 twice. That would be a clear indication of CPU resources being overused, which, coupled with the 80% CPU usage I have seen, would make sense.

I do appreciate what you are saying about multi-core systems and multi-threaded applications having a different effect on performance monitoring, although if the queue is backing up and at least one core is running at a high usage level then you obviously need more cycles to do the work you are trying to do.  What you then choose to do to solve that problem (i.e. rewrite your code to use more threads or increase the speed of the individual cores) then depends on the architecture of your hardware and software, no?

---

My Avg. Disk Queue Length values are tiny, in the range of thousandths or less, e.g.

0.000542649
0.001981151
0.000570173
0.000463098
0.00062454
0.000509863

so what is an acceptable value here?

Thanks again for the help.  It's good to have someone to discuss this stuff with :)

 

by: GATOR420, posted on 2007-06-18 at 09:37:43 (ID: 19308555)

In regard to your first question about CPU queue, that is definitely the case.

As far as disk queue goes, that one is a bit harder to interpret without also looking at current disk queue. Monitor both your avg. disk queue lengths and current queue lengths and post the results here.

 

by: TTCore, posted on 2007-06-20 at 07:22:47 (ID: 19324827)

Hi gator :)  

Sorry for the delay.  I've got a lot on at work atm, with this being only one of the paths I'm following.  I've requested that the current disk queue lengths be added as well and I'll see what comes back.  I don't know what timescales are normal for questions on the site, so I may not get the stats back before I close the topic and accept the solution.

Matt :)

 

by: GATOR420, posted on 2007-06-20 at 08:43:43 (ID: 19325639)

No worries.

Here is some basic information about the disk queue monitor and why I also requested to look at current and not just average. This is from: http://www.oreillynet.com/pub/a/network/2002/01/18/diskperf.html

The Avg. Disk Queue Length counter is derived from the product of Avg. Disk sec/Transfer multiplied by Disk Transfers/sec, which is the average response of the device times the I/O rate. Again, this corresponds to a well-known theorem of Queuing Theory called Little's Law, which states:
N = A * Sr

where N is the number of outstanding requests in the system, A is the arrival rate of requests, and Sr is the response time. So the Avg. Disk Queue Length counter is an estimate of the number of outstanding requests to the (Logical or Physical) disk. This includes any requests that are currently in service at the device, plus any requests that are waiting for service. If requests are currently waiting for the device inside the SCSI device driver layer of software below the diskperf filter driver, the Current Disk Queue Length counter will have a value greater than 0. If requests are queued in the hardware, which is usual for SCSI disks and RAID controllers, the Current Disk Queue Length counter will show a value of 0, even though requests are queued.
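
As a quick illustration of the Little's Law relationship above, here is a small Python sketch. The function name and the sample figures (50 transfers per second at 4 ms each) are made up for the example; the two inputs correspond to the Disk Transfers/sec and Avg. Disk sec/Transfer counters named in the text.

```python
def avg_disk_queue_length(transfers_per_sec, sec_per_transfer):
    """Little's Law, N = A * Sr.

    transfers_per_sec -- arrival rate A (Disk Transfers/sec)
    sec_per_transfer  -- response time Sr (Avg. Disk sec/Transfer)
    Returns the estimated number of outstanding requests N.
    """
    return transfers_per_sec * sec_per_transfer


# Hypothetical disk: 50 transfers per second, each taking 4 ms end to end.
n = avg_disk_queue_length(50, 0.004)
print(f"estimated outstanding requests: {n:.3f}")
```

Read this way, a very small Avg. Disk Queue Length simply means the I/O rate is low, the response time is short, or both.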

Since the Avg. Disk Queue Length counter value is a derived value and not a direct measurement, you do need to be careful how you interpret it. Little's Law is a very general result that is often used in the field of computer measurement to derive a third result when the other two values are measured directly. However, Little's Law does require an equilibrium assumption in order for it to be valid. The equilibrium assumption is that the arrival rate equals the completion rate over the measurement interval. Otherwise, the calculation is meaningless. In practice, this means you should ignore the Avg. Disk Queue Length counter value for any interval where the Current Disk Queue Length counter is not equal to the value of Current Disk Queue Length for the previous measurement interval.

Suppose, for example, the Avg. Disk Queue Length counter reads 10.3, and the Current Disk Queue Length counter shows four requests in the disk queue at the end of the measurement interval. If the previous value of Current Disk Queue Length was 0, the equilibrium assumption necessary for Little's Law does not hold. Since the number of arrivals is evidently greater than the number of completions during the interval, there is no valid interpretation for the value in the Avg. Disk Queue Length counter, and you should ignore the counter value. However, if both the present measurement of the Current Disk Queue Length counter and the previous value are equal, then it is safe to interpret the Avg. Disk Queue Length counter as the average number of outstanding I/O requests to the disk over the interval, including both requests currently in service and requests queued for service.
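
The filtering rule described above can be sketched in a few lines of Python. This assumes you have logged both counters at the same interval as a time-ordered list of (Avg. Disk Queue Length, Current Disk Queue Length) pairs; the function name and data layout are hypothetical.

```python
def valid_avg_queue_samples(samples):
    """Keep only the Avg. Disk Queue Length readings that can be trusted.

    `samples` is a time-ordered list of (avg_queue, current_queue) pairs,
    one per measurement interval. A reading is kept only when the Current
    Disk Queue Length equals its value in the previous interval, which is
    the equilibrium check described in the text. The first interval has no
    predecessor to compare against, so it is always dropped.
    """
    valid = []
    for (_, prev_current), (avg, current) in zip(samples, samples[1:]):
        if current == prev_current:
            valid.append(avg)
    return valid


# Made-up log including the 10.3 / 4 case from the example above.
log = [(0.5, 0), (10.3, 4), (2.0, 4), (0.1, 0), (0.2, 0)]
print(valid_avg_queue_samples(log))
```

Here the 10.3 reading is discarded because the current queue went from 0 to 4 during that interval, while the following reading is kept because the current queue stayed at 4.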

You also need to understand the ramifications of having a total disk roundtrip time measurement instead of a simple disk service time measure. Assuming M/M/1, a disk at 50 percent busy has one request waiting on average and disk response time is 2 times service time. This means that at 50 percent busy--assuming M/M/1 holds--an Avg. Disk Queue Length value of 1.00 is expected. That means that any disk with an Avg. Disk Queue Length value greater than 0.70 probably has a substantial amount of queue time associated with it. The exception, of course, is when M/M/1 does not hold, such as during a back-up operation when there is only a single user of the disk. A single user of the disk can drive a disk to nearly 100 percent utilization without a queue!
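
For anyone who wants to check the M/M/1 figures quoted above, here is a small sketch using the standard M/M/1 formulas. Note that the figure of one request at 50 percent busy counts the request in service as well as those waiting, which is also what Avg. Disk Queue Length reports. The function names are illustrative, and the results only apply where the M/M/1 assumptions hold.

```python
def mm1(utilization, service_time=1.0):
    """Standard M/M/1 results for a given utilization (0 <= utilization < 1).

    Returns (requests_in_system, response_time), where requests_in_system
    includes the request being serviced and response_time is expressed in
    the same units as service_time.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    requests_in_system = utilization / (1 - utilization)
    response_time = service_time / (1 - utilization)
    return requests_in_system, response_time


def utilization_from_queue_length(n):
    """Invert N = rho / (1 - rho) to recover the utilization rho."""
    if n < 0:
        raise ValueError("queue length cannot be negative")
    return n / (1 + n)


print(mm1(0.5))  # 50 percent busy
rho = utilization_from_queue_length(0.70)
print(f"utilization at N = 0.70: {rho:.2f}")
print(f"response time in service times: {mm1(rho)[1]:.2f}")
```

Under these assumptions, an Avg. Disk Queue Length of 0.70 corresponds to roughly 41 percent utilization and a response time of about 1.7 times the service time, so around 40 percent of each request's response time is spent queuing.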

 

by: TTCore, posted on 2007-06-25 at 04:31:20 (ID: 19354743)

Hi Gator :)

I'm too busy at the moment to carry this thread on in a reasonable amount of time, and in fact I have a load of other questions that have now become more pressing, so I will accept your solutions for now and hopefully we will get the opportunity to continue this discussion another time.

Thank you for all your help :)

Matt :)

 

by: TTCore, posted on 2007-06-25 at 04:31:50 (ID: 19354744)

P.S. to any moderators passing by: I think I'm doing the right thing here; please let me know if I'm not :)

 

by: GATOR420, posted on 2007-06-25 at 07:18:07 (ID: 19355784)

No problem, thanks for the points. Just reply here later if you need more info and we can discuss.
