Disk Read Bytes per sec, IOPS, and Disk Reads per sec?

Posted on 2009-12-21
Last Modified: 2013-11-14
We are in the process of determining whether we need a new SAN. NetApp provided me with a tool to measure the performance of our disk system.

Currently we have a Dell EMC AX100.

The specifications state that it can do

30,000 I/Os and 150MB per second with 11 7200 rpm SATA drives

OK, so I ran this tool from NetApp. It produces some graphs that show:

X1 = Disk reads Per Sec
X2 = Disk Read bytes Per Sec
Y = Time

Their graph shows random reads topping out at:

130.00 Disk Reads Per Sec
1600000.00 Disk Read bytes per sec

With an average of:

40 - 60 Disk Reads Per Sec
500000 - 700000 Disk Read bytes per sec

Now I am aware that Dell's AX100 numbers are on paper and do not represent live numbers.

But I don't see how 30,000 I/Os and 150MB per second correspond to Disk Read bytes per sec and Disk reads per sec.

In closing, we believe we're maxing out the disk performance of our SAN, but I want to be clear on the numbers before I suggest a replacement.

Thanks in advance.

Question by:kblackwel
    LVL 46

    Assisted Solution

    The big picture is that benchmarks measure artificial loads that rarely represent YOUR combination of large- and small-block IOPS and throughput workloads per host and LUN.

    You need to approach the problem by determining YOUR I/O requirements. Every OS has utilities that report I/O utilization. Use them. Learn what you require, then learn where the bottlenecks are.

    Your problem is that you are looking at solutions without understanding what you need. It is like buying a vehicle based on horsepower ... without bothering to check whether you need a sports car, truck, or airplane.


    Author Comment

    Dear dlethe,

    Then I have to ask,

    I don't know if there's a way to determine how many IOPS we require for our situation.

    We use our SAN for folder redirection of user profiles. I've been over this a few times and no one can tell me what our IOPS requirement would be. If you know a way to measure that with different users using different applications in a remote desktop environment, PLEASE let me know.

    What I think I really have to work with now is determining where the bottleneck is.

    Our SAN is connected to a Windows 2003 file server over a 2 gig fibre link, and the server shares the directories through a bonded 2 gig Ethernet connection.

    I'm attempting to put together numbers to verify this, but I can see that

    Network bandwidth is being saturated during heavy morning hours when everyone logs in and pulls their profile and redirected folders.

    The 2 gig fibre connection is also getting saturated during these hours.

    So now I think all I can do is determine if we're bouncing up against the limits of our san and if so expand.

    Am I incorrect in this thinking?
    LVL 51

    Accepted Solution

    Yep, pretty much - but it is all in "how" you determine that you are "bouncing against the limits"

    The difficulty with a SAN is measuring the right thing... You need to consider a lot of different things in a SAN configuration, such as the types of RAID being used, caching etc...

    There is a good insight into the types of measures for Disk IO from the MS website - now it does say for Win Server 2000 - but don't worry about that as much as what the measures are, how they can help, and their analysis. Bit of reading, but very worthwhile (despite Win Server 2000).

    Also have a look at the different measures - note the "transfers" which is what you need to compare with the manufacturer supplied data :

    So, there are a few things you will need to consider when trying to measure disk performance, not just read bytes / second, but transfers, queue length etc... You will also need to understand your disk service consumers - are they random access, are they largely sequential - and then the configurations / LUNs: how have they been separated, is load evenly distributed across a number of spindles etc... which is kinda what dlethe was saying above...
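
    To make the comparison concrete, here is a minimal Python sketch of the arithmetic that ties these counters together: bytes per second divided by operations per second gives the average I/O size, and operations per second times I/O size gives throughput. It takes the peak from the question as 130 reads per second (the X1 series) and 1,600,000 read bytes per second. The function names are made up for illustration, and 1 MB is assumed to mean 1,000,000 bytes.

```python
def average_io_size_bytes(bytes_per_sec: float, ops_per_sec: float) -> float:
    """Average size of one I/O: throughput divided by operation rate."""
    if ops_per_sec <= 0:
        raise ValueError("ops_per_sec must be positive")
    return bytes_per_sec / ops_per_sec


def throughput_mb_per_sec(ops_per_sec: float, io_size_bytes: float) -> float:
    """Throughput in MB/s (assuming 1 MB = 1,000,000 bytes)."""
    return ops_per_sec * io_size_bytes / 1_000_000


# Peak figures quoted in the question (random reads).
peak_reads_per_sec = 130.0
peak_read_bytes_per_sec = 1_600_000.0

# Roughly how big each read was at the peak.
avg_read_size = average_io_size_bytes(peak_read_bytes_per_sec, peak_reads_per_sec)

# The same peak expressed in MB/s, for comparison with a spec sheet.
peak_mb_per_sec = throughput_mb_per_sec(peak_reads_per_sec, avg_read_size)

print(f"average read size: {avg_read_size:.0f} bytes")
print(f"peak read throughput: {peak_mb_per_sec:.2f} MB/s")
```

    On those figures the peak works out to roughly 12 KB per read and about 1.6 MB per second. Spec-sheet I/O and MB-per-second maxima are typically quoted for different block sizes and access patterns, so it is worth checking what workload the manufacturer numbers assume before comparing.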

    Also, check for any SAN whitepapers for throughput and "best practices"; there is often some additional information "out there" (e.g. google it). But based on the disks being 7200rpm disks, there are choices depending on straight throughput performance, or cost benefit. One example would be using 15Krpm drives in a multi RAID array. But all of that type of discussion is kind of academic and is really based on your business requirement (and budget).

    There is a white paper for SQL which discusses elements of IO performance as well. While a lot of it is with regard to SQL, there are some general "benchmarks" that can be used for comparative purposes :

    Main points of interest from that document are :

    You can use the following performance counters to identify I/O bottlenecks. Note, these AVG values tend to be skewed (to the low side) if you have an infrequent collection interval. For example, it is hard to tell the nature of an I/O spike with 60-second snapshots. Also, you should not rely on one counter to determine a bottleneck; look for multiple counters to cross check the validity of your findings.
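
    The point about infrequent collection skewing averages can be illustrated with a small Python sketch. The latency samples below are invented purely for illustration (55 quiet seconds followed by a 5-second spike), and the sketch assumes each second carries the same number of I/Os so that a plain mean stands in for what a single 60-second sample would report.

```python
from statistics import mean

# Hypothetical per-second read latencies in milliseconds:
# 55 quiet seconds at 8 ms, then a 5-second spike at 120 ms.
per_second_ms = [8.0] * 55 + [120.0] * 5

# What one 60-second collection interval would report (simplified).
one_minute_avg_ms = mean(per_second_ms)

# What a 1-second collection interval would have caught.
worst_second_ms = max(per_second_ms)

print(f"60-second average: {one_minute_avg_ms:.1f} ms")
print(f"worst single second: {worst_second_ms:.1f} ms")
```

    With these made-up numbers the one-minute average lands between 17 and 18 ms even though five of the seconds ran at 120 ms, which is why a shorter sampling interval helps when chasing spikes.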

    PhysicalDisk Object: Avg. Disk Queue Length represents the average number of physical read and write requests that were queued on the selected physical disk during the sampling period. If your I/O system is overloaded, more read/write operations will be waiting. If your disk queue length frequently exceeds a value of 2 during peak usage of SQL Server, then you might have an I/O bottleneck.

    Avg. Disk Sec/Read is the average time, in seconds, of a read of data from the disk. Any number:

    Less than 10 ms - very good
    Between 10 - 20 ms - okay
    Between 20 - 50 ms - slow, needs attention
    Greater than 50 ms - serious I/O bottleneck

    Avg. Disk Sec/Write is the average time, in seconds, of a write of data to the disk. Please refer to the guideline in the previous bullet.
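
    Those latency bands are easy to apply mechanically to collected counter values. A minimal Python sketch, assuming the counter is supplied in seconds (as perfmon reports it) and that a value sitting exactly on a boundary falls into the lower band; the function name is made up for illustration:

```python
def classify_disk_latency(avg_disk_sec: float) -> str:
    """Map an Avg. Disk Sec/Read or Sec/Write value (seconds) to the guideline bands."""
    if avg_disk_sec < 0:
        raise ValueError("latency cannot be negative")
    ms = avg_disk_sec * 1000.0  # the guideline bands are stated in milliseconds
    if ms < 10:
        return "very good"
    if ms <= 20:  # boundary values are assigned to the lower band (an assumption)
        return "okay"
    if ms <= 50:
        return "slow, needs attention"
    return "serious I/O bottleneck"


for sample in (0.005, 0.015, 0.035, 0.080):
    print(f"{sample * 1000:.0f} ms -> {classify_disk_latency(sample)}")
```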

    Physical Disk: %Disk Time is the percentage of elapsed time that the selected disk drive was busy servicing read or write requests. A general guideline is that if this value is greater than 50 percent, it represents an I/O bottleneck.

    Avg. Disk Reads/Sec is the rate of read operations on the disk. You need to make sure that this number is less than 85 percent of the disk capacity. The disk access time increases exponentially beyond 85 percent capacity.

    Avg. Disk Writes/Sec is the rate of write operations on the disk. Make sure that this number is less than 85 percent of the disk capacity. The disk access time increases exponentially beyond 85 percent capacity.
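
    As a rough sketch of the 85 percent guideline in Python: the check needs a figure for what the disk can sustain, which has to come from your own testing or the vendor. The rated value of 150 operations per second used below is a made-up placeholder, not a number for any real drive or for the AX100, and the function name is hypothetical.

```python
def headroom_check(observed_ops_per_sec: float,
                   rated_ops_per_sec: float,
                   limit: float = 0.85) -> tuple[float, bool]:
    """Return (utilization, within_limit) for a read or write operation rate."""
    if rated_ops_per_sec <= 0:
        raise ValueError("rated_ops_per_sec must be positive")
    utilization = observed_ops_per_sec / rated_ops_per_sec
    return utilization, utilization < limit


# 130 reads/sec observed against a hypothetical rated 150 reads/sec.
utilization, within_limit = headroom_check(130, 150)
print(f"utilization: {utilization:.0%}, within 85% guideline: {within_limit}")
```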

    When using above counters, you may need to adjust the values for RAID configurations using the following formulas.

    Raid 0 -- I/Os per disk = (reads + writes) / number of disks
    Raid 1 -- I/Os per disk = [reads + (2 * writes)] / 2
    Raid 5 -- I/Os per disk = [reads + (4 * writes)] / number of disks
    Raid 10 -- I/Os per disk = [reads + (2 * writes)] / number of disks

    For example, you have a RAID-1 system with two physical disks with the following values of the counters.

    Disk Reads/sec            80
    Disk Writes/sec           70
    Avg. Disk Queue Length    5

    In that case, you are encountering (80 + (2 * 70))/2 = 110 I/Os per disk and your disk queue length = 5/2 = 2.5, which indicates a borderline I/O bottleneck.
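
    The four formulas and the RAID-1 example can be written out as a short Python sketch. It only encodes the simplified write penalties quoted above (2 for RAID 1 and 10, 4 for RAID 5) and ignores controller cache and stripe effects; the function names are made up for illustration.

```python
def ios_per_disk(raid_level: str, reads: float, writes: float, disks: int) -> float:
    """Per-disk I/O rate using the simplified RAID formulas quoted above."""
    if disks <= 0:
        raise ValueError("disks must be positive")
    if raid_level == "0":
        return (reads + writes) / disks
    if raid_level == "1":
        return (reads + 2 * writes) / 2  # a RAID 1 mirror is always two disks
    if raid_level == "5":
        return (reads + 4 * writes) / disks
    if raid_level == "10":
        return (reads + 2 * writes) / disks
    raise ValueError(f"unsupported RAID level: {raid_level}")


def queue_length_per_disk(avg_queue_length: float, disks: int) -> float:
    """Spread the reported Avg. Disk Queue Length across the physical disks."""
    return avg_queue_length / disks


# The RAID-1 example: two disks, 80 reads/sec, 70 writes/sec, queue length 5.
per_disk = ios_per_disk("1", reads=80, writes=70, disks=2)
per_disk_queue = queue_length_per_disk(5, disks=2)
print(per_disk, per_disk_queue)
```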

    LVL 46

    Expert Comment

    Great article Mark, it is a good balance between oversimplification and enough information to get your point across.  I think a caveat is due in the IOPS calculations for RAID.  Depending on the hardware implementation, stripe size, cache buffers, cache settings, queue depth, workload, RAID level, whether read or write, and I/O request size, the physical disk I/Os can be profoundly different.

    Simple example: on RAID1 reads, most engines would do one I/O from whatever disk can service the request sooner.  On writes, 2 I/Os have to get done, but if writeback is enabled, the calling program returns immediately, so it could appear to be zero I/Os unless the system is loaded.  Or it could be 1 I/O, if it acknowledges after only one disk writes; or, with write-through cache enabled, the cost for a write is 2 I/Os.

    If you have 4 sequential writes, and I/O request size is optimized, then you could end up with only ONE I/O per disk.   Conversely, if the chunk size is incorrect, then 1 host I/O could just as easily require 4 I/Os per disk.

    So if RAID is part of the equation, you MUST look at how it is set up beyond raid level.
    Still it is good enough and a great tutorial.    
    LVL 51

    Expert Comment

    by:Mark Wills
    Thanks dlethe, and absolutely agree with your added comments above (along with your opening post as well)...

