• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 826

Inconsistent data write speeds to iSCSI LUN mount point

We have a 64-bit RHEL 5.x VM running on a VMware ESXi 4.0 server. Attached to the RHEL5 VM is an iSCSI LUN mount point served from a NetApp FAS6040 SAN over a 10GbE server adapter. We use the iSCSI environment strictly for Oracle data and log storage. So far, the read-write transfer speeds have been rather inconsistent. Using a simple 'dd' script to create a 512MB file on the attached iSCSI LUN, we see that the initial file creation is extremely fast, but once the file is deleted and recreated, the speed drops enormously (see data below).

Things I have tried on the RHEL VM with limited success:
- Changing the I/O Scheduler (cfq, deadline or noop; /sys/block/sdb/queue/scheduler)
- Verified that VM is using the VMware vmxnet3 virtual NIC driver
- Verified that vNIC is set for highest possible speed with auto negotiation turned off (output below)
- Changed mount point parameters to disable "atime" (/dev/vg01/oracle        /u10          ext3    _netdev,defaults,nodev,noatime,nodiratime        0 0)
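For reference, the scheduler file named in the first item lists every available scheduler and marks the active one with square brackets. A minimal sketch for reading it, assuming the LUN shows up as sdb as in the path above; `active_scheduler` is a hypothetical helper name, not a system tool:

```shell
# Print the active I/O scheduler from a sysfs scheduler line.
# The kernel marks the active entry with square brackets, e.g.
#   "noop anticipatory deadline [cfq]"
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage on the VM (assumes the iSCSI LUN is sdb):
#   active_scheduler "$(cat /sys/block/sdb/queue/scheduler)"
# Switching at runtime (as root; does not persist across reboots):
#   echo deadline > /sys/block/sdb/queue/scheduler
```

Checking the bracketed entry after each change confirms that the scheduler actually switched before a new round of tests is run.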

>> ethtool eth1
Settings for eth1:
        Supported ports: [ TP ]
        Supported link modes:   1000baseT/Full
                               10000baseT/Full
        Supports auto-negotiation: No
        Advertised link modes:  Not reported
        Advertised auto-negotiation: No
        Speed: 10000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: uag
        Wake-on: d
        Link detected: yes

What is required to keep the data transfer speed totally consistent -- at least running around the 300MB/s mark?

Running the Bang up NetApp ISCSI time #1 on /u10 at 512MB
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 0.968507 seconds, 554 MB/s

Running the Bang up NetApp ISCSI time #2 on /u10 at 512MB
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 9.5014 seconds, 56.5 MB/s

Running the Bang up NetApp ISCSI time #3 on /u10 at 512MB
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 8.20284 seconds, 65.4 MB/s

Running the Bang up NetApp ISCSI time #4 on /u10 at 512MB
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 7.83803 seconds, 68.5 MB/s

Running the Bang up NetApp ISCSI time #5 on /u10 at 512MB
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 8.80186 seconds, 61.0 MB/s
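The MB/s figures above are simply bytes divided by elapsed seconds, with MB taken as 10^6 bytes. A small sketch of that arithmetic, using the byte count and timings from runs #1 and #2; `dd_rate` is a hypothetical helper name:

```shell
# dd_rate BYTES SECONDS -> prints throughput in MB/s (MB = 10^6 bytes, as dd reports)
dd_rate() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

dd_rate 536870912 0.968507   # run #1 -> 554.3
dd_rate 536870912 9.5014     # run #2 -> 56.5
```

Put side by side, the second run is roughly a tenfold drop from the first, and none of runs #2 to #5 comes near the 300MB/s target.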


Asked by: Michael Worsham
 
noci (Software Engineer) commented:
Not sure, but are you certain the scratch file is removed between sessions? Deleting the old file (the one being overwritten) is also part of the operation, below the surface.
Also, after the first attempt the caches in the SAN might fill up (how big are they?), bringing you down to raw disk read and write speeds.
Another factor can be caching on your own system, making the first I/O fast (a write to local RAM) while the later ones are still being written out.

The slowdown is probably a combination of the above.
To get more verifiable results:
Delete the scratch / test files.
Run (sync; sync; sync) before a measurement to flush pending writes out of the cache
Start the timer
Run the test
(sync; sync; sync)   # to also flush all buffers
Stop the timer

Then calculate the speed from those times.
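The steps above can be sketched as one shell function. This is only an illustration under stated assumptions: `timed_write` is a hypothetical name, and it relies on GNU `date` (for `%N`) and GNU `stat`, both of which ship with RHEL.

```shell
# timed_write FILE BLOCKSIZE COUNT
# Hypothetical helper: time a dd write with the final flush included.
timed_write() {
    local file=$1 bs=$2 count=$3
    local start end bytes
    rm -f "$file"                 # delete the scratch file first
    sync                          # flush pending writes before timing
    start=$(date +%s.%N)          # start timer (seconds.nanoseconds)
    dd if=/dev/zero of="$file" bs="$bs" count="$count" 2>/dev/null
    sync                          # flush buffers inside the timed window
    end=$(date +%s.%N)            # stop timer
    bytes=$(stat -c %s "$file")   # size actually written, in bytes
    rm -f "$file"
    awk -v b="$bytes" -v s="$start" -v e="$end" 'BEGIN {
        d = e - s
        if (d <= 0) d = 0.000001  # guard against a zero interval
        printf "%d bytes in %.3f s = %.1f MB/s\n", b, d, b / d / 1000000
    }'
}
```

For the case in this thread it would be called as `timed_write /u10/test.512M 1M 512`. It only accounts for caching on the VM itself; caches on the SAN side are outside its control.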
 
Paul Solovyovsky commented:
Are you connecting LUNs from the NetApp inside the VMs? If so, have you installed the host tools on the VM? I've done it on Windows systems but not on Linux; it should be worth looking into.

 
Michael Worsham (Infrastructure / Solutions Architect, author) commented:
This is a copy of the script I am using to test read-write throughput on the iSCSI LUN mounted on the RHEL VM.


#!/bin/bash
#
# testiscsi.sh - Test iSCSI read-write throughput on RHEL
# Must run as root (writes to /proc/sys/vm/drop_caches).
#

FILENAME=/u10/test.512M
BLOCKSIZE=1M
BLOCKCOUNT=512
COUNT=3

x=1
while [ "$x" -le "$COUNT" ]
do
 echo "--- Write #$x ---"
 # conv=fsync makes dd flush the file to the LUN before it reports its timing
 dd if=/dev/zero of="$FILENAME" bs="$BLOCKSIZE" count="$BLOCKCOUNT" conv=fsync

 # Flush dirty pages first, then free pagecache, dentries and inodes
 # so the read below is not served from the VM's own cache
 sync
 echo 3 > /proc/sys/vm/drop_caches

 echo "--- Read #$x ---"
 # Read the file back and discard the data
 dd if="$FILENAME" of=/dev/null bs="$BLOCKSIZE" count="$BLOCKCOUNT"

 rm -f "$FILENAME"
 x=$(( x + 1 ))
done

 
noci (Software Engineer) commented:
Well, you can influence local caches, but what about the caches on the iSCSI server?
 
Michael Worsham (Infrastructure / Solutions Architect, author) commented:
It seems to be pointing to the SAN cache. The primary SAN engineer has left the company and didn't give others a way to follow up, so we are screwed for the moment.