Solved

Netapp, RedHat, ReadyNAS and latency oh my!

Posted on 2012-03-20
Last Modified: 2012-04-30
Netapp 3020
ReadyNAS 1000s and 1100
Redhat Enterprise ver 3


The issue is extreme latency in copying data from an NFS netapp share to a NAS via a Redhat machine.

Background: two weeks ago we replaced a Netapp F740 with the 3020.  The config was mirrored over to the 3020, and the only issue we had that weekend was that a Linux web server needed "vers=2" added to the fstab mount lines for the Netapp NFS shares.

Currently the fstab files on all three Redhat machines are identical.  An example:

LABEL=/                 /                       ext3    defaults        1 1
LABEL=/boot             /boot                   ext3    defaults        1 2
none                    /dev/pts                devpts  gid=5,mode=620  0 0
none                    /proc                   proc    defaults        0 0
none                    /dev/shm                tmpfs   defaults        0 0
/dev/sda3               swap                    swap    defaults        0 0
/dev/cdrom              /mnt/cdrom              udf,iso9660 noauto,owner,kudzu,$
netapp:/vol/vol0/custom /custom nfs vers=2,rw,hard,intr,bg 0 0
netapp:/vol/vol0/cnc /cnc nfs vers=2,hard,intr,rw,bg 0 0
netapp:/vol/vol0/pd /pd nfs vers=2,hard,intr,rw,bg 0 0
netapp:/vol/vol0/eweb /eweb nfs vers=2,rw,hard,intr,bg 0 0
netapp:/vol/vol0/web /custom/net/web nfs vers=2,rw,hard,intr,bg 0 0
netapp:/vol/vol0/pd /pd nfs vers=2,rw,hard,intr,bg 0 0
netapp:/vol/vol0/home /usr/people nfs vers=2,rw,hard,intr,bg 0 0
netapp:/vol/vol0/nov1 /novell nfs vers=2,hard,intr,rw,bg 0 0
netapp:/vol/vol0/eng      /eng      nfs    rw,bg,intr
10.2.1.12:/Archive4 /data/archive/archive4 nfs
10.2.2.10:/archive /data/archive/archive6 nfs rw 0 0
10.2.1.12:/RecentData /custom/archive nfs    rw,bg,intr
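
Since the Netapp lines above pin `vers=2` while the ReadyNAS lines set almost nothing, it can help to list which NFS mounts leave a given option to the client's defaults. The following is only a sketch: `nfs_mounts` and `missing_option` are hypothetical helper names, and the sample text is a subset of the fstab lines quoted above.

```python
def nfs_mounts(fstab_text):
    """Return (mount_point, options) pairs for every NFS line in an fstab."""
    mounts = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blanks, comments, and anything that is not an NFS mount.
        if len(fields) < 3 or fields[0].startswith("#") or fields[2] != "nfs":
            continue
        # The options column is missing on some of the lines quoted above.
        options = fields[3].split(",") if len(fields) > 3 else []
        mounts.append((fields[1], options))
    return mounts


def missing_option(mounts, name):
    """List mount points whose options do not set `name` (e.g. 'vers')."""
    return [
        mount_point
        for mount_point, options in mounts
        if not any(opt == name or opt.startswith(name + "=") for opt in options)
    ]


SAMPLE = """\
netapp:/vol/vol0/custom /custom nfs vers=2,rw,hard,intr,bg 0 0
netapp:/vol/vol0/eng      /eng      nfs    rw,bg,intr
10.2.1.12:/Archive4 /data/archive/archive4 nfs
10.2.2.10:/archive /data/archive/archive6 nfs rw 0 0
10.2.1.12:/RecentData /custom/archive nfs    rw,bg,intr
"""

for mount_point in missing_option(nfs_mounts(SAMPLE), "vers"):
    print(mount_point, "has no explicit NFS version")
```

Calling `missing_option(..., "rsize")` or `"wsize"` the same way shows that none of these mounts set a transfer size explicitly, so whatever the client defaults to is what is in use.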


The 10.2.1.12 and 10.2.2.10 addresses are two different ReadyNAS units.  All shares mount successfully; however, a 'cp' from custom/archive to RecentData takes literally 5 times longer than it did with the old Netapp, which on specs alone makes zero sense.

I have tried to find out what version of NFS the ReadyNAS units support but have been unsuccessful so far.  I've also not found out what versions RH Ent 3 supports.  I've thought about looking for an updated driver for the NIC in use on RH, but not being completely adept at Redhat I am very unsure of how to proceed.

CIFS shares work beautifully, copying from Netapp to NAS and vice versa.  It's NFS that's giving trouble.  Browsing, 'ls'ing, mkdir's, and cp'ing single files or small folders are all fine.  The latency becomes an issue in the normal procedure, which deals with multiple gigs of data, anywhere from 2-20 GB at a time.

I have probably not provided enough information here but can anyone help steer me in the right direction?

Thanks!
Question by:Ben Hart
19 Comments
 
LVL 78

Expert Comment

by:arnold
ID: 37745815
Check the network interface configuration: autonegotiate versus fixed.  Check the port to make sure the settings on the network interface match the switch config, i.e. both fixed or both autoneg.  Make sure there are no CRC errors on the switch, which could mean there is a mismatch in the configuration.

netapp to redhat to readynas
does redhat have a single or multiple interfaces?

The redhat is working as a buffer for the data being transferred.
 
LVL 34

Expert Comment

by:Duncan Roe
ID: 37746369
I would run tcpdump and see what is happening. Expect to see retries.
(I once fixed a problem with NFS mounts in VMware that way - the emulated NIC could not handle the 8KB UDP chunks used by NFS and limiting to 1KB (in fstab) restored normal operation)
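
The 8 KB versus 1 KB point can be made concrete with a little arithmetic. The sketch below is an illustration only, not a measurement from any system in this thread: it assumes IPv4 with a 20-byte header and no options, a 1500-byte MTU by default, and it ignores the RPC/NFS headers that travel with each request. `ip_fragments` is a hypothetical helper name.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8   # bytes


def ip_fragments(udp_payload, mtu=1500):
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    total = udp_payload + UDP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(total / per_fragment)


for size in (8192, 1024):
    print(size, "byte payload ->", ip_fragments(size), "fragment(s)")
```

Under these assumptions an 8192-byte payload needs 6 fragments, while a 1024-byte payload fits in a single packet. Over UDP, losing any one fragment means the whole datagram has to be sent again, which is one reason a smaller transfer size can help on a link that drops packets.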
 
LVL 14

Author Comment

by:Ben Hart
ID: 37747148
Thanks guys, first off:
I looked on both the Netapp and the ReadyNAS, and if I want to manually set the speed and duplex my only options are in the 100 Mb range.  It seems that if I want gigabit I am forced to let it auto-negotiate.  The switch, however, is a Cisco 3650g, so I did statically set the port to 1 gig and Full.  The NAS, Netapp and RH boxes are all reporting 1 gig and Full, if it makes any difference.  The switch showed zero CRC errors on that port, and the port error counters on the NAS showed zero on all counts.

tcpdump scrolled too fast for me so I'm going to try piping it to a txt file then start a file copy and see if anything jumps out.
 
LVL 14

Author Comment

by:Ben Hart
ID: 37747222
Eureka!  tcpdump while trying the normal copy process on RH from netapp to nas resulted in a literal ton of fragmented datagrams..

The MTU on all hosts involved is exactly 1500, so now I'm confused.
 
LVL 78

Expert Comment

by:arnold
ID: 37747892
Look at the NFS windowing: rsize, wsize.
Datagrams? I thought you were on NFSv2. Are the fragments coming from the Netapp?
TCP versus UDP.
 
LVL 14

Author Comment

by:Ben Hart
ID: 37747915
09:32:01.909481 rh2.unifiedbrands.net.2097126045 > Archive4.difc.root01.org.nfs: 1416 write [|nfs] (frag 31340:1424@0+)
09:32:01.909485 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@1424+)
09:32:01.909487 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@2848+)
09:32:01.909489 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@4272+)
09:32:01.909490 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@5696+)
09:32:01.909492 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1240@7120)
09:32:01.909518 rh2.unifiedbrands.net.2113903261 > Archive4.difc.root01.org.nfs: 1416 write [|nfs] (frag 31341:1424@0+)
09:32:01.909519 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31341:1424@1424+)
09:32:01.909521 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31341:1424@2848+)
09:32:01.909523 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31341:1424@4272+)
09:32:01.909524 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31341:1424@5696+)
09:32:01.909526 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31341:1240@7120)
09:32:01.909550 rh2.unifiedbrands.net.2130680477 > Archive4.difc.root01.org.nfs: 1416 write [|nfs] (frag 31342:1424@0+)
09:32:01.909552 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31342:1424@1424+)
09:32:01.909554 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31342:1424@2848+)
09:32:01.909555 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31342:1424@4272+)
09:32:01.909557 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31342:1424@5696+)
09:32:01.909559 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31342:1240@7120)
09:32:01.948161 rh2.unifiedbrands.net.2147457693 > Archive4.difc.root01.org.nfs: 1416 write [|nfs] (frag 31343:1424@0+)
09:32:01.948167 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31343:1424@1424+)
09:32:01.948169 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31343:1424@2848+)
09:32:01.948171 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31343:1424@4272+)
09:32:01.948173 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31343:1424@5696+)
09:32:01.948175 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31343:1240@7120)
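
One way to read a dump like this is to group the `frag id:size@offset` entries by IP ID and work out how large each reassembled datagram is. The sketch below does that for the first datagram quoted above. `datagram_sizes` is a hypothetical helper, and the regular expression assumes the older tcpdump output format shown here, where a trailing `+` marks "more fragments".

```python
import re
from collections import defaultdict

# Matches "(frag 31340:1424@0+)": IP ID, fragment size, offset, more-fragments flag.
FRAG = re.compile(r"\(frag (\d+):(\d+)@(\d+)(\+?)\)")


def datagram_sizes(dump_text):
    """Map each IP ID to (fragment count, reassembled IP payload in bytes)."""
    fragments = defaultdict(list)
    for match in FRAG.finditer(dump_text):
        ip_id, size, offset, _more = match.groups()
        fragments[int(ip_id)].append((int(offset), int(size)))
    result = {}
    for ip_id, parts in fragments.items():
        # The fragment with the highest offset marks the end of the datagram.
        last_offset, last_size = max(parts)
        result[ip_id] = (len(parts), last_offset + last_size)
    return result


SAMPLE = """\
09:32:01.909481 rh2.unifiedbrands.net.2097126045 > Archive4.difc.root01.org.nfs: 1416 write [|nfs] (frag 31340:1424@0+)
09:32:01.909485 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@1424+)
09:32:01.909487 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@2848+)
09:32:01.909489 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@4272+)
09:32:01.909490 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1424@5696+)
09:32:01.909492 rh2.unifiedbrands.net > Archive4.difc.root01.org: udp (frag 31340:1240@7120)
"""

for ip_id, (count, size) in datagram_sizes(SAMPLE).items():
    print(f"IP ID {ip_id}: {count} fragments, {size} bytes after reassembly")
```

For this datagram the result is 6 fragments totalling 8360 bytes, which is consistent with an 8 KB NFS write plus UDP and RPC headers, although the dump alone does not prove the mount's wsize. It may also be worth noting that the full fragments carry 1424 bytes each, less than the 1480 bytes a 1500-byte MTU would allow; nothing in this thread establishes why.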


This is what I'm seeing, from the Redhat box to the NAS.  How do I check the rsize and wsize, and on what device?
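
For reference, rsize and wsize are client-side mount options, so the place to look is the machine doing the mounting (the Redhat box here). On Linux the options in effect for each mount are listed in `/proc/mounts`. The sketch below parses text in that format; `nfs_options` is a hypothetical helper and the sample line is made up for illustration, so real output from an old kernel may name some options differently.

```python
def nfs_options(mounts_text):
    """Return {mount_point: {option: value}} for NFS entries in /proc/mounts format."""
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for opt in fields[3].split(","):
            # "rsize=8192" -> ("rsize", "8192"); bare flags like "hard" get "".
            key, _, value = opt.partition("=")
            options[key] = value
        found[fields[1]] = options
    return found


# Hypothetical line in /proc/mounts format; real output will differ.
SAMPLE = "10.2.1.12:/RecentData /custom/archive nfs rw,rsize=8192,wsize=8192,hard,udp 0 0\n"

for mount_point, opts in nfs_options(SAMPLE).items():
    print(mount_point, "rsize:", opts.get("rsize"), "wsize:", opts.get("wsize"))

# On the client itself, the same function can read the live mount table:
#   with open("/proc/mounts") as f:
#       print(nfs_options(f.read()))
```

If the values shown are larger than wanted, they can be lowered by adding `rsize=` and `wsize=` to the options column of the relevant fstab line and remounting.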
 
LVL 14

Author Comment

by:Ben Hart
ID: 37747990
MTU sizes were still all 1500 on rh2, archive4 and the Netapp.  Right now I have them all connected to the same switch.  I had to bounce the ReadyNAS, so I'm waiting for it to come back up and then I'll test again.  It shouldn't make any difference, I know, but...
 
LVL 78

Expert Comment

by:arnold
ID: 37748018
You're using udp, fragments in this case means the fragment of a file rather than the packet was fragmented (part of the header has fragmented set to true.)
http://nfs.sourceforge.net/nfs-howto/ar01s05.html
http://web.mit.edu/rhel-doc/5/RHEL-5-manual/Deployment_Guide-en-US/s1-nfs-client-config-options.html
 
LVL 14

Author Comment

by:Ben Hart
ID: 37748064
Thanks for the link.. should I be concerned about:
tracepath archive4
 1:  rh2 (10.2.1.41)                      asymm 65   0.017ms pmtu 552

 
LVL 14

Author Comment

by:Ben Hart
ID: 37748086
Actually.. I'm getting the 552 pmtu on any host I specify.  Surely that's not normal.
 
LVL 14

Author Comment

by:Ben Hart
ID: 37749474
Ok so I set up the required NFS mounts on a fresh Ubuntu 11.10 install, ran the exact same cp command as earlier and it completed very quickly.  I tried a tcpdump like before as well but didn't even see my Ubuntu host mentioned, possibly because of a config difference, I don't know.  Either way, the plan going forward is to blow away the friggin old RH and replace it with Fedora 16 just to process this archiving sequence.

It's disappointing that the actual issue wasn't discovered, but engineering is pushing hard to get some sort of resolution asap.
 
LVL 78

Expert Comment

by:arnold
ID: 37749508
Centos 5 or 6 is an option as well.
Does the current redhat 3 have a gigE network interface?
 
LVL 34

Expert Comment

by:Duncan Roe
ID: 37749844
The tcpdump output is OK. You can expect UDP fragments - when I had a problem there were extra lines of output indicating unfinished I think (it was a long time ago).
On the new system, reverse host name look up may not have been working for your Ubuntu host but you should have seen its IP address instead. Otherwise, which addresses are you seeing?
MTU of 1500 is standard - everyone uses it.
 
LVL 14

Author Comment

by:Ben Hart
ID: 37749916
I didn't notice the IP of the Ubuntu host either; also there were a lot fewer entries in this dump than from the RH boxes.  But I figured that was because the Ubuntu laptop was new and RH2 has been around forever and would've been in the ARP tables of every switch and cached on a lot of servers.

The RH box had a 100 Mb NIC; once it's rebuilt on Fedora I'll pull that card and let it use the on-board gigabit adapter.
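
To put the 100 Mb NIC in perspective for the 2-20 GB jobs mentioned in the question, here is a rough lower bound on copy time at each link speed. It is only a sketch: it assumes decimal gigabytes, a link running at its full nominal rate, and no protocol overhead or disk limits, so real copies will be slower. `min_transfer_seconds` is a hypothetical helper name.

```python
def min_transfer_seconds(size_bytes, link_bits_per_second):
    """Lower bound on copy time: payload bits divided by raw link rate."""
    return size_bytes * 8 / link_bits_per_second


GB = 10**9  # decimal gigabyte (assumption)
for label, rate in (("100 Mb/s", 100 * 10**6), ("1 Gb/s", 10**9)):
    for size_gb in (2, 20):
        seconds = min_transfer_seconds(size_gb * GB, rate)
        print(f"{size_gb} GB over {label}: at least {seconds / 60:.1f} min")
```

At 100 Mb/s a 20 GB job cannot finish in under about 27 minutes, against under 3 minutes at gigabit, so the NIC alone caps throughput regardless of what the NFS layer is doing.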
 
LVL 78

Expert Comment

by:arnold
ID: 37749924
The connection from the new ubuntu may have been nfsv3 rather than nfsv2 which you said was the option available.

I'd stick with the server thread using RH 5,6  or Centos 5/6 rather than the desktop version of fedora.
 
LVL 14

Author Comment

by:Ben Hart
ID: 37750049
It might have.. I did remove the 'vers=2' from the three fstab lines I added to the Ubuntu machine before mounting them.  RH3 probably doesn't support NFS3?

The dev who set up the crontab for the entire archiving process, which copies data and then modifies an Informix database, will own this new box, so I told him to use whatever OS he felt comfortable with.  If it were me I'd be sticking with Ubuntu, but I'm a noob so..
 
LVL 14

Author Comment

by:Ben Hart
ID: 37806540
Any other opinions?  Should I have been worried about the very small PMTU?

The Redhat boxes did not have gig interfaces because it's RH3; apparently, I was told, ver3 doesn't support gigabit.  But I also discovered that the drives in the NASes are WD Greens, so it seems there's at least a handful of things that are all possibly contributing to the overall slow pace of data transfers.
 
LVL 14

Accepted Solution

by:
Ben Hart earned 0 total points
ID: 37894543
Ok, well, there are no other opinions I take it, so I'm going to answer this by saying that the slowness must be because of the drive interface on the Netgear NAS devices coupled with the 100 Mb interface on that RH box.
 
LVL 14

Author Closing Comment

by:Ben Hart
ID: 37909703
Not the answer I was looking for, but it's all I can get apparently.