Optimising NFS for network homes

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

sameer_dubey

How are the servers connected to the workstations: GbE or less? You might want to check if network itself is the bottleneck.

You could also try the following:

1. Use NFS over UDP only

This link might be useful: http://nfs.sourceforge.net/nfs-howto/ar01s05.html

ASKER

All servers are connected via gigabit to gigabit switches. Most clients connect to gigabit switches also.

CPU usage on servers is 10% max
Network usage is 20MB/s max
Disk activity is 24MB/s max

Typically the network is running 20-40% of available bandwidth

James

You want to look into jumbo frames and also expand your nfs read/write data size. There's lots of guides to NFS performance tuning on the 'net, like this one that doesn't look too out of date.

Also realize that you're never going to be able to do better than disk speed, so make sure that's not a bottleneck.

enable jumbo frames, do this

ifconfig eth0 mtu 9000
Here are the mount options that I use when automounting home directories from our filer

rw,intr,soft,nfsvers=3.tcp,nolock,noatime,rsize=32768,wsize=32768

James

Turning Off Autonegotiation of NICs and Hubs
Sometimes network cards will auto-negotiate badly with hubs and switches and this can have strange effects. Moreover, hubs may lose packets if they have different ports running at different speeds. Try playing around with the network speed and duplex settings

The NFS protocol uses fragmented UDP packets. The kernel has a limit of how many fragments of incomplete packets it can buffer before it starts throwing away packets. With 2.2 kernels that support the /proc filesystem, you can specify how many by editing the files /proc/sys/net/ipv4/ipfrag_high_thresh and /proc/sys/net/ipv4/ipfrag_low_thresh.

Once the number of unprocessed, fragmented packets reaches the number specified by ipfrag_high_thresh (in bytes), the kernel will simply start throwing away fragmented packets until the number of incomplete packets reaches the number specified by ipfrag_low_thresh. (With 2.2 kernels, the default is usually 256K). This will look like packet loss, and if the high threshold is reached your server performance drops a lot.

One way to monitor this is to look at the field IP: ReasmFails in the file /proc/net/snmp; if it goes up too quickly during heavy file activity, you may have problem. Good alternative values for ipfrag_high_thresh and ipfrag_low_thresh have not been reported.

ASKER

Hi, did you mean to link to an NFS guide in your last post?

I'll try jumbo frames.

where do you make these mount options? I've only used the GUI front end to NFS?

Also we are set to use NFS over UDP and TCP - i wondered whether to change this to TCP only as the majority of our network is one large class B network

@sameer dubey -i'll check that link out thanks.

The performance issues you are getting, can you be a little more specific, are they client side or server side ?

ASKER

clients are experiencing slow logins (after the authentication stage), slow document saving and opening, slowdown when browsing folders in their area and general slugishness.

is this confined to a particular time or physical location. Any specific server which can be isolated ?

ASKER

No it is generally site-wide obviously the few clients on a 100Mb network will be slower, but it has been reported in most areas of the site which are all fed by different fibre runs and different switch rooms. Seems to be across all servers and occurs during peak times - 9am - 3.30 pm. So it would seem to be load related.

NFS server threads was set at 20 by default on all file servers. Doubling this to 40 did improve somewhat but not enough.

Are you forced to use NFS? It is abysmally slow under the best of circumstances. I was using NFS 20 years ago, and I think other protocols may have outpaced it just a little :)

However...

You probably looked at this thread http://hints.macworld.com/article.php?story=20030504042711312 but the last comment is interesting: set the export on the server to "async" mode.

> We have set the threads for NFS on the servers to 40 and the clients to use 32.

is there any reason you have the clients set to 32 threads ? having the servers with a high number is useful and will aid performance but 32 threads on the client side seems a little high. If you have a lot of clients that will potentially be a LOT of connections.

ASKER

@ et01267

We have to use NFS, AFP is useless under 10.6 server - it crashes all the time and destroys file permissions. Under 10.5 server it ramps all cpu cores upto 100% and runs like a dog. Under 10.4 Server the access control list implementation is seriously flawed. SMB is not an option as it cannot automount for users home folders, this leaves us with NFS as the only solution.

@ woolnoir

A consultant advised that the client side setting may need changing, we are planning on going to TCP only and 99 server threads on the server.

SOLUTION

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

robocat

Are you sure the problem is situated in NFS and not in the disks being unable to cope ?

How many disks are in the raid configuration and what kind of disks (SAS, SATA ?). Software RAID or hardware RAID ? How many IOPS are you getting on the disk system ?

ASKER

@ et01267

I can't see an option for async in the OS X gui - where should I be looking please?

@ robocat

The problem is definitely situated in NFS as (when) afp was working correctly (on the same hardware and setup) it ran smoothly. The issues were not speed or performance related.

Our 8 file servers use direct attached storage - compromising of non-raided sata disks.
four of the servers have two disks in dedicated to sharing - one sharepoint each, and the other two have single disks dedicated to sharing. The last server has a 16x1TB raid 5 array made up of 15 1 TB sata disks plus 1 hot spare. This is attached via two 4gbps fibre links to the server (active-passive failover to two separate controllers) Even this server experiences the sluggish performance when running network homes.

How can i measure iops please?

It may be in the /etc/nfs.conf file

Check the man page or this:
http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man5/nfs.conf.5.html

You should look for nfs.client.allow_async and set it 1 -- there may not be a line in the file, in which case you'll need to add one.

YOu'll need to retart the NFS server, and you may want to check how your clients are mounting; the client also has to specify async

roylong

If pretty much all your clients are accessing your servers via 1Gbps and your servers are connected over NFS via 1Gbps too, surely you've got some kind of contention over those links? Do you have any large amount of graphics work going on via those links?

ASKER

@ et01267

Im unsure on enabling that async as it only mentions that writes speed up - the main delays are logging in and opening apps - mainly read operations? Plus the fact data integrity is at risk. It may be worth a tral however.

@ roylong

I would imagine so, however we restrict music and film production to working locally. Music production students save locally, and upload their smaller edited files at the end of a lesson. Film production students use an external drive as a scratch disk and also work locally. When they have a finished project, again they just upload this to their area - these students store their work on the server with the fibre-attached raid storage.

Some students produce graphics work, running from network homes but these files are usually small as they are mostly for print, and we limit the filesizes which the printers can accept.

Also caches, and microsoft user data is re-directed locally to the /tmp folder on the workstations.

Yeah, if I recall, the async is what improved the NFS mount time significantly. Try it and see if it helps.

The data integrity issues are somewhat overstated, I think.

ASKER

I've found the NFS setting on the server under /etc/nfs.conf and have successfully been making changes there.

On the clients however, there seems to be no corresponding conf file - how do i make conf changes on the clients please?

How are configuring the NFS mounts on the clients? I know there are some shareware utilities for MacOS that can do this -- for example I've used NFS Manager

The client settings exist the same /etc/nfs.conf file according to the man page.

The settings for async are nfs.client.allow_async and nfs.server.async

Then you need to set up your mounts so that they specify async.

There are a zillion tuning parameters -- maybe the NFS Manage software has the right mojo.

Oh, one other thing: your clients should probably be mounting with the "soft" option. This also greatly speeds mount times, if I recall.

You can see the NFS Manager server options for async setting here:

Screen-shot-2010-11-09-at-12.10..png

roylong

NFS Manager is a very good application for managing NFS on macintosh - I have used it for many years.

ASKER

Thanks i'll take a look at that tomorrow - how does it work - surely I dont have to set that up on each client - we have nearly 1000 macs. Or is that just a GUI to the /etc/nfs.conf on the servers?

That screenshot seems to relate to the nfs.conf man page i've been looking at.

There is no nfs.conf on the 10.6 clients we're using - I've been looking today and just cannot find it. Perhaps i have to create it?

Well, that's why I asked "how do your mac clients get configured now?"

As far as I know, NFS doesn't have any sort of discovery feature so the Mac automount stuff needs to be set up somehow. I'm not really a XServe savant, so perhaps there is some management utility that can do it.

The NFS Manager help (which is *excellent*) says this:

"NFS Manager can be used to access the Open Directory data of a remote computer over the network. This makes it possible to configure the automount entries of a Mac OS X computer remotely."

ASKER

Ok, heres what happens now:

on fileservers, create shares and assign acls and posix permissions.
set which protocols can be used to access the share, in my case NFS, map root to nobody and set security.
bind fileservers into the directory (open directory master) - basically a pretty openldap server.
enable automount on the shares which contain users home folders - choose which directory to automount them to, and select which protocol, authenticate with directory administrator credentials - thats it no more automount options.

then what you need to know are the default nfs options on the mac clients. hmm. If they are not ideal, then you need to change them (which you can do remotely via NFS Manager)

Also, I found this tidbit here

Unreliable performance, slow data transfer, and/or high load when using NFS and gigabit
This is a result of the default packetsize used by NFS, which causes significant fragmentation on gigabit networks. You can modify this behavior by the rsize and wsize mount parameters. Using rsize=32768,wsize=32768 should suffice. Please note that this problem does not occur on 100Mb networks, due to the lower packet transfer speed.
Default value for NFS4 is 32768. Maximum is 65536. Increase from default in increments of 1024 until maximum transfer rate is achieved.

Oh, also check the nfs man page here

which says somewhere down in the middle:

nfs.conf(5) can be used to configure some NFS client options. In particular, nfs.client.mount.options
can be used to specify default mount options. This can be useful in situations where it is not easy to
configure the command-line options. Some NFS client options in nfs.conf(5) correspond to kernel con-figuration configuration
figuration values which will get set by mount_nfs when performing a mount. To update these values
without performing a mount, use the command: mount_nfs configupdate.

ASKER

I imagine the client nfs configuration is all of the default values from the nfs.conf man page perhaps?

I have found some entries relating to nfs logging in kernel.log on my servers, here is some output from it (attached)

Main errors are nfs send error 32, 35 and 4

Any ideas on what these error codes mean ?

Nov  9 10:33:14 studentdata3-4 kernel[0]: nfsd nfsd send ersnrfsed senor ndnnd3 ef es5rfrod
Nov  9 10:33:14 studentdata3-4 kernel[0]: rr senr od3 er5rs
Nov  9 10:33:14 studentdata3-4 kernel[0]: or r d35
Nov  9 10:33:14 studentdata3-4 kernel[0]: 3 5s
Nov  9 10:33:14 studentdata3-4 kernel[0]: end error 35
Nov  9 10:33:14 studentdata3-4 kernel[0]: nfsd send error 35
Nov  9 10:33:14 studentdata3-4 kernel[0]: nfsd send error 32
Nov  9 10:33:14 studentdata3-4 kernel[0]: nfsdnf sdse nsd eernrodr  error 3232
Nov  9 10:33:14 studentdata3-4 kernel[0]: nfsd snfesd sndend ernrofr  sd32
Nov  9 10:33:14 studentdata3-4 kernel[0]: er rosr end 32e
Nov  9 10:33:14 studentdata3-4 kernel[0]: rror 32
Nov  9 10:51:09 studentdata3-4 kernel[0]: systemShutdown true
Nov  9 10:51:13: --- last message repeated 1 time ---
Nov  9 10:51:13 studentdata3-4 kernel[0]: Kext loading now disabled.
Nov  9 10:51:13 studentdata3-4 kernel[0]: Kext unloading now disabled.
Nov  9 10:51:13 studentdata3-4 kernel[0]: Kext autounloading now disabled.
Nov  9 10:51:13 studentdata3-4 kernel[0]: Kernel requests now disabled.
Nov  9 10:51:14 studentdata3-4 kernel[0]: nfsd send error 4

Open in new window

No ideas on the error codes. However, googling reveals that Mac NFS admins have seen this sort of log output, particularly under heavy load.

One of the things mentioned as a possible culprit for heavy loading was storing cache files (particularly browser cache files) on a remote filesystem vs local storage.

Not sure the best way around this, but possibly:

Get users to turn off cache (good luck)
Get users to relocate cache to a local directory, like /tmp (but the setting for this for Firefox on the Mac is in the user.js, which doesn't exist until you create it in the bowels of the preferences; not sure about other browsers)
Don't mount the Home directory, but some other (sub-)directory, like Documents.

But perhaps tweaking the NFS config parameters may help.

ASKER

We redirect the library/caches folder and microsoft user data folders locally so this should reduce most of the caching from the servers.

nfs client conf file has had async set and threads set to 16
server nfs conf file has threads set to max of 131 (i cant allocate any more) and async enabled

Unfortunately we have to mount the entire home directory as our users are very mobile (its a college so users log into computers upto 7 times per day in different rooms).

With regard to packet size, we still do have some users on 100 mbit connections or wireless - about 25% i would estimate. Would setting the packet sizes cause a detrimental effect to these users?

Also where do i set the rsize and wsize parameter? There is no entry in the nfs.conf man page for those commands?

Have you looked to see whether the mounts are happening over UDP or TCP? TCP is preferred, and UDP may generate lots of fragments which will cause havoc (particularly with large frame sizes). The article referenced below discusses diagnostics.

You have probably read this article (which is all good), but this section has some interesting bits on autonegotiation. There are some other suggestions for things to look for in the nfsstat and netstat output. A lot may depend on whether you are getting a lot of collisions.

I'm not sure, but I think the rsize and wsize are set in the mount arguments on the client; these are the starting place for negotiation between client and server. Some sources say to leave these alone, others say to make them big. I'm unsure whether you can set these on the server.

From the mount_nfs man page:

rsize=<readsize>
Set the read data size to the specified value. The default is 8192 for UDP mounts and 32768
for TCP mounts. It should normally be a power of 2 greater than or equal to 1024. Values
greater than 4096 should be multiples of 4096. It may need to be lowered for UDP mounts when
the ``fragments dropped due to timeout'' value is getting large while actively using a mount
point. (Use netstat(1) with the -s option to see what the ``fragments dropped due to timeout''
value is.)

wsize=<writesize>
Set the write data size to the specified value. Ditto the comments w.r.t. the rsize option,
but using the ``fragments dropped due to timeout'' value on the server instead of the client.
Note that both the rsize and wsize options should only be used as a last ditch effort at
improving performance when mounting servers that do not support TCP mounts.