Hello all. I'm sorry if this is a long one, but hopefully it'll be worth the read... the question is at the end.
I work as a networking consultant for a University and I had a requirement from some academics to provide an isolated lab of 25 machines in which students could install operating systems, save them to networked storage at the end of a session, then restore them the next week and carry on working.
Operating systems in use include Windows XP (with Office and other software), RHEL and SuSE. The brief was that the imaging process must not take longer than 10 minutes to save or restore (with all machines imaging concurrently), and that it must support NTFS, FAT16/32, Ext2, Ext3, ReiserFS v4 and XFS (that rules out Ghost/DriveImage), on a purely used-block-only basis. No imaging software may reside on the lab machines, so network booting would have to be used to initiate the process, and it must be as slick as possible, with the minimum necessary user interaction (i.e. enter username and password, imaging takes place, machine reboots).
The current hardware and software configuration is as follows:
HP ProLiant ML370, Dual P4 Xeon 3.4GHz, 5GB RAM, SmartArray 642 RAID Controller (with additional cache memory module). Connected to...
Ultra 320 SCSI HP MSA500 SAN loaded with 14x300GB Ultra 320 15k SCSI discs in RAID10 config with hotspares.
Server has dual Intel Pro/1000 fiber NICs set up in an MLT and linked to a Nortel 5510 switch at 2000Mbps full duplex.
All PCs are HP DL380 P4 3.2GHz 1024MB towers with Broadcom copper gigabit NICs all running at 1000Mbps full duplex to the Nortel 5510.
Some crap PC in the corner running SuSE providing DHCP/PXE and TFTP services.
The server runs Novell NetWare 6.5 SP5, with the SYS pool/volume on the local drive and the IMAGES pool/volume on the SAN. NetWare was chosen as the OS because Novell's NSS filesystem performance absolutely slaughtered any Microsoft/Linux offering for brute strength of concurrent I/O during benchmark tests. The server is set up with Native File Access loaded, so it can be mounted via NCP (IP or IPX), SMB, NFS and AFP.
Client machines boot to PXE and are presented with a SAVE or RESTORE type menu c/o PXElinux.
They make their choice and a 40MB bespoke Linux environment is booted into a ramdrive.
Linux Live scripts were used to init all hardware, set everything to DMA, and all NICs come up at 1000Mbps full duplex with IPs.
The server is automounted (currently via Samba, but NCP and NFS speeds were identical).
A shell script is executed prior to shell spawn (no ctrl-c allowed, students are naughty) and they are prompted for their username and password.
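For illustration, here is a minimal sketch of how a login step like the one above might look in POSIX shell. The function names (lock_console, prompt_credentials) and the variable names are made up for this example; it is not the actual script.

```sh
#!/bin/sh
# Hypothetical sketch of the pre-shell login prompt; all names are illustrative.

lock_console() {
    # Ignore Ctrl-C, Ctrl-\ and Ctrl-Z so the script cannot be interrupted.
    trap '' INT QUIT TSTP
}

prompt_credentials() {
    # Sets the global variables $username and $password from stdin.
    # Prompts go to stderr so stdout stays clean.
    printf 'Username: ' >&2
    IFS= read -r username
    # Only switch terminal echo off when stdin really is a terminal.
    if [ -t 0 ]; then stty -echo; fi
    printf 'Password: ' >&2
    IFS= read -r password
    if [ -t 0 ]; then stty echo; fi
    printf '\n' >&2
}
```

In a real boot environment, lock_console and prompt_credentials would be called before anything else, and no interactive shell would be spawned until imaging has finished.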
The shell script dd's the MBR to $username-mbr, then pipes the output of sfdisk to a loop which ascertains the number of partitions and images them one by one using GNU PartImage, gzipping on the fly, to the server in the format $username-hdaX.000 etc.
Once everything is done, the drive is wiped by dd'ing 512 bytes of nothing over the MBR, then the machine reboots.
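The save steps above can be sketched roughly as follows. This is a sketch under assumptions, not the real script: the helper names (run, list_partitions, save_disk) are hypothetical, the parser assumes the classic `sfdisk -d` dump layout (`/dev/hda1 : start= 63, size= 1028097, Id= 83`), the "512 bytes of nothing" is assumed to come from /dev/zero, and the reboot is left out. Extended-partition containers would need skipping in practice and are not handled here. As far as I know, partimage appends the .000 volume suffix to the image name itself.

```sh
#!/bin/sh
# Hypothetical sketch of the SAVE path. Set DRY_RUN=1 to print the commands
# instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

list_partitions() {
    # Reads `sfdisk -d` style output on stdin and prints the device name of
    # every partition slot with a non-zero size.
    awk '/^\/dev\// {
        size = $0
        sub(/.*size= */, "", size)
        sub(/,.*/, "", size)
        if (size + 0 > 0) print $1
    }'
}

save_disk() {
    # Usage: save_disk USER DISK DEST PARTITION...
    user=$1 disk=$2 dest=$3
    shift 3
    # Keep a copy of the MBR (first 512 bytes) on the server.
    run dd if="$disk" of="$dest/$user-mbr" bs=512 count=1
    for part in "$@"; do
        # -z1 gzip, -b batch mode, -d no description prompt,
        # -o overwrite existing image, -f3 quit when finished.
        run partimage -z1 -b -d -o -f3 save "$part" "$dest/$user-${part##*/}"
    done
    # Blank the MBR (assumed: zeros) so the next student starts clean.
    run dd if=/dev/zero of="$disk" bs=512 count=1
}
```

A call might look like `save_disk "$username" /dev/hda /mnt/images $(sfdisk -d /dev/hda | list_partitions)`, where /dev/hda and /mnt/images are placeholders for the local disk and the mounted server volume.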
Similar for restore. Speed both up and down is around 800MB/min per machine, with all 25 running at the same time. (RAID10 was necessary to maintain satisfactory write speeds; it was garbage on RAID5.)
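To put those figures in context, here is a back-of-envelope calculation of the aggregate load and of the largest image that fits the 10-minute brief. The function names are hypothetical, and the 40-machine case is only an example size, since the number of extra machines isn't stated. Whether 800MB/min is measured before or after gzip isn't stated either, so the real load on the server could be lower than these numbers suggest.

```sh
#!/bin/sh
# Back-of-envelope throughput figures; integer arithmetic, rounded down.

aggregate_mb_per_sec() {
    # $1 = MB/min per machine, $2 = number of machines
    echo $(( $1 * $2 / 60 ))
}

max_image_mb() {
    # $1 = MB/min per machine, $2 = imaging window in minutes
    echo $(( $1 * $2 ))
}

aggregate_mb_per_sec 800 25   # current lab: prints 333
max_image_mb 800 10           # largest image per machine in 10 minutes: prints 8000
aggregate_mb_per_sec 800 40   # example 40-machine lab: prints 533
```

For scale, a 2000Mbps trunk is 250MB/s raw and an Ultra 320 SCSI bus peaks at 320MB/s.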
This all works rather well indeed, and everyone's my friend. However, I've now been tasked with further improving the performance, as they plan to add extra machines to the lab, and I really don't know how much faster I can make it, or how.
Obviously the problem is mainly one of I/O on the storage.
I've thought of using hardware SAN mirroring, multiple servers, another switch, and load balancing based on MAC address.
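As a rough idea of what MAC-based load balancing could look like on the client side, the sketch below picks a server from the last octet of the client's MAC address. It is only one possible scheme: the function name server_for_mac and the host names images1, images2, ... are invented for the example.

```sh
#!/bin/sh
# Hypothetical sketch: choose an image server from the client's MAC address.

server_for_mac() {
    # $1 = MAC address (aa:bb:cc:dd:ee:ff), $2 = number of servers.
    # Takes the last octet as a hex number, modulo the server count.
    last=${1##*:}
    echo "images$(( 0x$last % $2 + 1 ))"
}
```

With two servers, `server_for_mac 00:11:22:33:44:0a 2` gives images1 and `server_for_mac 00:11:22:33:44:1F 2` gives images2. The boot script would then mount whichever server comes back, so each machine always saves to and restores from the same place.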
I've also thought about moving to Serial Attached SCSI instead of parallel for improved performance.
10Gbps networking is currently prohibitively expensive, and since the networking isn't the bottleneck probably wouldn't help much anyway.
So... storage experts, give me your best shot! How can I create a notable performance increase? Perhaps not even a performance increase, more a capacity increase as the current speeds are fine, but the system would need to maintain them with more clients connected.
I can spend a decent amount of money, but not too many thousands (that's £ not $ btw, so multiply the figure in your head x2).