Best content replication methods for a load-balanced IIS web farm?


I'm doing some research into how I can turn my relatively primitive setup of separate Windows web servers into a more sophisticated load-balanced server farm. We host hundreds of copies of our own e-commerce software, meaning there are many, many files but few changes on a day-to-day basis.

I've already ticked off some key areas in my research: network load balancing plus some service monitoring to control it, no problem. IIS metabase updates, no problem.

But I am stuck on how to replicate the actual site content. Since all the websites are under client control, any updates (e.g. new images uploaded via their web interface) need to be replicated onto all web servers very quickly.

Solutions like robocopy are not fast enough for me (I have 2.5 million files to replicate!) even though hardly anything changes from one day to the next. I suspect I'd get better results with something like rsync (or a Windows-wrapped version of it), but I'm hesitant to build my scripts around it in case it doesn't scale well to the full 2.5 million files.
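To make the comparison concrete, this is roughly the invocation I'd be benchmarking; the paths, host name and module name are hypothetical, and it assumes a Windows rsync port such as cwRsync on each node, with an rsync daemon running on the targets:

```
# One-way mirror of the master content tree to one web node.
# -a preserves permissions/timestamps, -z compresses on the wire,
# --delete removes files on the target that were removed on the master.
rsync -az --delete /cygdrive/d/webcontent/ web02::webcontent/
```

Since rsync only transfers changed files (and only changed blocks within them), the transfer cost should be proportional to the churn rather than to the 2.5 million files, though each run still has to walk the whole tree to find the changes.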

Another way is to serve the content off a single SAN/NAS. However, I only have 10/100 NICs to play with, and I'm worried the network would become a bottleneck: 100 Mbit/s is only about 12 MB/s of throughput, shared across every site on the box.

I believe there are some commercial solutions out there too (but are they any better than, say, rsync?).

Windows Distributed File System with DFS Replication looked like a good bet to sync my content. However, I fell over with this method because it requires a domain, and no, I don't have a domain in my network yet.

So my questions are these:

* I'd really like to avoid adding a couple of DCs to my environment just to get this working. Can anyone tell me whether it might be worth my while anyway, e.g. is it easy enough to justify?
* Is there any other way to get DFS Replication working without a domain?
* Is there a third-party equivalent to Windows DFS Replication that would work in my environment?
* Any other tips or suggestions to get me back on track?

I'm frustrated that I've got so far in my research only to fall at this hurdle.
Borgs8472 (Author) commented:
To be honest, I don't like the idea of not being able to scale the SAN infinitely, or of the network becoming a point of failure. I also have bad experience with SAN technologies from the managed hosting we're moving away from, a SAN/blade service where our hosts were incapable of providing any kind of consistent quality of service.

In my small company it takes a cheap prototype to get any sort of solution implemented, which is why I'm working on NLB'd web servers (in my own time!) to pitch for permission for a limited roll-out at zero hardware cost. I think the homogeneous nature of our hosting is better suited to this approach (buy them cheap, stack 'em high!).

Searching specifically for rsync-style implementations of two-way replication, I came across a free rsync-like app called Unison, which is explicitly designed for the two-way replication I'm after (whereas rsync is primarily a fancy one-way mirroring tool).

I'm going to prototype this. Apparently I'm in for a long wait on its first replication pass while it builds its metadata archive, but it should be very fast subsequently.
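For anyone else trying this, a minimal two-node Unison profile might look like the sketch below; the host name, port, paths and ignore patterns are all placeholders, and it assumes Unison is installed on both nodes with `unison -socket 52000` listening on web02:

```
# ~/.unison/webfarm.prf -- two-way sync of local content with web02
root = D:/webcontent
root = socket://web02:52000/D:/webcontent
# On a genuine conflict, keep whichever copy is newer
prefer = newer
# Don't replicate temp/log noise
ignore = Name *.tmp
ignore = Name *.log
# Run non-interactively so it can be scheduled
batch = true
```

Run it with `unison webfarm`. After the first pass, Unison only rescans for changes against its archive, so subsequent runs should track the rate of change rather than the total file count.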

I'm thinking the lack of distributed file system alternatives to Microsoft's is the price I pay for using Windows. Any final ideas?
Ted Bouskill (Senior Software Developer) commented:
Well, definitely avoid a brute-force approach like robocopy. rsync is extremely efficient, with very little handshaking, and should scale superbly. I'm working on a high-speed file proxy transfer system and we are implementing rsync as the best-of-breed choice.

Overall, however, I'd recommend a shared SAN drive with Fibre Channel HBA cards. Time is money, and the time spent building or buying a software solution might be better spent on high-speed hardware.
Ted Bouskill (Senior Software Developer) commented:
Microsoft has a nice DFS solution, but it requires a domain, which you are trying to avoid, so I think rsync is your best choice short of some fancy custom code.

For example, I've written custom ASP.NET image controls that grab files from a database or a redundant file share and write them to the HTTP stream on the fly. It was for a shopping site, so the images didn't have to be stored on all the servers, and I could control the caching to minimize web farm TCP/IP traffic.
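I can't share the original code, but a stripped-down sketch of the idea looks something like this; the share path, query parameter and class name are all made up for illustration:

```
// Hypothetical ASP.NET handler: streams an image from one central file
// share so the content never has to live on the individual web nodes.
using System;
using System.IO;
using System.Web;

public class SharedImageHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        // e.g. /image.ashx?name=product42.jpg
        // GetFileName strips any path components a client might inject.
        string name = Path.GetFileName(context.Request.QueryString["name"]);
        string path = Path.Combine(@"\\fileserver\images", name); // assumed share

        context.Response.ContentType = "image/jpeg";
        // Cache publicly for an hour to keep farm-internal traffic down.
        context.Response.Cache.SetCacheability(HttpCacheability.Public);
        context.Response.Cache.SetExpires(DateTime.UtcNow.AddHours(1));
        context.Response.WriteFile(path);
    }
}
```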
Hi, I'm browsing the internetz in search of advice on the same topic. I have 10 web servers that have identical content, but it must be updated by hand every time, which obviously does not scale well.

I have a NetApp 2020 that hosts my NFS volumes. It also has an iSCSI license. I want to somehow point all my web servers at a SAN share for content in read-only mode, and have a single master server update the content there.

I can't seem to find anything that would let me do that with what I have, because Windows follows a "share-nothing" iSCSI model and NFS does not seem trivial on Windows.

Are there any resources on how people build large web farms with regard to content?
Ted Bouskill (Senior Software Developer) commented:
Sorry, no. Most organizations roll their own solution and keep it in house. IIS can use a UNC path for a website or virtual folder, so if you can set up a highly available file share for all servers, you could use that strategy: all the files live in one location and every IIS server accesses the same UNC path.

I think you could do that without a domain controller.
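Since the thread mentions the IIS 6 metabase, here's a rough sketch of pointing a site root at a UNC share with adsutil.vbs; the site ID, share name and account are examples only:

```
rem Point site #1's home directory at a central share (IIS 6).
cd /d %SystemDrive%\Inetpub\AdminScripts
cscript adsutil.vbs SET W3SVC/1/ROOT/Path "\\fileserver\webcontent\site1"

rem Credentials IIS uses when reading the share (no domain required,
rem as long as the same local account/password exists on the file server).
cscript adsutil.vbs SET W3SVC/1/ROOT/UNCUserName "fileserver\iiscontent"
cscript adsutil.vbs SET W3SVC/1/ROOT/UNCPassword "secret"
```

The matching local account trick (identical username and password created on each machine) is the usual way to make pass-through authentication work across standalone servers without a domain.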