tolerance to latency/bandwidth constraints of JSP-based applications doing file read/writes to shares

We have a JSP web app (running on Tomcat on Windows) that needs to do file reads/writes.  We want to distribute the web servers but keep a central file server, so we know we're going to introduce latency as the app reads and writes to the shared folder (\\server\folder-style sharing via SMB).

Is there some sort of document or any good article on the latency ceiling for this?  I expect it's not a hard limit; at some point the latency/bandwidth just makes the process take longer, so more web server resources get consumed by the user requests that trigger the reads/writes (mainly memory for the threads and code while they run).  What I'm trying to find is anything about failures.  For instance, if reading a 50 KB file involves calls with 1000 ms or 2000 ms of latency, what is the effect of that?  Is there a point (say 3000 ms of latency) at which the read/write will start generating errors (in other words, timeouts and the like)?
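One way to make the failure question concrete on the application side (a sketch, not something from the thread): rather than letting a slow share hold a Tomcat request thread until the SMB client itself gives up, the app can bound each read with its own timeout. The class name and the timeout value below are illustrative assumptions:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class BoundedShareRead {
    // Read a file, but give up after timeoutMs rather than holding the
    // request thread until the SMB client's own timeout fires.
    static byte[] readWithTimeout(Path path, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<byte[]> f = pool.submit(() -> Files.readAllBytes(path));
            // Throws TimeoutException if the share is too slow.
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow(); // interrupt the read if it is still running
        }
    }
}
```

With this pattern, "what happens at 3000 ms" becomes a policy the application chooses (fail the request, retry, fall back) rather than something dictated by the SMB client's defaults.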
Gene Klamerus, Technical Architect, asked:



Travis Martinez, Storage Engineer, commented:
I believe the SMB timeout defaults to 60 seconds before the client declares the transaction dead, and this is configurable on the client side.  I don't know that you'd want to set it to anything other than the default, and in your examples above, latency in the seconds range is terrible for access to an SMB share.

What I think you're really asking is what the capabilities are of the storage subsystem serving up the file share.  That will differ depending on the array being used, but some basics can be derived if you know the disk configuration and the type of disks.

Let's assume 10K RPM drives, which industry rules of thumb rate at roughly 150 IOPS per disk.  Block size is important too, but that comes later.  Here's an example that may help.

Say I have a disk subsystem with an SMB share on a RAID 5 set of 10 disks.  The total number of back-end IOPS it can sustain, within tolerances, is 10 × 150 = 1,500.

With that said, you now need to know the read/write ratio for your system.  Let's say it's going to be 80/20.  The equation is:

FE IOPS = BE IOPS / (Read% + (Write% × RAID penalty)) = 1,500 / (0.8 + (0.2 × 5)) ≈ 833

Therefore, for a 10-disk RAID 5 system with 10K drives, you have about 833 front-end IOPS available while staying within disk tolerances and acceptable latency.
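The arithmetic above can be expressed as a small helper (a minimal illustration; the method name is hypothetical, and the RAID write penalty of 5 is simply the figure used in the worked example above):

```java
public class FrontEndIops {
    // FE IOPS = BE IOPS / (read% + write% * RAID write penalty),
    // where write% = 1 - read%.
    static double frontEndIops(double backEndIops, double readRatio, double raidPenalty) {
        return backEndIops / (readRatio + (1.0 - readRatio) * raidPenalty);
    }
}
```

Plugging in the numbers from the example, `frontEndIops(1500, 0.8, 5)` comes out to roughly 833, matching the calculation above.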

There are additional factors such as bandwidth and block size (Windows reads in 64 KB blocks and writes in 128 KB blocks), but I don't think bandwidth will be an issue here.

Hope this helps.

Gene Klamerus, Technical Architect (Author), commented:
Actually, I'm not asking any of that.

I'm going to be deploying a web application (JSP on Tomcat) in an IaaS Azure environment.  The application will need to store perhaps 100,000 files daily and return up to 25,000 of them daily.  The files themselves are 50 KB to 100 KB as a rule.

My need is to deploy my application with resilience to physical site disasters.  My company's cloud delivery team is asking me to provide my tolerances so they can procure the underlying capabilities I need.

There are other parameters I'll be providing, but right now I'm trying to establish the latency ceilings that will still let me hit the service levels I have to deliver to my users.

I'm currently storing or returning documents in about 0.8 to 1.5 s within the company data center.  I'm willing to tolerate 2 to 3 s in Azure.  My testing with a simple proof of concept in Azure is actually faster than that, but I don't have a distributed architecture in Azure yet.  My users won't tolerate 5 s; that's my ceiling.

So I have to break my SLA down into time spent reaching Azure from the data center, time spent within my web app, and time spent on reads/writes.
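That decomposition can be sketched as a per-request budget that sums the stage timings and checks them against the ceiling.  The class, stage names, and millisecond figures below are illustrative assumptions, not measured values:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LatencyBudget {
    // Stage name -> elapsed milliseconds, in insertion order.
    private final Map<String, Long> stageMs = new LinkedHashMap<>();

    // Record how long one stage of a request took.
    void record(String stage, long elapsedMs) {
        stageMs.put(stage, elapsedMs);
    }

    // Total time attributed to all recorded stages.
    long totalMs() {
        return stageMs.values().stream().mapToLong(Long::longValue).sum();
    }

    // True if the summed stage times stay under the user-facing ceiling.
    boolean withinCeiling(long ceilingMs) {
        return totalMs() <= ceilingMs;
    }
}
```

For example, recording 300 ms to reach Azure, 1200 ms in the web app, and 1500 ms on file I/O gives a 3000 ms total, which sits inside a 5000 ms ceiling but would already breach a 2000 ms one.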