

Files hosting servers setup

Posted on 2012-12-28
Medium Priority
Last Modified: 2013-05-29
We are planning to start building our file-sharing portal, and I am curious what the experts would suggest for the server setup, mainly disk space.

- For now we decided to go with Linux/Unix OS, MySQL, PHP.

What we approx. expect to have within 2 years:
A. 5,000 members - each member account will have approx. 2 GB of dedicated space
B. Stored files will each be up to 50 MB (zipped files will be split into smaller units). We will not store or share any big files like movies. It will be mostly pictures, graphics, Flash, and small videos.
C. Our pages will mostly contain pictures. Users will be browsing through many pictures (mostly thumbnails). We are not sure about the traffic yet; let's say we will have 20,000 visits a day.
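As a rough sanity check on the numbers in A, here is a minimal Python sketch of the raw space those quotas imply. The function name and the `fill_ratio` parameter are assumptions added for illustration (members rarely fill their whole quota), and decimal units are assumed (1 TB = 1000 GB).

```python
GB_PER_TB = 1000  # decimal units, the way drive vendors quote capacity


def total_storage_tb(members: int, quota_gb: float, fill_ratio: float = 1.0) -> float:
    """Raw space in TB if every member uses `fill_ratio` of the quota.

    `fill_ratio` is a hypothetical knob; 1.0 means every quota is full.
    """
    return members * quota_gb * fill_ratio / GB_PER_TB


# Two-year target from the question: 5,000 members at 2 GB each.
print(total_storage_tb(5000, 2))       # 10.0
# First milestone: 1,000 members.
print(total_storage_tb(1000, 2))       # 2.0
# Same 5,000 members if quotas end up half full on average.
print(total_storage_tb(5000, 2, 0.5))  # 5.0
```

These figures are before any RAID or replication overhead, so the disks actually purchased would need to cover more than this.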

- What would you suggest for the server setup (processor, RAM, drives)? For now we plan to go with RAID 1.
- How is it technically done, I mean storing the files across 3 or 4 servers? Or how should the storage (disk space) be set up?
- We will need something that supports fast file uploads and downloads. Which connection should we go with, 100 Mbit/s or 1 Gbit/s? What bandwidth should we expect? Which data center should we use (mainly for North American users)?

Please think of it from a growth perspective: starting out with one server and trying to reach our first 1,000 members.

Thank you.
Question by:janime
LVL 10

Expert Comment

ID: 38727688
Plan to buy good servers, such as HP Gen8 servers, which support all the common RAID levels.


Configure your network for maximum bandwidth at both the user and the server end; 1 Gbps would be ideal.

RAID 1 is mirroring, which needs a second disk of the same capacity. All the data is mirrored, which gives you redundancy for user data.

RAID 5 would be ideal in your situation: data is striped across the disks and parity is stored alongside it, so the array survives a single disk failure. A minimum of three hard disks is required for RAID 5.
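To make the RAID 1 versus RAID 5 trade-off concrete, here is a small Python sketch of usable capacity for each level. It assumes equal-sized disks and ignores filesystem and controller overhead; the function name is made up for illustration.

```python
def usable_tb(level: int, disks: int, disk_tb: float) -> float:
    """Usable capacity of an array of equal-sized disks (idealised)."""
    if level == 1:
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Every disk holds the same data, so capacity is one disk's worth.
        return disk_tb
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space goes to parity.
        return (disks - 1) * disk_tb
    raise ValueError("only RAID 1 and RAID 5 are covered in this sketch")


print(usable_tb(1, 2, 4))  # 4  -> two 4 TB disks mirrored
print(usable_tb(5, 3, 4))  # 8  -> three 4 TB disks
print(usable_tb(5, 4, 4))  # 12 -> four 4 TB disks
```

With the same 4 TB disks, RAID 5 gives more usable space per disk bought, at the cost of tolerating only one failed disk across the whole array.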
LVL 10

Expert Comment

ID: 38727715

Author Comment

ID: 38732144
Thank you guys, both comments are useful.

Still waiting for other input/suggestions.

What I'd like to hear is exactly what the best setup to start with would be, so that we won't have any problems ADDING SPACE (drives) as we go.
I just don't feel it's a good idea to spend lots of money on a new top-notch server while we are still developing and gaining new members (or maybe it is, if that's a necessity).
So again, I am looking for a good STARTING point.

Based on your posts, for now we have agreed on using a 100 Mbit/1 Gbit port and working with a provider that can gradually increase the bandwidth according to our needs (we are starting off with 10 TB).
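For anyone comparing port speed with a transfer allowance, here is a rough Python sketch of the arithmetic. It assumes the 10 TB figure is a monthly transfer allowance (the post does not say so explicitly), a 30-day month, decimal units, and no protocol overhead; the function names are illustrative only.

```python
SECONDS_PER_DAY = 86400


def monthly_transfer_tb(mbit_per_s: float, days: int = 30) -> float:
    """TB moved in a month if the port runs flat out the whole time."""
    bytes_per_s = mbit_per_s * 1e6 / 8
    return bytes_per_s * SECONDS_PER_DAY * days / 1e12


def avg_mbit_per_s(tb_per_month: float, days: int = 30) -> float:
    """Average rate needed to move `tb_per_month` in one month."""
    bits = tb_per_month * 1e12 * 8
    return bits / (SECONDS_PER_DAY * days) / 1e6


def transfer_seconds(file_mb: float, mbit_per_s: float) -> float:
    """Best-case time to move one file over an otherwise idle port."""
    return file_mb * 8 / mbit_per_s


print(round(monthly_transfer_tb(100), 1))  # a saturated 100 Mbit/s port, TB per month
print(round(avg_mbit_per_s(10), 1))        # average Mbit/s that 10 TB per month implies
print(transfer_seconds(50, 100))           # one 50 MB file at 100 Mbit/s, in seconds
print(transfer_seconds(50, 1000))          # the same file at 1 Gbit/s
```

Under those assumptions, 10 TB a month averages out well below what a 100 Mbit/s port can carry, so the faster port mainly helps with peaks and with per-file upload and download times.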

LVL 28

Accepted Solution

dpearson earned 2000 total points
ID: 39166403
You can buy servers (for quite cheap) that have a lot of drive bays.

For example Pogo (http://www.pogolinux.com/products/storage_servers) has a server (http://www.pogolinux.com/quotes/editsys?sys_id=251594) that can store up to 72TB itself for only about $5K.

So you could buy 2 of these (for redundancy) and only populate the first couple of drives.  As your needs grow, you just add drives and rebuild the RAID on the fly (you can do all of this without taking the server offline - performance will just drop a bit during the rebuild).

As long as you plan to stay smallish this should work fine.

What we found was that scaling a solution like this to support millions of monthly users was very hard.  The issue is that all users (potentially) require access to all files, so you need some sort of network storage solution.  We tried NFS and a few other approaches.  All of them struggle badly to provide good performance (through the network) as you reach high loads.  Obviously, for redundancy you need to write each file to both (or <n>) servers, so each write generates a lot of load.  The network traffic tends to rise faster than linearly with the number of users, so all is great until suddenly you're over the limit and you can't solve it without a full redesign, all while traffic is rising daily.

This is why there are companies like Netapp that sell NAS or SAN solutions that start at $100K and go up from there.  It's a hard problem.

So before you get too deep into this, also do the math to use a third party for this storage.  We eventually went with Amazon's S3 which substantially reduced our storage costs and gave us effectively infinite scalability for free.  There are lots of other cloud storage providers to consider, depending on your needs, but S3 is a solid place to start.  Our users are unaware of this choice - to them they upload a file to our servers (which we then forward to Amazon) and then we use a CDN provider for downloading the files, with Amazon as the origin server.  The end user never sees "Amazon" in the equation.
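A minimal Python sketch of how the "users never see Amazon" part can work: the application picks the storage key for each uploaded file and only ever hands out URLs on its own CDN hostname. The hostname `cdn.example.com`, the key layout, and both function names are hypothetical, and the actual upload to the storage provider is left out.

```python
import hashlib
from urllib.parse import quote

CDN_HOST = "cdn.example.com"  # hypothetical CDN hostname in front of the origin bucket


def object_key(member_id: int, filename: str, content: bytes) -> str:
    """Build a storage key; the content hash keeps same-named files apart."""
    digest = hashlib.sha256(content).hexdigest()
    return f"members/{member_id}/{digest[:16]}/{filename}"


def public_url(key: str) -> str:
    """URL given to end users; it names the CDN, never the storage provider."""
    return f"https://{CDN_HOST}/{quote(key)}"


key = object_key(42, "my photo.jpg", b"example image bytes")
print(key)
print(public_url(key))
```

The server would store the uploaded bytes under `key` at the storage provider and save only the key in MySQL, so switching providers later means changing the CDN origin rather than any URL that users have seen.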

It all scales well and was much cheaper than the build-your-own that we started with.  It also means I can sleep at night because scaling this file I/O layer was proving our biggest headache as we went from a few thousand users to millions.

Hope that helps,


Author Comment

ID: 39172403
Thank you Doug! Finally a very valuable input!
This is something that I was expecting to hear.

I'll wait for one or two more replies, but Doug, you have definitely earned some of the points.



Question has a verified solution.
