Solved

File hosting server setup

Posted on 2012-12-28
Last Modified: 2013-05-29
Hi,
we are planning to start building our file sharing portal, and I am curious to see what the experts would suggest in terms of server setup, mainly disk space.

- For now we have decided to go with a Linux/Unix OS, MySQL, and PHP.

What we expect to have, approximately, within 2 years:
A. 5,000 members, each with approx. 2 GB of dedicated space per account.
B. Stored files will each be up to 50 MB (zipped files will be split into smaller units). We will not store or share any big files like movies. It will be mostly pictures, graphics, Flash, and small videos.
C. Our pages will mostly contain pictures. Users will be browsing through many pictures (mostly thumbnails). As for traffic, we are not sure yet; let's say we will have 20,000 visits a day.
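
As a rough check on the figures above, the total quota works out to about 10 TB if every one of the 5,000 members fills the full 2 GB. The sketch below just does that arithmetic. The function names and the `fill_ratio` parameter are hypothetical, and the 50% fill value is only an illustrative assumption, since the thread gives no real usage data.

```python
# Back-of-the-envelope storage sizing from the numbers in the question.
# fill_ratio is an assumption for illustration; no usage data is given.

def raw_quota_tb(members: int, quota_gb: float) -> float:
    """Total space promised to members, in TB (using 1 TB = 1000 GB)."""
    return members * quota_gb / 1000


def expected_usage_tb(members: int, quota_gb: float, fill_ratio: float) -> float:
    """Space actually consumed if members fill only part of their quota."""
    return raw_quota_tb(members, quota_gb) * fill_ratio


if __name__ == "__main__":
    for members in (1_000, 5_000):
        full = raw_quota_tb(members, 2)
        half = expected_usage_tb(members, 2, 0.5)  # assumed 50% fill
        print(f"{members} members: {full:.1f} TB promised, {half:.1f} TB at 50% fill")
```

These figures are before any RAID or replication overhead, which increases the raw disk space needed.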

- What would you suggest for the server setup (processor, RAM, drives)? For now we plan to go with RAID 1.
- How is it technically done, I mean storing the files across 3 or 4 servers? Or how should the storage (disk space) be set up?
- We will need something that supports fast uploading and downloading of files. What connection should we go with, 100 Mbps or 1 Gbps? What bandwidth should we expect? What data center should we use (mainly for North American users)?
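
To put the 100 Mbps versus 1 Gbps question in concrete terms, here is a minimal sketch of the ideal transfer time for a single 50 MB file (the maximum file size mentioned above). It assumes the server port is the only bottleneck and ignores protocol overhead, so real transfers will be slower. The function name is hypothetical.

```python
def transfer_seconds(file_mb: float, link_mbit_per_s: float) -> float:
    """Ideal time to move one file over a link, ignoring protocol overhead.

    file_mb is in megabytes; the link speed is in megabits per second,
    hence the factor of 8 bits per byte.
    """
    return file_mb * 8 / link_mbit_per_s


if __name__ == "__main__":
    for link in (100, 1000):
        secs = transfer_seconds(50, link)
        print(f"50 MB over {link} Mbit/s: about {secs:.1f} s for a single transfer")
```

The port is shared by all simultaneous transfers, so the per-user time grows roughly in proportion to the number of concurrent uploads and downloads.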

Please approach this from a growth perspective: we are starting out with one server and trying to reach our first 1,000 members.

Thank you.
J.
Question by:janime
LVL 10

Expert Comment

by:jmanishbabu
ID: 38727688
Plan to purchase good servers, such as HP ProLiant Gen8 servers, which support all RAID levels.

http://h17007.www1.hp.com/us/en/whatsnew/proliantgen8/index.aspx

Configure your network for maximum bandwidth at both the user and server ends; 1 Gbps would be ideal.

RAID 1 is mirroring, which needs disks of the same capacity. All data is mirrored, which gives redundancy for user data.

RAID 5 would be ideal in your situation: data is striped across the disks, and parity information is stored alongside it for redundancy. RAID 5 requires a minimum of 3 hard disks.
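
The capacity trade-off between the two RAID levels can be sketched as follows. This assumes all disks in the array are the same size; the function names are hypothetical, and the 2 TB disk size in the example is only illustrative.

```python
def raid1_usable_tb(disk_tb: float, disks: int = 2) -> float:
    """Usable capacity of a RAID 1 mirror: every disk holds the same data."""
    if disks < 2:
        raise ValueError("RAID 1 needs at least 2 disks")
    return disk_tb


def raid5_usable_tb(disk_tb: float, disks: int) -> float:
    """Usable capacity of RAID 5: one disk's worth of space goes to parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return disk_tb * (disks - 1)


if __name__ == "__main__":
    print("RAID 1, 2 x 2 TB:", raid1_usable_tb(2.0), "TB usable")
    print("RAID 5, 3 x 2 TB:", raid5_usable_tb(2.0, 3), "TB usable")
    print("RAID 5, 4 x 2 TB:", raid5_usable_tb(2.0, 4), "TB usable")
```

A two-disk mirror gives up half the raw space, while RAID 5 gives up one disk's worth regardless of array size and tolerates a single disk failure.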
 
LVL 10

Expert Comment

by:jmanishbabu
ID: 38727715
 

Author Comment

by:janime
ID: 38732144
Thank you guys, both comments are useful.

Still waiting for other input/suggestions.

What I'd like to hear is exactly what the best setup to start with would be, so that we won't have any problems ADDING SPACE (drives) as we go.
I just don't feel it's a good idea to spend lots of money on a new top-notch server while we are still developing and gaining new members (or maybe it is, if that's a necessity).
So again, I am looking for a good STARTING point.

Based on your posts, for now we have agreed on using a 100 Mbit/1 Gbit port and working with a provider that can gradually increase the bandwidth according to our needs (we are starting off with 10 TB).

Thanks.
 
LVL 26

Accepted Solution

by:dpearson (earned 500 total points)
ID: 39166403
You can buy servers (quite cheaply) that have a lot of drive bays.

For example, Pogo (http://www.pogolinux.com/products/storage_servers) has a server (http://www.pogolinux.com/quotes/editsys?sys_id=251594) that can store up to 72 TB itself for only about $5K.

So you could buy 2 of these (for redundancy) and only populate the first couple of drive bays. As your needs grow, you just add drives and rebuild the RAID on the fly (you can do all of this without taking the server offline; performance will just drop a bit during the rebuild).

As long as you plan to stay smallish this should work fine.

What we found was that scaling a solution like this to support millions of monthly users was very, very hard. The issue is that all users (potentially) require access to all files, so you need some sort of network storage solution. We tried NFS and a few other approaches. All struggle badly to provide good performance (through the network) as you reach high loads. Obviously, for redundancy you need to write each file to both servers (or to all n of them), so each write generates a lot of load. The network traffic tends to rise faster than linearly with the number of users, so all is great until suddenly you're over the limit and you can't solve it without a full redesign, all while traffic is rising daily.

This is why there are companies like NetApp that sell NAS or SAN solutions that start at $100K and go up from there. It's a hard problem.

So before you get too deep into this, also do the math to use a third party for this storage.  We eventually went with Amazon's S3 which substantially reduced our storage costs and gave us effectively infinite scalability for free.  There are lots of other cloud storage providers to consider, depending on your needs, but S3 is a solid place to start.  Our users are unaware of this choice - to them they upload a file to our servers (which we then forward to Amazon) and then we use a CDN provider for downloading the files, with Amazon as the origin server.  The end user never sees "Amazon" in the equation.
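
The flow described above (the user uploads to our server, we forward the file to the storage provider, and downloads go through a CDN that uses the provider as its origin) could be sketched roughly as below. This is not the actual implementation: `ObjectStore`, `InMemoryStore`, `handle_upload`, the key layout, and the `cdn.example.com` hostname are all hypothetical names chosen for illustration, and a real backend would call the storage provider's SDK instead of keeping bytes in a dictionary.

```python
from typing import Dict, Protocol
from urllib.parse import quote


class ObjectStore(Protocol):
    """Hypothetical minimal interface for a storage backend such as S3."""

    def put(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in backend for local testing; not a real cloud client."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


# Hypothetical CDN hostname; the CDN would be configured with the
# storage provider as its origin server.
CDN_BASE_URL = "https://cdn.example.com"


def handle_upload(store: ObjectStore, user_id: int, filename: str, data: bytes) -> str:
    """Forward an uploaded file to the object store and return its download URL."""
    key = f"users/{user_id}/{filename}"
    store.put(key, data)
    # The user only ever sees the CDN hostname, never the storage provider.
    return f"{CDN_BASE_URL}/{quote(key)}"


if __name__ == "__main__":
    store = InMemoryStore()
    print(handle_upload(store, 42, "holiday photo.jpg", b"fake image bytes"))
```

Keeping the storage backend behind a small interface like this means the web application does not care whether files live on local RAID, on another server, or with a cloud provider, which makes it easier to start with one server and switch later.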

It all scales well and was much cheaper than the build-your-own setup that we started with. It also means I can sleep at night, because scaling this file I/O layer was proving to be our biggest headache as we went from a few thousand users to millions.

Hope that helps,

Doug
 

Author Comment

by:janime
ID: 39172403
Thank you, Doug! Finally, some very valuable input!
This is something that I was expecting to hear.

I'll wait for one or two more replies, but Doug, you have definitely earned some of the points.

Thanks.
J.