Can this project be done on Apache?

Hi all,
I need your help as professionals with this.
I got these requirements from the system analyst:
- estimated number of concurrent users per second on the web application: 5 million
- estimated number of records per database: 40 million
- record size: 20 kilobytes

I'm responsible for the technical specification, but I'm a newbie.
- Can Apache handle that huge amount of traffic and that number of users per second?
  If so, how would stability, security, performance, and speed hold up in that case? Would they be good enough, or poor?
Best regards,
HG
Asked by: shang3000
 
rixlabs Commented:
With this amount of data and this many users, it is better to create a load-balanced cluster of web servers...

The size of your database on disk is something like 800 TB, so you need to plan for a storage server.
 
shang3000 (Author) Commented:
Hi rixlabs,
Thanks for the reply.
Can you give me more details:
sites, fact sheets, tutorials, articles, links, or books that can help me do that?
Thanks in advance.
Best regards,
HG
 
giltjr Commented:
Wait, you plan to have 5 million (that is, 5,000,000) unique users PER SECOND hitting an 800TB database?
 
What database software are you going to use?  There are only a few that I am aware of that could handle an 800TB database.

Can Apache act as an HTTP server for this? Sure, but not on a single computer (unless you are planning to get something like a 54-way z9).

If you are new and these are really the specs, you are in way over your head.
 
shang3000 (Author) Commented:
Hi giltjr,
Please help me answer these questions:
- What do you suggest as a practical, low-cost solution for doing this well?

- How many concurrent sessions per second can a single computer running Apache take?
 
shang3000 (Author) Commented:
- Which database would you prefer for this job, out of the ones you are aware of?
 
giltjr Commented:
Define low cost.  An environment that can handle 5,000,000 concurrent users a second and an 800TB database will cost millions of US dollars.

--> - how many concurrent sessions per second can a single computer running Apache take

It depends on what they are doing.  For a 4-way Intel box running Apache that is just serving up a small (50K) static web page, I would guess 50,000 to 100,000 hits per second.  However, if you are doing some type of scripting/application programming that queries a database and then generates a dynamic HTML page that is 500K, it might be able to do 5,000 to 10,000 hits per second.
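
As a rough check on those guesses, here is a minimal sketch that divides the 5,000,000 hits per second from the question by the per-box rates above. The per-box rates are the guesses from this post, not benchmarks, the helper name `boxes_needed` is made up for illustration, and the sketch assumes load spreads perfectly evenly with no headroom.

```python
import math

TARGET_HITS_PER_SEC = 5_000_000  # requirement quoted in the question


def boxes_needed(target_per_sec: int, per_box_per_sec: int) -> int:
    """Servers needed if load spreads evenly and each box sustains its rate."""
    return math.ceil(target_per_sec / per_box_per_sec)


# Per-box guesses from the post above (hits per second), not measurements.
scenarios = {
    "static 50K page": (50_000, 100_000),
    "dynamic 500K page with DB query": (5_000, 10_000),
}

for name, (low_rate, high_rate) in scenarios.items():
    fewest = boxes_needed(TARGET_HITS_PER_SEC, high_rate)
    most = boxes_needed(TARGET_HITS_PER_SEC, low_rate)
    print(f"{name}: roughly {fewest} to {most} boxes")
```

Under these assumptions the static case needs 50 to 100 boxes and the dynamic case 500 to 1,000, before adding any spare capacity for failures or peaks.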

Offhand, I would say DB2, Redbrick, Teradata, and Oracle are the database systems that could actually support an 800TB database without too many problems and with decent performance.

I would double-check your requirements.  Based on what you have said so far, this is a large-scale, enterprise-type environment.  I am talking about either MANY, MANY (hundreds of) small servers or a few LARGE servers for redundancy, plus load balancers, multiple SAN boxes, and high-speed network connections.  If you are doing this over the Internet, you need at least 4 DS3 connections.

Based on the original numbers, you would need a 3-tier environment:  web servers, application servers, database servers.

 
shang3000 (Author) Commented:
Hi giltjr,
>>>Based on the original number you would need a 3 tier environment:  Web
>>>servers,application servers, database servers
How can I separate the application server from the web server if I'm using a scripting language (Perl, Python, PHP, Ruby)?
Each one has a module embedded inside Apache, so I can't separate it.
(Please correct me if I'm wrong.)
 
shang3000 (Author) Commented:
Also, another question: some guys told me that MySQL can handle that 800TB database. Is that correct? Please give me your opinion.

Best regards,
HG
 
giltjr Commented:
You can separate it.  You have the web server with light code that sends requests back to the application server tier.  Three-tier is the norm.
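
To make the split concrete, here is a minimal sketch in plain Python rather than Apache: a web tier whose only job is to pass each request to a separate application tier, where the scripting and database work would live. Both tiers run on loopback ports in one process purely for demonstration, and every name here (`AppHandler`, `make_web_handler`, `start`, `fetch`) is made up for this example. With Apache itself, the forwarding role is commonly filled by a reverse-proxy module such as mod_proxy instead of hand-written code.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener that ignores any proxy settings, so loopback requests stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url: str) -> str:
    """GET a URL and return the body as text."""
    with _opener.open(url, timeout=5) as resp:
        return resp.read().decode()


class QuietHandler(BaseHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging in this demo

    def reply(self, text: str) -> None:
        body = text.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


class AppHandler(QuietHandler):
    """Application tier: scripts and database queries would run here."""

    def do_GET(self):
        self.reply(f"app tier built the page for {self.path}")


def make_web_handler(app_base_url: str):
    """Web tier: light code that only passes requests to the app tier."""

    class WebHandler(QuietHandler):
        def do_GET(self):
            self.reply(fetch(app_base_url + self.path))

    return WebHandler


def start(handler) -> HTTPServer:
    """Serve on a free loopback port in a background (daemon) thread."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


app = start(AppHandler)
web = start(make_web_handler(f"http://127.0.0.1:{app.server_port}"))
print(fetch(f"http://127.0.0.1:{web.server_port}/report"))
```

The client only ever talks to the web tier, so the application tier can be moved to other machines or multiplied behind a load balancer without the client noticing.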

As for MySQL, did they say 800TB or 800GB?

What is the purpose of the database?  Decision support (that is, a data warehouse), or online transactions?

You may want to look at:

http://www.wintercorp.com/VLDB/2005_TopTen_Survey/TopTenWinners_2005.asp

and query for the "top" databases in size.   At 800TB you are larger than the BIGGEST DSS/DW or OLTP database, by far.  The majority of the companies in the world don't even have 800TB of data on DASD, let alone a database that is 800TB.

Again, if this is truly 5,000,000 users per second and an 800TB database, this is not something a newbie should be doing.  In fact, this would be something that is handled by a GROUP of highly skilled and experienced people.


I am not sure where you are from, but 5,000,000 users hitting you every second means that about 1/2 of the population of New York City is hitting your server EVERY SECOND.  I would venture to say that there is NO system in the world that is currently doing this.
 
shang3000 (Author) Commented:
Hi giltjr,
>>>As for MySQL did they say 800TB or 800GB?
Sorry for the mistake, it's 820 GB precisely. Does MySQL support that?

 
giltjr Commented:
MySQL can handle a DB of 800GB.  That is still a HUGE database.  Does this mean that you have 4 million rows at 20KB per row or 40 million rows at 2 KB?

If it is still 5 million users per second, you are talking about a minimum of 5 million hits/transactions/queries per second.  Just to give you an idea of how big that is, in April of this year Doubleclick announced they were handling a peak of 160,000 hits per second.  They are one of the largest advertising servers in the world.

     ref: http://www.democraticmedia.org/jcblog/?p=253

I can't find any real hard facts about Google. If it is not the largest search engine, it has to be one of the top 2, but I have seen numbers anywhere from 25 to 100 million hits a DAY.

At 5 million a second you would hit that in 5 to 20 seconds.  IIRC Google has 30 different server farms with hundreds of servers in each farm.

Most sites would be lucky to get 5 million hits a day; heck, they might be lucky to get 5 million a month.
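
The comparisons above can be reproduced with a few lines of arithmetic. This is only a sketch: the Doubleclick and Google figures are the ones quoted in this post (the Google range is unverified), and the helper name `seconds_to_reach` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

CLAIMED_PER_SEC = 5_000_000           # requirement from the question
DOUBLECLICK_PEAK_PER_SEC = 160_000    # peak quoted in the post above
GOOGLE_PER_DAY_RANGE = (25_000_000, 100_000_000)  # quoted range, unverified


def seconds_to_reach(hits_per_day: int, rate_per_sec: int) -> float:
    """Seconds needed at rate_per_sec to match a given daily hit count."""
    return hits_per_day / rate_per_sec


ratio = CLAIMED_PER_SEC / DOUBLECLICK_PEAK_PER_SEC
print(f"claimed rate vs Doubleclick peak: {ratio:.2f}x")

for per_day in GOOGLE_PER_DAY_RANGE:
    secs = seconds_to_reach(per_day, CLAIMED_PER_SEC)
    print(f"{per_day:,} hits/day is reached in {secs:.0f} seconds")

per_day_at_claim = CLAIMED_PER_SEC * SECONDS_PER_DAY
print(f"hits per day at the claimed rate: {per_day_at_claim:,}")
```

So the claimed rate is a little over 31 times the Doubleclick peak, and sustained for a full day it would add up to 432 billion hits.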

You need to verify all the numbers.  You need to find out things like:

1) Total number of users for the system.
2) Total number of concurrent users.
3) Avg. number of queries/transactions per second.
4) Avg. number of rows fetched per query.
5) Avg. page size in bytes returned to the user per query.
6) How much static content per page.
7) How much dynamic content per page.
8) How much dynamic content is formatted on the server side.
9) How much dynamic content is formatted on the client side.
10) Type of security required, if any.

These are things just to start.  This sounds like a HUGE project.
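
One way to keep the answers to the first few questions together is a small record that also derives the load they imply. This is a hedged sketch: the class name `WorkloadSpec`, its fields, and every number in the example are placeholders invented for illustration, not values from this project.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSpec:
    total_users: int         # question 1
    concurrent_users: int    # question 2
    queries_per_sec: float   # question 3
    rows_per_query: float    # question 4
    page_bytes: int          # question 5

    def rows_per_sec(self) -> float:
        """Rows the database must deliver each second."""
        return self.queries_per_sec * self.rows_per_query

    def outbound_mbit_per_sec(self) -> float:
        """Network bandwidth for responses, in megabits per second."""
        return self.queries_per_sec * self.page_bytes * 8 / 1_000_000


# Placeholder numbers only; replace them with the verified requirements.
spec = WorkloadSpec(
    total_users=1_000_000,
    concurrent_users=20_000,
    queries_per_sec=1_000,
    rows_per_query=10,
    page_bytes=50_000,
)
print(f"rows/sec from the database: {spec.rows_per_sec():,.0f}")
print(f"outbound bandwidth: {spec.outbound_mbit_per_sec():,.0f} Mbit/s")
```

With these placeholder values the database has to deliver 10,000 rows per second and the network 400 Mbit/s, which shows how quickly the answers to questions 3 to 5 turn into hardware requirements.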
 
shang3000 (Author) Commented:
Hi giltjr,
>>>MySQL can handle a DB of 800GB.  That is still a HUGE database.  Does this
>>>mean that you have 4 million rows at 20KB per row or 40 million rows at 2 KB?
Here is the equation, and I want the answer from you:
40,000,000 records * 20 kilobytes per record
Please give me your answer.
 
giltjr Commented:
Oops, I never did the math; I assumed that rixlabs had done it.  The DB is about 800GB, which, although much, much smaller than 800TB, is still a HUGE database.
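
For reference, the math works out as in the sketch below. It covers raw record data only; indexes and other storage overhead are not included, and the helper name `raw_size_kb` is made up for illustration.

```python
def raw_size_kb(records: int, kb_per_record: int) -> int:
    """Raw data size in kilobytes, ignoring indexes and storage overhead."""
    return records * kb_per_record


total_kb = raw_size_kb(40_000_000, 20)   # figures from the question

size_gb = total_kb / 1000**2             # decimal units: 1 GB = 1,000,000 KB
size_tb = total_kb / 1000**3             # decimal units: 1 TB = 1,000,000,000 KB
size_gib = total_kb / 1024**2            # binary units: 1 GiB = 1,048,576 KiB

print(f"{total_kb:,} KB = {size_gb:,.0f} GB = {size_tb} TB")
print(f"in binary units: about {size_gib:,.0f} GiB")
```

Either way of counting gives a figure in the hundreds of gigabytes, three orders of magnitude below 800 TB.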

Again, if you look at the Winter Corp site, they are talking about some of the top databases being in the 100GB range.

You still have to look at the issues I put in my last post.  This still appears to be a HUGE system with lots of data, trying to execute tons of transactions.

Nothing that somebody new should be designing.
 
giltjr Commented:
To the others that are helping with this, please also see the following.  They are basically duplicates of this question.


http://www.experts-exchange.com/OS/Linux/Q_22716697.html
http://www.experts-exchange.com/Database/MySQL/Q_22716690.html
 
shang3000 (Author) Commented:
Hi giltjr,
Thanks a lot for the reply and assistance.
If you have links, articles, or tutorials about this issue, can you please send them to me?
 
giltjr Commented:
It is really tough to provide links.  This is really something that you learn by working with smaller sites and moving up to larger sites, lots of training, and working with others who have experience with large sites.  You also must know what the application is doing.

Basically you are talking about capacity planning.  In some instances it is not as much the software as it is the hardware and how much hardware.  

If it were me working on this, some of the first steps would be to get the answers to the 10 questions I listed before and build a small environment to run some tests:  a single DB server, a single AP server, and a web server.  Then start running tests to see how many resources are used where, and crank up as many simulated users as I could to see what breaks when:  the DB server, the AP server, or the web server.  Then build another small environment with 2 DB servers, 4-6 AP servers, and 2-3 web servers.  Figure out where that breaks, then see what the requirements are.

If 2 DB, 4-6 AP, and 2-3 web servers can handle 100,000 hits per second, then you need something that is 50 times larger.
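
A minimal sketch of that extrapolation follows. It assumes capacity scales linearly with server count, which is a strong simplification (shared resources such as the database usually become the bottleneck first), and the 100,000 hits per second is the hypothetical test result from the sentence above, not a measurement. The names `pilot` and `scaled` are made up for illustration.

```python
import math

TARGET_HITS_PER_SEC = 5_000_000   # requirement from the question
PILOT_HITS_PER_SEC = 100_000      # hypothetical result of the small test setup

# Pilot environment as (low, high) server counts per tier, from the post above.
pilot = {"DB": (2, 2), "AP": (4, 6), "Web": (2, 3)}

factor = TARGET_HITS_PER_SEC / PILOT_HITS_PER_SEC

# Linear scaling assumption: multiply every tier by the same factor.
scaled = {
    tier: (math.ceil(low * factor), math.ceil(high * factor))
    for tier, (low, high) in pilot.items()
}

print(f"scale factor: {factor:.0f}x")
for tier, (low, high) in scaled.items():
    label = f"{low}" if low == high else f"{low} to {high}"
    print(f"{tier} servers: {label}")
```

Under that assumption the full system would need on the order of 100 DB servers, 200 to 300 AP servers, and 100 to 150 web servers, which is why the test numbers matter so much.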
 
shang3000 (Author) Commented:
Hi giltjr,
Thanks a lot for your help, man.