shang3000

Can this project be done on Apache?

Hi all,
I need your help as professionals with the following.
I got these requirements from the system analyst:
- estimated number of concurrent users per second on the web application is 5 million users
- estimated number of records per database is 40 million records
- record size is 20 kilobytes

I'm responsible for the technical specification, but I'm a newbie.
- Can Apache handle that huge amount of traffic and that number of users per second?
 If so, what about stability, security, performance and speed in that case? Will they be good enough, or poor?
Best regards
HG
rixlabs

With this amount of data and users it is better to create a load-balanced cluster of web servers...

The size of your database on disk is something like 800 TB, so you need to plan a storage server.
shang3000

ASKER

hi rixlabs,
thanks for the reply
can you give me more details:
sites, factsheets, tutorials, articles, links or books that can help me do that
thanks in advance
Best regards
HG
giltjr
Wait, you plan to have 5 million (that is 5,000,000) unique users PER SECOND hitting an 800TB database?

What database software are you going to use?  There are only a few that I am aware of that could handle an 800TB database.

Can Apache act as an HTTP server for this? Sure, but not on a single computer (unless you are planning to get something like a 54-way z9).

If you are new and these are really the specs, you are in way over your head.
hi giltjr,
please help me answer these questions:
- what do you suggest as a practical, low-cost solution for doing this properly?

- how many concurrent sessions can a single computer running Apache handle per second?
- which database do you prefer for this job, from those you are aware of?
Define low cost?  An environment that can handle 5,000,000 concurrent users a second and an 800TB database will be in the millions of US dollars.

--> - how many concurrent sessions can a single computer running Apache handle per second?

It depends on what they are doing.  For a 4-way Intel box running Apache that is just serving up a small (50K) static web page, I would guess 50,000 to 100,000 hits per second.  However, if you are doing some type of scripting/application programming that queries a database and then generates a dynamic HTML page that is 500K, it might be able to do 5,000 to 10,000 hits per second.

I would say offhand that DB2, Redbrick, Teradata, and Oracle are the database systems that could actually support an 800TB database without too many problems and with decent performance.

I would double-check your requirements.  Based on what you have said so far, this is a large-scale enterprise-type environment.  I am talking either MANY, MANY (hundreds) of small servers, or a few LARGE servers for redundancy, plus load balancers, multiple SAN boxes, and high-speed network connections.  If you are doing this over the Internet you need at least 4 DS3 connections.
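
To make the "hundreds of servers" figure concrete, here is a rough sketch in Python that divides the stated 5,000,000 hits per second by the per-box guesses above. The function name `servers_needed` is made up for illustration, and the per-server rates are the guesses from this post, not benchmarks. It also ignores redundancy, load balancers and the database tier.

```python
import math

def servers_needed(target_hits_per_sec: int, hits_per_server: int) -> int:
    """Smallest whole number of servers that covers the target load."""
    return math.ceil(target_hits_per_sec / hits_per_server)

TARGET = 5_000_000  # hits per second, from the stated requirement

# Per-server guesses for one 4-way box (rough figures, not measurements)
static_range = (50_000, 100_000)   # small static page
dynamic_range = (5_000, 10_000)    # scripted page backed by a database

static_servers = [servers_needed(TARGET, rate) for rate in static_range]
dynamic_servers = [servers_needed(TARGET, rate) for rate in dynamic_range]

print("static pages: ", static_servers)   # [100, 50]
print("dynamic pages:", dynamic_servers)  # [1000, 500]
```

So even on the optimistic guesses, the dynamic case lands at 500 to 1,000 web/application boxes before any spare capacity is added.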

Based on the original number you would need a 3 tier environment:  Web servers, application servers, database servers.

hi giltjr,
>>>Based on the original number you would need a 3 tier environment:  Web
>>>servers, application servers, database servers
How can I separate the application server from the web server if I'm using scripting languages (Perl, Python, PHP, Ruby)?
They run as modules embedded inside Apache, so I can't separate them
(please correct me if I'm wrong).
There is also another question: some guys told me that MySQL can handle that 800TB database. Is that correct? Please give me your opinion.

Best Regards
HG
You can separate it.  You have the web server with light code that sends requests back to the application server level.  Three tier is the norm.
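
As a rough illustration of that split, here is a toy Python sketch. All three class names are hypothetical, and the tiers are plain in-process objects standing in for what would really be separate machines talking over the network. The point is only where the code lives: the web tier parses the request and delegates, the application tier holds the logic, and the database tier holds the data.

```python
class DatabaseTier:
    """Stand-in for the database servers: a dict instead of a real DBMS."""
    def __init__(self):
        self._rows = {1: "first record", 2: "second record"}

    def fetch(self, record_id: int):
        # Returns None when the id is unknown
        return self._rows.get(record_id)


class AppTier:
    """Stand-in for the application servers: the business logic lives here."""
    def __init__(self, db: DatabaseTier):
        self._db = db

    def handle(self, record_id: int) -> dict:
        row = self._db.fetch(record_id)
        if row is None:
            return {"status": 404, "body": "not found"}
        return {"status": 200, "body": row.upper()}


class WebTier:
    """Stand-in for the web servers: light code that parses and delegates."""
    def __init__(self, app: AppTier):
        self._app = app

    def get(self, path: str) -> dict:
        # Extract the id from a path such as "/record/1"
        try:
            record_id = int(path.rstrip("/").split("/")[-1])
        except ValueError:
            return {"status": 400, "body": "bad request"}
        return self._app.handle(record_id)


web = WebTier(AppTier(DatabaseTier()))
print(web.get("/record/1"))  # {'status': 200, 'body': 'FIRST RECORD'}
print(web.get("/record/9"))  # {'status': 404, 'body': 'not found'}
```

In a real deployment the call from `WebTier` to `AppTier` would be a network request to another box, which is what lets each tier be scaled on its own.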

As for MySQL did they say 800TB or 800GB?

What is the purpose of the database?  Decisions support, that is a data warehouse, or is it for online transactions?

You may want to look at:

http://www.wintercorp.com/VLDB/2005_TopTen_Survey/TopTenWinners_2005.asp

and query for the "top" databases in size.   At 800TB you are larger than the BIGGEST DSS/DW or OLTP database, by far.  The majority of the companies in the world don't even have 800TB of data on DASD, let alone a database that is 800TB.

Again, if this truly is 5,000,000 users per second and an 800TB database, this is not something a newbie should be doing.  In fact this would be something that is handled by a GROUP of highly skilled and experienced people.


I am not sure where you are from, but 5,000,000 users hitting you every second means that about 1/2 of the population of New York City is hitting your server EVERY SECOND.  I would venture to say that there is NO system in the world that is currently doing this.
hi giltjr,
>>>As for MySQL did they say 800TB or 800GB?
Sorry for the mistake, it's 820 GB precisely. Does MySQL support that?

MySQL can handle a DB of 800GB.  That is still a HUGE database.  Does this mean that you have 4 million rows at 20KB per row or 40 million rows at 2 KB?

If it is still 5 million users per second, you are talking about a minimum of 5 million hits/transactions/queries per second.  Just to give you an idea of how big that is, in April of this year Doubleclick announced they were handling a peak of 160,000 hits per second.  They are one of the largest advertising servers in the world.

     ref: http://www.democraticmedia.org/jcblog/?p=253

I can't find any real hard facts about Google. If it is not the largest search engine, it has to be one of the top 2, but I have seen numbers anywhere from 25 to 100 million hits a DAY.

At 5 million a second you would hit that in 5 to 20 seconds.  IIRC Google has 30 different server farms with hundreds of servers in each farm.
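
For anyone who wants to check those comparisons, here is the arithmetic as a short Python sketch. It only uses the figures quoted in this post (the Doubleclick peak and the daily range I have seen for Google), so it is exactly as reliable as those numbers.

```python
TARGET_PER_SEC = 5_000_000        # stated requirement
DOUBLECLICK_PEAK = 160_000        # peak hits per second quoted above
DAILY_LOW, DAILY_HIGH = 25_000_000, 100_000_000  # daily range quoted for Google
SECONDS_PER_DAY = 86_400

# How many times larger than the Doubleclick peak the requirement is
ratio = TARGET_PER_SEC / DOUBLECLICK_PEAK

# Seconds needed at the target rate to match the quoted daily totals
secs_low = DAILY_LOW / TARGET_PER_SEC
secs_high = DAILY_HIGH / TARGET_PER_SEC

# Total hits in one day if the target rate were sustained
hits_per_day = TARGET_PER_SEC * SECONDS_PER_DAY

print(ratio)               # 31.25
print(secs_low, secs_high) # 5.0 20.0
print(hits_per_day)        # 432000000000
```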

Most sites would be lucky to get 5 million hits a day, heck they might be lucky to get 5 million a month.

You need to verify all the numbers.  You need to find out things like:

1) Total number of users for the system.
2) Total number of concurrent users.
3) Avg. number of queries/transactions per second.
4) Avg. number of rows fetched per query.
5) Avg. page size in bytes returned to the user per query.
6) How much static content per page.
7) How much dynamic content per page.
8) How much dynamic content is formatted on the server side.
9) How much dynamic content is formatted on the client side.
10) Type of security, if any required.

These are things just to start.  This sounds like a HUGE project.
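
Once you have answers to items 3 to 5, a first-pass estimate is simple multiplication. Here is a rough Python sketch of that. The class name `WorkloadEstimate` and the sample values are made-up placeholders, not numbers from this project, and the result ignores protocol overhead, caching and peaks above the average.

```python
from dataclasses import dataclass

@dataclass
class WorkloadEstimate:
    queries_per_sec: float   # item 3: avg. queries/transactions per second
    rows_per_query: float    # item 4: avg. rows fetched per query
    page_bytes: float        # item 5: avg. page size returned per query

    def rows_per_sec(self) -> float:
        """Rows the database tier must fetch each second."""
        return self.queries_per_sec * self.rows_per_query

    def outbound_mbit_per_sec(self) -> float:
        """Average outbound bandwidth in megabits per second."""
        return self.queries_per_sec * self.page_bytes * 8 / 1_000_000

# Hypothetical placeholder values, for illustration only
estimate = WorkloadEstimate(queries_per_sec=1_000, rows_per_query=10, page_bytes=50_000)

print(estimate.rows_per_sec())           # 10000
print(estimate.outbound_mbit_per_sec())  # 400.0
```

Plug in the verified numbers and you will see quickly whether you are looking at a handful of boxes or a data center.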
Hi giltjr,
>>>MySQL can handle a DB of 800GB.  That is still a HUGE database.  Does this
>>>mean that you have 4 million rows at 20KB per row or 40 million rows at 2 KB?
Here is the equation, and I would like the answer from you:
40,000,000 records * 20 kilobytes per record
Please give me your answer.
Oops, I never did the math, I assumed that rixlabs had done it.  The DB is about 800GB, which, although much, much smaller than 800TB, is still a HUGE database.
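
Here is that math written out as a quick Python sketch. It counts raw record data only; indexes and storage overhead (not covered in this thread) would add to it.

```python
RECORDS = 40_000_000   # records, from the requirements
RECORD_KB = 20         # kilobytes per record, from the requirements

total_kb = RECORDS * RECORD_KB           # 800,000,000 KB

total_gb_decimal = total_kb / 1000**2    # using 1 GB = 1,000,000 KB
total_gib_binary = total_kb / 1024**2    # using 1 GiB = 1,048,576 KiB
total_tb_decimal = total_kb / 1000**3    # using 1 TB = 1,000,000,000 KB

print(total_gb_decimal)            # 800.0
print(round(total_gib_binary, 1))  # 762.9
print(total_tb_decimal)            # 0.8
```

Either way you count it, the raw data is under 1 TB, nowhere near 800 TB.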

Again, if you look at the Winter Corp site they are talking about some of the top databases being in the 100GB range.

You still have to look at the issues I put in my last post.  This still appears to be a HUGE system with lots of data, trying to execute tons of transactions.

Not something that somebody new should be designing.
hi giltjr,
thanks a lot for the reply and assistance
if you have links, articles or tutorials about this issue, can you please send them to me
ASKER CERTIFIED SOLUTION
giltjr
hi giltjr,
thanks a lot for your help man