Solved

8-Way PIII Xeon 700MHz Processor vs 2-Way Intel Xeon 2.4GHz Processor?

Posted on 2006-04-04
Medium Priority
724 Views
Last Modified: 2008-08-16
I currently have a ProLiant DL760 with 8 700MHz PIII Xeon processors in it and need to know if a ProLiant ML530 G2 with dual 2.4GHz Intel Xeon processors would be better.  I have been unable to find any performance comparisons between these two processors.  Anyone have any idea which server would do a better job of handling W2K3 with SQL 2005?  The RAM on the DL760 is 16GB while the ML530 would have 8GB.
Question by:LandShark
16 Comments
 
LVL 9

Expert Comment

by:bigjimbo813
ID: 16373495
DL760 seems like a better choice if you are running those apps.

One thing you need to consider is the warranty. If it's going to expire soon, then I would say go with the newer one with a longer warranty.
 
LVL 32

Expert Comment

by:jhance
ID: 16373532
It's VERY difficult to make such comparisons.  So much of this depends on the characteristics of what you are running on your system.  An application that can effectively use each of the 8 CPUs in the 8-way server will run circles around the faster, but only 2-way system.  Alternatively, some applications which don't multiprocess well will grind to a halt on the 8-way server, run so-so on the 2-way but run best on a 1 CPU box.

Take a look at what your processing load is composed of.  Run some benchmarks of your own stuff and see what things make a difference.  There is no quick/easy answer here to which is "better".  It will only come with some research and experimentation.
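
As a rough sketch of what such a home-grown benchmark could look like (assuming Python with the pyodbc module; the DSN names and queries below are placeholders you would replace with your own workload):

# Crude timing harness: run the same query workload against each box and
# compare wall-clock times. The DSNs and queries below are placeholders.
import time
import pyodbc

QUERIES = [
    "SELECT COUNT(*) FROM Orders",                           # substitute your
    "SELECT TOP 100 * FROM Orders ORDER BY OrderDate DESC",  # real workload
]

def time_workload(dsn, runs=5):
    conn = pyodbc.connect(dsn)
    cursor = conn.cursor()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        for sql in QUERIES:
            cursor.execute(sql)
            cursor.fetchall()          # force the full result to come back
        timings.append(time.perf_counter() - start)
    conn.close()
    return min(timings)                # best-of-N damps warm-up noise

print("DL760:", time_workload("DSN=dl760test"), "s")
print("ML530:", time_workload("DSN=ml530test"), "s")

The key point is to run the same workload, several times, on both boxes, and compare like with like.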
 
LVL 5

Expert Comment

by:shankshank
ID: 16374468
You may think that more processors are better, but again it does depend on the setup. I think Windows 2003 Enterprise Edition may need to be installed to support that many processors, and SQL licensing changes as well and can be pricey, so I'd stick with the 2 processors.
 
LVL 12

Assisted Solution

by:GinEric
GinEric earned 600 total points
ID: 16374765
The biggest factor is the obsolescence date: "http://h10010.www1.hp.com/wwpc/pscmisc/vac/us/en/ss/proliant/dl760g2-qanda.html", which means the DL760 is obsolete as of last month.

HP is pitching the ProLiant DL580 G3 currently.

The ProLiant ML530 G2 would seem to be the lesser of the two.

HP shares a common problem with Intel and other manufacturers: specifications are hard to find.

Specs should be right up front and easy to find, like those on a Chevrolet car when you walk into a showroom and look at the sticker price sheet, i.e., what's under the hood.

As far as performance of highly parallel systems goes, they're only as good as their specifications.  "Highly parallel, multiprocessor, multitasking systems" means parallel traces to and from all modules [CPM, IOM, MCM], with one processor as the Master, able to execute allowable tasks and processes with some independence regarding the chronology of calls to execute those tasks and processes.  That simply means that the first program called is not necessarily the first one to conclude its work.  Any dependency between programs, tasks, and processes [modules of executable machine code] may tie up conclusion until the chain of results is in [End Of Task or EOT].

Any benchmark must look at Start Of Task [SOT] and End Of Task [EOT] to be an effective evaluation of comparative performance.  Tasks that run without regard to synchronized times are called Task Independent Runners [TIR].  If your operating system doesn't know what a TIR is, it will most likely fudge the results anyway.
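
As a toy illustration of that SOT/EOT bookkeeping (Python, purely illustrative; no real OS task accounting is implied): each task stamps its own start and end times, and completion order need not match start order.

# Toy SOT/EOT bookkeeping: each task stamps its own Start Of Task and
# End Of Task, and completion order need not match submission order.
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def task(name):
    sot = time.perf_counter()                  # Start Of Task
    time.sleep(random.uniform(0.01, 0.10))     # stand-in for real work
    eot = time.perf_counter()                  # End Of Task
    return name, sot, eot

with ThreadPoolExecutor(max_workers=8) as pool:    # pretend 8 processors
    futures = [pool.submit(task, f"task{i}") for i in range(8)]
    for done in as_completed(futures):             # finish order, not start order
        name, sot, eot = done.result()
        print(f"{name}: ran {eot - sot:.3f}s")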

This is all generally considered to be Distributed Processing.  One or two processors should also be dedicated diagnostic slaves; that means should any error occur, it can be simulated step by step on a Maintenance Diagnostic Bus by the Master Processor, which can then single-step a slave processor and recreate the conditions existing at the time of the failure, such as a crash or other "timing problem."  With 8 parallel processors, if there are no associated actual physical Input/Output Modules [IOM] and the IOM is replaced by various controllers, such as PCI, APCI, DMA, and the like, then system performance may seem super fast but will, in actuality, probably be 20 times less effective than with real IOMs.

And the most critical path, the Interrupt Bus, not being a true 64-bit Interrupt Bus, means that it's going to be slower than the true design concept of "Highly parallel, multiprocessor, multitasking systems" anyway.  How do you control timely interrupts of 8 parallel processors with emulated software interrupt systems?  The answer is "a whole lot slower than with real, physical, and completely separate 64-bit Interrupts."

A design that complies with the above will, generally, be hundreds of times faster than those that implement these concepts with software emulations, such as plug and play, and other devices that conflict with the design specifications for true 64-bit systems.

Which is why the specification of the system is the most important factor and needs to be known from an engineering technical document.

You should also look at other microprocessor based systems specifications.

The specification for a true system, by the way, would read as an "8x8" or "8x8x8" system, that is, 8 processors, 8 IOMs, and 8 MCMs.  That implies 8 physical busses between all modules.  Which, if in the host controller world of PCI, DMA, APCI, and such, means 8 of each also.  In short, you need 8 of everything that is core to the basic definition of what a computer system is.  This doesn't include peripherals, such as drives and other devices, simply the old-time basics: a Central Processing Module, an Input/Output Module, a Memory Control Module, a Diagnostic Control Module, and Peripherals [aka 'devices'].

The cabling or printed circuit traces between all core computer system parts is a given and should be equivalent to the bus width of the system itself.  Which explains why the physical Interrupt Bus has to be the same width.

If you're going to go with HP, you should at least consider the ProLiant DL580 G3.
 
LVL 57

Assisted Solution

by:giltjr
giltjr earned 900 total points
ID: 16376074
Put the database on each and run benchmarks.  Pick the winner.

More RAM means more caching/buffering, which means less I/O, which means better performance (NORMALLY).  If your database is large enough to take advantage of 16 GB of RAM, this will be a benefit.

If you have a lot of queries that don't take much CPU, then more slower processors are better (if the hardware is designed correctly) because you can run them concurrently.  If you have a few queries that need a lot of CPU, then fewer faster processors would be better.
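
As a back-of-envelope illustration of that trade-off (a toy model only; it assumes perfect parallel scaling, which real SQL workloads rarely achieve):

# Toy model: aggregate throughput scales with cores x clock, while the
# latency of a single CPU-heavy query scales only with the clock.
def throughput(cores, ghz):            # relative rate for many small queries
    return cores * ghz

def single_query_time(ghz, work=1.0):  # relative runtime of one big query
    return work / ghz

dl760 = (8, 0.7)   # 8 x 700 MHz
ml530 = (2, 2.4)   # 2 x 2.4 GHz

print("throughput: DL760 =", throughput(*dl760), " ML530 =", throughput(*ml530))
print("big query : DL760 =", round(single_query_time(0.7), 2),
      " ML530 =", round(single_query_time(2.4), 2))
# DL760 wins on throughput (5.6 vs 4.8); ML530 wins on one big query (0.42 vs 1.43)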

Go to www.tpc.org and look at the TPC-C benchmark results by hardware vendor; they have results for a DL760 with 8x900MHz, an ML530 with 1x2.4GHz, and an ML530 with 1x1GHz.  Based on those results, a DL760 with 8x700MHz should perform better than an ML530 with 2x2.4GHz on the TPC-C benchmark.  However, how your application performs could be totally different.
 
LVL 8

Expert Comment

by:dlongan
ID: 16377551
One thing that gets overlooked is the speed and architecture of the memory and I/O buses (system bus, disk subsystem...).  The 700MHz CPU is quite old, so we can assume the memory and I/O buses are quite old also.  My best guess is the newer dual 2.4GHz system will outperform the older 8-way.

Most servers are i/o bound not cpu.
 
LVL 57

Accepted Solution

by:giltjr
giltjr earned 900 total points
ID: 16377704
Depends on the database design and size.  On the TPC-C benchmark (not that I really believe in these, but they are the only thing around) an 8x900MHz beat a 1x2.4GHz by almost a factor of 4 (69,169 vs. 17,659).  Now the 2.4GHz machine only had 1 or 2 GB of RAM compared to the 16 GB of the 8x900.  Now he only has a 700MHz, but if we assume 70% of a 900, that is still more than double the 1x2.4GHz system.
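
Worked out explicitly, that estimate looks like this (a back-of-envelope sketch using the linear-scaling assumptions stated above, not real benchmark results for these exact configurations):

# Working the estimate explicitly (naive linear-in-clock scaling;
# the 70% factor and perfect 2-way scaling are the stated assumptions).
tpmc_8x900 = 69169            # published DL760 8 x 900 MHz TPC-C result
tpmc_1x2400 = 17659           # published ML530 1 x 2.4 GHz TPC-C result

est_dl760_8x700 = tpmc_8x900 * 0.70      # treat 700 MHz as ~70% of 900 MHz
est_ml530_2x2400 = tpmc_1x2400 * 2       # optimistic: perfect 2-way scaling

print(f"estimated DL760 8x700MHz : {est_dl760_8x700:,.0f} tpmC")
print(f"estimated ML530 2x2.4GHz : {est_ml530_2x2400:,.0f} tpmC")
# ~48,400 vs ~35,300: the old 8-way still comes out ahead on this benchmark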

Both of these boxes have 100MHz PCI-X slots (the DL760 has 10 and the ML530 has 7); however, the ML530 has a 400MHz bus vs. 100MHz for the DL760.  This will make a big difference if you have to pump a lot of stuff through memory.

LandShark, you told us what boxes you are looking at for the DBMS.  What are you planning to use for the application server?
 
LVL 12

Expert Comment

by:GinEric
ID: 16378567
"Most servers are i/o bound not cpu." more like "Most servers are external host controller bound not cpu." as there appears to be no I/O in any PC.
 
LVL 8

Expert Comment

by:dlongan
ID: 16380775
giltjr,

You're absolutely correct, it does depend on the design and size of the database...

But as you pointed out, the 2-way box has a 400MHz bus versus 100MHz.  And what about the disk subsystem?  SCSI or SATA, and which version?  Memory configuration?  RAID and how it is set up?

My point in all of this is that processor speed and quantity are really only one factor to look at.  You can have the fastest CPU but only a pinhole to pump the data in and out of the system...
 
LVL 12

Expert Comment

by:GinEric
ID: 16383349
I still do not understand why people simply refuse to understand that serial is slower than parallel.  No, you do not want SATA!  You want PATA, IDE-64, PCI-64, or SCSI [which is now 160 bits wide and can handle fully multiplexed traffic from two complete 64-bit busses simultaneously].

SATA is the pin hole you were talking about.
 
LVL 8

Expert Comment

by:dlongan
ID: 16383615
GinEric,

Take a chill pill.  I don't think anybody has recommended SATA; it was only to show that there ARE different configurations...
 
LVL 12

Assisted Solution

by:GinEric
GinEric earned 600 total points
ID: 16384382
I don't take pills; those are for the habitually dependent.  As for the database, in a distributed system using Vector Indexed Array Relative records, there should be no limit to size nor any effect on its efficiency, since every transaction is reduced to two simultaneous address resolutions, even across networks, treating the entire network as a virtual memory.  This, of course, is limited by the address boundaries at various points, such as the 30 gig boundary, the 180 gig boundary, and the terabyte boundary.  It should be seamless, however, with 64-bit double-precision [really 96 bits of actual address couple] addressing as is found in modern systems.  I think dual Xeons can handle that, though.

The speed is dependent more on the architecture of the busses: multiplexed, simultaneous, multilayered [on the various levels of the laminated printed circuit], with independence of operation through PCI, APCI, and other host controllers, which allows up to 8 simultaneous operations between any given set of PCI or other busses [the I/O busses between the IOM and host control channels], synchronized with a DMA controller [the Memory Control Module], which bypasses entirely any required intervention of the processor except that of initializing the transaction and marking it complete when finished.

A 400MHz bus is already multiplexed at twice the maximum speed of a 200MHz rate; an 800MHz bus is simply four 200MHz busses operating in parallel.  See the appropriate Intel, AMD, or other specifications.  It's done by multiplexing and time-slicing the various clocks controlling data flow.

The rating of the clocks, such as 2.4GHz, is merely the resonant crystal frequency and not the true speed at all.  This is always divided down to accommodate the processor and other components.  It's a very deceptive system of rating CPU speed.

To get the three-year warranty and support, you may have to go with the latest, which is currently the ProLiant DL580 G3 or something similar, plus an upgrade to a current server operating system to manage the older systems you have.
 
LVL 57

Expert Comment

by:giltjr
ID: 16384581
What?  Could you point me to something I could read that would explain what your first paragraph is saying?  I understand the rest of the paragraphs, but that first one has me stumped.
 
LVL 12

Expert Comment

by:GinEric
ID: 16388222
The Intel and AMD design diagrams along with the programmer's reference manual should explain most of paragraph 1.  Vector Indexed Arrays are merely arrays that use Matrix vector indirect referencing and addressing; that is, a pointer or descriptor points to another pointer or descriptor in a table of n dimensions.  Much like Matrix algebra, the algebraic sum of the two vector products is the absolute address, be that onboard, offboard, on hard drive, or somewhere on the network [virtual space].  This has recently been implemented in hardware design.  So, your array pointers can be summed down to two vectors, much like double-sideband suppressed-carrier radio transmission and encoding.  Processing the two vectors simultaneously results in a one-clock conversion of a data request to an absolute address for fetches and stores.  This is then kept in an associative array until a burst is issued, out of RAM, to move results from RAM to permanent storage and back.

Array rows, data descriptors [the whole explanation], various pointers, but mostly hardware design concepts which affect data speed, throughput, and so on.  The first boundary at present is the backwards-compatible 64-bit to 32-bit boundary; if a 32-bit fetch on an address greater than two gig [as in 64 bits of address couple] is executed by a 32-bit driver, there is a seven-clock timing required for the fetch while the system aligns the address.  If, on the other hand, a 64-bit routine executes, it can fetch or store on a single clock, and if the hardware has no time to synchronize, then the fetch or store often goes awry because it's so fast that the data is thought to exist before the command is executed in a look-ahead preprocessing system.  Systems today claim to be of this type.  The result has been quite a few errors, as the programmers of drivers were not quite aware of just how fast these systems are.

Today's systems, processors mostly, effectively mask an operation to zero clock time unless specifically told otherwise.  This leads to race conditions and errors.

So, if your board has 32-bit busses and 64-bit busses and any host controller, PCI, DMA, IDE, is told to fetch a 64-bit address, it takes time to calculate what it thinks will be a double-precision word; but if that word is supplied from a 32-bit driver to a 64-bit bus, the result is there the instant it receives the address couple and before data can be initialized, thus the result is "unpredictable" and errors ensue.  Usually one-bit correctable errors, but enough of them will cause error processing and sometimes a crash.

Intel has special operators for this, causing re-entrant code restarts at the microprocessor level, built in hardware operators.  If the drivers do not match this instruction set, performance is mandatorily reverted to the backwards compatible 32-bit instruction set.  Which can result in timing errors.

Database Theory is about equivalent to Matrix Theory.  You only need two vectors to locate any item in an database, regardless of the number of dimensions.  So Database Theory might also be a good place to look for explanations.  Then there would be Matrix Theory itself.

2^64 is beyond any address you will find in this time and space on this planet.  Therefore, the operation of such addressing applies effectively to all known addresses.  Even at 48 bits in single precision the address space is beyond all current addressing: 2^48 - 1 = 281,474,976,710,655 decimal (roughly 2.8 x 10^14, about 281 trillion addresses), or 48 one-bits in binary as 48 bits of address couple.

If you look at it, all you really have to do is come up with a schema to encode this humongous address into two half-sized vectors to find anything in that array size, regardless, again, of number of dimensions.
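
The two-vector encoding itself is easy to demonstrate (a toy sketch in Python, not any vendor's actual hardware mechanism): split a 48-bit address into two 24-bit halves and recombine them.

# Toy two-vector encoding: a 48-bit address split into two 24-bit halves
# (think row and column of a 2^24 x 2^24 matrix) and recombined losslessly.
BITS = 48
HALF = BITS // 2
MASK = (1 << HALF) - 1

def encode(addr):
    return addr >> HALF, addr & MASK       # (row vector, column vector)

def decode(row, col):
    return (row << HALF) | col

addr = 0x123456789ABC                      # an arbitrary 48-bit address
row, col = encode(addr)
assert decode(row, col) == addr            # two vectors recover the address
print(f"{addr:#x} -> row {row:#x}, col {col:#x}")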

And if you can do this at the hardware design level, that is, let the hardware calculate the vector address, then the hardware can do it in either one clock or, with look-ahead logic design, in effectively zero clocks.

Basically instantaneous access to database records.  Not counting the time to send the data across networks and wires, strictly regarding the effective address request to data result time for a data storage scheme or device.

You could also look up Federal and Special Systems Design Architecture for computers since about 1975, when this schema began and was perfected in 1978 or thereabouts and implemented in 64-bit mainframe systems.

It's actually very simple and very elegant mathematically: any object in any n-dimensional frame of reference can be relationally located by only two variables if and only if those two variables are n-sum encoded.

Meaning, you only need to know two things about something to find it.  And finding it is as instantaneous as knowing the two things about it.

Of course, physical devices take real time to produce results, but the time is optimized when Matrix Theory is applied to them and the fastest possible rate of storage and retrieval is thereby effected.  You can also call it N-Dimensional Theory.

I know it may be hard to grasp, but it is real and it does exist and is used currently.  Microprocessors have just caught up to this design, so it's a little new to everyone.

Whatever Intel or AMD or Cyrix call it, vector indirect reference or vector indirect addressing, it still boils down to the same thing: manipulating arrays and databases using dual indexed pointers and/or descriptors.  You can also look up these terms and figure them out in depth.

But remember, it's a hardware concept and design which then begets software techniques, because the hardware is always faster than the software.

Similar to: if I know one side and an acute angle of a right triangle, I know the whole triangle.  Do you see this?  That is the basis of vectoring.

I hope so.
 
LVL 57

Expert Comment

by:giltjr
ID: 16388775
Ah, I do understand the memory addressing and matrix theory.  The part that got me in your original statement was "... even across networks, treating the entire network as a virtual memory."  I was trying to figure out how that would work.  Some of what you wrote is above my head, but I think I get the general idea.

But that would require a central box that has the complete database indexed in memory.  Would it not?

But I think we are getting a bit off topic, and I doubt his database is that large, but it could be applied to a single box with a small database.  I have worked with a couple of applications that sound like they did that: built the database in memory at boot time.  These were small databases, as they were on boxes with about 4GB of RAM, and they were 99% reads with only a few (3-4) updates a day.
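
That boot-time-load pattern is roughly this (a minimal sketch in Python; the pyodbc DSN, table, and column names are invented for illustration):

# Minimal read-mostly cache: load a small table into RAM once at startup,
# serve all reads from memory, write the rare update through to the DBMS.
import pyodbc

class BootTimeCache:
    def __init__(self, dsn, table):
        self.conn = pyodbc.connect(dsn)
        self.table = table
        cur = self.conn.cursor()
        cur.execute(f"SELECT id, payload FROM {table}")
        self.rows = dict(cur.fetchall())       # whole table in memory

    def get(self, rid):                        # 99% of traffic: RAM only
        return self.rows.get(rid)

    def update(self, rid, payload):            # the few updates per day
        cur = self.conn.cursor()
        cur.execute(f"UPDATE {self.table} SET payload = ? WHERE id = ?",
                    payload, rid)
        self.conn.commit()
        self.rows[rid] = payload               # keep the cache coherent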
 

Author Comment

by:LandShark
ID: 16390883
Thanks to all for your comments and suggestions.  I wanted to get some idea of benchmarks plus additional insight on what else to consider besides CPU and memory.  This will be our Disaster Recovery server and will be hosting a SQL DB of 450GB.  Thanks to GinEric and giltjr for your help and comments.
