8-Way PIII Xeon 700MHz Processor vs 2-Way Intel Xeon 2.4GHz Processor?

LandShark asked
Medium Priority
Last Modified: 2008-08-16
I currently have a ProLiant DL760 with 8 700MHz PIII Xeon processors in it and need to know whether a ProLiant ML530 G2 with dual 2.4GHz Intel Xeon processors would be better.  I have been unable to find any performance comparisons between these two types of processor.  Anyone have any idea which server would do a better job of handling W2K3 with SQL 2005?  The RAM on the DL760 is 16GB while the ML530 would have 8GB.

DL760 seems like a better choice if you are running those apps.

One thing you need to consider is the warranty. If it's going to expire soon, then I would say go with the newer one with a longer warranty.

It's VERY difficult to make such comparisons.  So much depends on the characteristics of what you are running on your system.  An application that can effectively use each of the 8 CPUs in the 8-way server will run circles around the faster, but only 2-way, system.  Alternatively, some applications which don't multiprocess well will grind to a halt on the 8-way server, run so-so on the 2-way, but run best on a 1-CPU box.

Take a look at what your processing load is composed of.  Run some benchmarks of your own stuff and see what things make a difference.  There is no quick/easy answer here to which is "better".  It will only come with some research and experimentation.
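[Editor's note] In the spirit of the advice above, here is a minimal benchmark-harness sketch in Python. The workload function is a hypothetical stand-in; in practice it would submit a representative batch of queries to each candidate server.

```python
import time
import statistics

def benchmark(workload, runs=5):
    """Time a workload several times and return the median wall-clock seconds.

    `workload` is any zero-argument callable -- here a stand-in for
    whatever query mix you actually run against each box.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

# Stand-in workload: a CPU-bound loop instead of a real SQL batch.
def sample_workload():
    sum(i * i for i in range(100_000))

median_seconds = benchmark(sample_workload)
print(f"median run time: {median_seconds:.4f}s")
```

Running the same harness against both servers, with the real workload substituted in, gives the apples-to-apples number no published benchmark can.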
You may think that more processors are better, but again it does depend on the setup. I think Windows 2003 Enterprise Edition may need to be installed to support that many processors, and SQL licensing changes also and can be pricey, so I'd stick with the 2 processors.
The biggest factor is the obsolescence date: http://h10010.www1.hp.com/wwpc/pscmisc/vac/us/en/ss/proliant/dl760g2-qanda.html, which means it's obsolete as of last month.

HP is pitching the ProLiant DL580 G3 currently.

The ProLiant ML530 G2 would seem to be the lesser of the two.

HP shares a common problem with Intel and other manufacturers: specifications are hard to find.

Specs should be right up front and easy to find, like those on a Chevrolet car when you walk into a showroom and look at the sticker price sheet, i.e., what's under the hood.

As far as performance based on highly parallel systems, they're only as good as the specifications.  "Highly parallel, multiprocessor, multitasking systems" means parallel traces to and from all modules [CPM, IOM, MCM], one processor as the Master, able to execute allowable tasks and processes with some independence regarding chronology of calls to execute such tasks and processes.  That simply means that the first program called is not necessarily the first one to conclude its work.  Any dependency between programs, tasks, and processes [modules of executable machine code] may tie up conclusion until the chain of results are in [End Of Task or EOT].

Any benchmark must look at Start Of Task [SOT] and End Of Task [EOT] to be an effective evaluation of comparative performance.  Tasks that run without regard to synchronized times are called Task Independent Runners [TIR].  If your operating system doesn't know what a TIR is, it will most likely fudge the results anyway.

This is all generally considered to be Distributed Processing.  One or two processors should also be dedicated diagnostic slaves; that means should any error occur, it can be simulated step by step on a Maintenance Diagnostic Buss by the Master Processor which can then single step a slave processor and recreate the conditions existing at the time of the failure, such as a crash or other "timing problem."  With 8 parallel processors, if there are no associated actual physical Input/Output Modules [IOM] and the IOM is replaced by various controllers, such as PCI, APCI, DMA, and the like, then system performance may seem super fast, but will, in actuality, be probably 20 times less effective than with real IOM's.

And the most critical path, the Interrupt Bus, not being a true 64-bit Interrupt Bus, means that it's going to be slower than the true design concept of "Highly parallel, multiprocessor, multitasking systems" anyway.  How do you control timely interrupt of 8 parallel processors with emulated software interrupt systems?  The answer is "a whole lot slower than with real, physical, and completely separate 64-bit Interrupts."

A design that complies with the above will, generally, be hundreds of times faster than those that implement these concepts with software emulations, such as plug and play, and other devices that conflict with the design specifications for true 64-bit systems.

Which is why the specification of the system is the most important factor and needs to be known from an engineering technical document.

You should also look at other microprocessor based systems specifications.

The specification for a true system, by the way, would read as an "8x8" or "8x8x8" system, that is, 8 processors, 8 IOM's, and 8 MCM's.  That implies 8 physical busses between all modules.  Which, if in the host controller world of PCI, DMA, APCI, and such, means 8 of each also.  In short, you need 8 of everything that is core to the basic definition of what a computer system is.  This doesn't include peripherals, such as drives and other devices, simply the old-time basics: a Central Processing Module, an Input/Output Module, a Memory Control Module, a Diagnostic Control Module, and Peripherals [aka 'devices'].

The cabling or printed circuit traces between all core computer system parts is a given and should be equivalent to the bus width of the system itself.  Which explains why the physical Interrupt Bus has to be the same width.

If you're going to go with HP, you should at least consider the ProLiant DL580 G3.

Top Expert 2014
Put the database on each and run benchmarks.  Pick the winner.

More RAM means more caching/buffering, which means less I/O, which means better performance (NORMALLY).  If your database is large enough to take advantage of 16 GB of RAM, this will be a benefit.
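[Editor's note] The RAM-vs-I/O point above can be sketched with back-of-envelope arithmetic. The latency figures below are illustrative assumptions, not measurements: roughly a microsecond for a buffer-cache hit versus roughly 10 ms for a disk seek.

```python
def effective_access_ms(hit_ratio, ram_ms=0.001, disk_ms=10.0):
    """Average data-access cost given the fraction of reads served from RAM.

    The latency constants are illustrative assumptions: ~1 microsecond
    for a cache hit, ~10 ms for a disk read.
    """
    return hit_ratio * ram_ms + (1 - hit_ratio) * disk_ms

# Doubling RAM from 8 GB to 16 GB might push the buffer-cache hit
# ratio from, say, 90% to 98% on a cache-friendly database:
print(effective_access_ms(0.90))  # ~1.001 ms average
print(effective_access_ms(0.98))  # ~0.201 ms average
```

Because disk is so many orders of magnitude slower than RAM, even a few extra points of hit ratio dominate the average, which is why the 16 GB box can win despite slower CPUs.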

If you have a lot of queries that don't take much CPU, then more slower processors are better (if the hardware is designed correctly) because you can run them concurrently.  If you have a few queries that need a lot of CPU, then fewer faster processors would be better.
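[Editor's note] The trade-off above (many slow CPUs vs. few fast ones) can be modeled with a simple sketch. The per-query cycle count is a purely illustrative assumption; the point is the shape of the comparison, not the numbers.

```python
def throughput_qps(n_cpus, clock_mhz, cycles_per_query):
    """Aggregate queries/second if light queries spread perfectly across CPUs."""
    return n_cpus * clock_mhz * 1e6 / cycles_per_query

def single_query_seconds(clock_mhz, cycles_per_query):
    """Latency of one heavy query that can only use a single CPU."""
    return cycles_per_query / (clock_mhz * 1e6)

CYCLES = 500e6  # assumed CPU cost of one query; purely illustrative

# Many light concurrent queries: aggregate clock wins.
print(throughput_qps(8, 700, CYCLES))   # 8-way 700 MHz  -> 11.2 qps
print(throughput_qps(2, 2400, CYCLES))  # 2-way 2.4 GHz  ->  9.6 qps

# One heavy single-threaded query: per-CPU clock wins.
print(single_query_seconds(700, CYCLES))   # ~0.71 s
print(single_query_seconds(2400, CYCLES))  # ~0.21 s
```

Under these assumptions the 8-way box has more total clock (5.6 GHz vs. 4.8 GHz) and wins on throughput, while the 2-way box wins on any single query, which is exactly the distinction the comment draws.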

Go to www.tpc.org and look at the TPC-C benchmark results by hardware vendor.  They have results for the DL760 with 8x900MHz, the ML530 with 1x2.4GHz, and the ML530 with 1x1GHz.  Based on those results, a DL760 with 8x700MHz should perform better than an ML530 with 2x2.4GHz on the TPC-C benchmark.  However, your application may perform totally differently.
dlongan, Director of IT

One thing that gets overlooked is the speed and architecture of the memory and I/O buses (system bus, disk subsystem...).  The 700 MHz CPU is quite old, thus we can assume the memory and I/O buses are quite old also.  My best guess is the newer dual 2.4 GHz system will outperform the older 8-way.

Most servers are I/O bound, not CPU bound.
Top Expert 2014
Depends on the database design and size.  On the TPC-C benchmark (not that I really believe in these, but they are the only thing around), an 8x900MHz beat a 1x2.4GHz by almost a factor of 4 (69,169 vs. 17,659).  Now, the 2.4 GHz machine only had 1 or 2 GB of RAM compared to the 16 GB of the 8x900.  And he only has 700 MHz parts, but if we assume 70% of a 900, that is still more than double the 1x2.4GHz system.
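[Editor's note] The scaling argument above checks out arithmetically. The linear-with-clock assumption is rough, as the comment itself concedes; this just makes the numbers explicit.

```python
tpmc_8x900 = 69_169  # TPC-C-style result cited above for the 8x900 MHz box
tpmc_1x24  = 17_659  # cited result for the 1x2.4 GHz box

# Conservative assumption from the comment: a 700 MHz part
# performs at ~70% of a 900 MHz part.
est_8x700 = tpmc_8x900 * 0.70

print(round(est_8x700))           # estimated 8x700 MHz result: 48418
print(est_8x700 / tpmc_1x24 > 2)  # True -- "more than double" holds
```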

Both of these boxes have 100 MHz PCI-X slots (the DL760 has 10 and the ML530 has 7); however, the ML530 has a 400 MHz bus vs. 100 MHz for the DL760.  This will make a big difference if you have to pump a lot of stuff through memory.
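[Editor's note] The bus-speed gap is easy to quantify. The 64-bit width below is an illustrative assumption for a peak-bandwidth comparison; the point is that bandwidth scales linearly with bus clock at equal width.

```python
def bus_bandwidth_mbps(clock_mhz, width_bits=64):
    """Peak bus bandwidth in MB/s (64-bit width assumed for illustration)."""
    return clock_mhz * 1e6 * width_bits / 8 / 1e6

print(bus_bandwidth_mbps(100))  # DL760-class 100 MHz bus ->  800 MB/s peak
print(bus_bandwidth_mbps(400))  # ML530-class 400 MHz bus -> 3200 MB/s peak
```

At equal width, the ML530's bus moves four times the data per second, which is the "pump a lot of stuff through memory" advantage described above.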

LandShark, you've told us what boxes you are looking at for the DBMS.  What are you planning to use for the application server?

"Most servers are I/O bound, not CPU bound" is more like "Most servers are external host controller bound, not CPU bound," as there appears to be no true I/O module in any PC.
dlongan, Director of IT


You're absolutely correct; it does depend on the design and size of the database...

But as you pointed out, the 2-way box has a 400 MHz bus versus 100 MHz.  And what about the disk subsystem?  SCSI or SATA?  Which version?  Memory configuration?  RAID, and how is it set up?

My point in all of this is that processor speed and quantity is really only one factor to look at.  You can have the fastest CPU but only a pinhole to pump the data in and out of the system...

I still do not understand why people simply refuse to understand that serial is slower than parallel.  No, you do not want SATA!  You want PATA, IDE-64, PCI-64, or SCSI [which is now 160 bits wide and can handle fully multiplexed operation of two complete 64-bit busses simultaneously].

SATA is the pin hole you were talking about.
dlongan, Director of IT


Take a chill pill.  I don't think anybody has recommended SATA; it was only showing how there ARE different configurations...
I don't take pills; those are for the habitually dependent.  As for the database, in a distributed system using Vector Indexed Array Relative records, there should be no limit to size nor any effect on its efficiency, since every transaction is reduced to two simultaneous address resolutions, even across networks, treating the entire network as a virtual memory.  This, of course, is limited by the address boundaries at various points, such as the 30 gig boundary, the 180 gig boundary, and the terabyte boundary.  It should be seamless, however, with 64-bit double-precision [really 96 bits of actual address couple] addressing as is found in modern systems.  I think dual Xeons can handle that though.

The speed is dependent more on the architecture of the busses; multiplexed, simultaneous, multilayered [on the various levels of the laminated printed circuit] and independence of operation through PCI, APCI, and other host controllers, which allows up to 8 simultaneous operations between any given set of PCI or other busses [the I/O busses between IOM and host control channels] synchronized with a DMA controller [the Memory Control Module], which bypasses entirely any required intervention of the processor, except that of initializing the transaction and marking it complete when finished.

A 400 MHz bus is already multiplexed at twice the maximum speed of a 200 MHz rate; an 800 MHz bus is simply four 200 MHz busses operating in parallel.  See the appropriate Intel, AMD, or other specifications.  It's done by multiplexing and time-slicing the various clocks controlling data flow.

The rating of the clocks, such as 2.4 GHz, is merely the resonant crystal frequency and not the true speed at all.  This is always divided down to accommodate the processor and other components.  It's a very deceptive system of rating CPU speed.

To get the three-year warranty and support, you may have to go with the latest, which is currently the ProLiant DL580 G3 or something similar, and an upgrade to a current server operating system to manage the older systems you have.
Top Expert 2014

What?  Could you point me to something I could read that would explain what your first paragraph is saying?  I understand the rest of the paragraphs, but that first one has me stumped.

The Intel and AMD design diagrams along with the programmer's reference manual should explain most of paragraph 1.  Vector Indexed Arrays are merely arrays that use Matrix vector indirect referencing and addressing; that is, a pointer or descriptor points to another pointer or descriptor in a table of n-dimensions.  Much like Matrix algebra, the algebraic sum of the two vector products is the absolute address, be that onboard, offboard, on hard drive, or somewhere on the network [virtual space].  This has been recently implemented in the hardware design.  So, your array pointers can be summed down to two vectors, much like double-sideband suppressed-carrier radio transmission and encoding.  Processing the two vectors simultaneously results in a one-clock conversion of data request to absolute address for fetches and stores.  This is then kept in an associative array until a burst is issued, out of RAM, to move results from RAM to permanent storage and back.

Array rows, data descriptors [the whole explanation], various pointers, but mostly hardware design concepts which affect data speed throughput and so on.  The first boundary at present is the backwards-compatible 64-bit to 32-bit boundary; if a 32-bit fetch on an address greater than two gig [as in 64 bits of address couple] is executed by a 32-bit driver, there is a seven-clock timing required for the fetch while the system aligns the address.  If, on the other hand, a 64-bit routine executes, and it can fetch or store on a single clock, and the hardware has no time to synchronize, then the fetch or store often goes awry because it's so fast that the data is thought to exist before the command is executed in a look-ahead preprocessing system.  Systems today claim to be of this type.  The result has been quite a few errors, as the programmers of drivers were not quite aware of just how fast these systems are.

Today's systems, processors mostly, effectively mask an operation to zero clock time unless specifically told otherwise.  This leads to race conditions and errors.

So, if your board has 32-bit busses and 64-bit busses and any host controller, PCI, DMA, IDE, is told to fetch a 64-bit address, it takes time to calculate what it thinks will be a double-precision word, but if that word is supplied from a 32-bit driver to a 64-bit bus, the result is there the instant it receives the address couple and before data can be initialized; thus, the result is "unpredictable" and errors ensue.  Usually one-bit correctable errors, but enough of them will cause error processing and sometimes a crash.

Intel has special operators for this, causing re-entrant code restarts at the microprocessor level, built in hardware operators.  If the drivers do not match this instruction set, performance is mandatorily reverted to the backwards compatible 32-bit instruction set.  Which can result in timing errors.

Database Theory is about equivalent to Matrix Theory.  You only need two vectors to locate any item in a database, regardless of the number of dimensions.  So Database Theory might also be a good place to look for explanations.  Then there would be Matrix Theory itself.

2^64 is beyond any address you will find in this time and space on this planet.  Therefore, the operation of such addressing applies effectively to all known addresses.  Even at 48 bits the address space is beyond all current addressing: 2^48 - 1 = 281,474,976,710,655 decimal {256 terabytes of byte addresses}, or:
[(1 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000)-1] binary as 48 bits of address couple.

If you look at it, all you really have to do is come up with a schema to encode this humongous address into two half-sized vectors to find anything in that array size, regardless, again, of number of dimensions.

And if you can do this at the hardware design level, that is, let the hardware calculate the vector address, then the hardware can do it in either one clock or, with look-ahead logic design, in effectively zero clocks.

Basically instantaneous access to database records.  Not counting the time to send the data across networks and wires, strictly regarding the effective address request to data result time for a data storage scheme or device.

You could also look up Federal and Special Systems Design Architecture for computers since about 1975, when this schema began and was perfected in 1978 or thereabouts and implemented in 64-bit mainframe systems.

It's actually very simple and very elegant mathematically: any object in any n-dimensional frame of reference can be relationally located by only two variables if and only if those two variables are n-sum encoded.

Meaning, you only need to know two things about something to find it.  And finding it is as instantaneous as knowing the two things about it.

Of course, physical devices take real time to produce results, but the time is optimized when Matrix Theory is applied to them and the fastest possible rate of storage and retrieval is thereby effected.  You can also call it N-Dimensional Theory.

I know it may be hard to grasp, but it is real and it does exist and is used currently.  Microprocessors have just caught up to this design, so it's a little new to everyone.

Whatever Intel or AMD or Cyrix call it, vector indirect reference or vector indirect addressing, it still boils down to the same thing: manipulating arrays and databases using dual indexed pointers and/or descriptors.  You can also look up these terms and figure them out in depth.

But remember, it's a hardware concept and design which then begets software techniques, because the hardware is always faster than the software.

Similar to: If I know one side and the included angle, for a right triangle I know the whole triangle.  Do you see this?  That is the basis of vectoring.

I hope so.
Top Expert 2014

Ah, I do understand the memory addressing and matrix theory.  The part that got me from your original statement was "... even across networks, treating the entire network as a virtual memory."  I was trying to figure out how that would work.  Some of what you wrote is above my head, but I think I get the general idea.

But that would require a central box that has the complete database indexed in memory.  Would it not?

But I think we are getting a bit off topic, and I doubt his database is that large, but it could be applied to a single box with a small database.  I have worked with a couple of applications that sound like they did that: built the database in memory at boot time.  These were small databases, as they were on boxes with about 4GB of RAM, and they were 99% reads with only a few (3-4) updates a day.
LandShark, Manager, Network Systems & Help Desk Services


Thanks to all for your comments and suggestions.  I wanted to get some idea on some type of benchmarks plus additional insight on what else to consider besides CPU and Memory.  This will be our Disaster Recovery server and will be hosting a SQL DB of 450GB.  Thanks to GenEric and giltjr for your help and comments.  