duta

asked on

Differences between clock speed and bandwidth

Hi!

While reading previously posted answers to questions about memory and processors, I found that bandwidth and clock speed were used interchangeably (?), which was very confusing to me.

To the best of my knowledge, the performance of a computer processor is best measured by its clock speed (in MHz), and the performance of memory is best measured by bandwidth (in bytes per second).

I would truly appreciate it if you computer experts could kindly explain to this computer novice whether or not clock speed and bandwidth may be used interchangeably for a processor and for memory.

Thanks as always!

duta
(Tuesday, May 3, 2005) at 12:50 a.m.
SOLUTION
lombardp
FalconHawk

A note before I start: it's possible some of what I say is also in lombardp's post. My apologies for that, but I can't write this down without using a few of the things he also described, since they are facts ;).

A computer consists of a CPU (the processor), RAM, and a system bus (and a lot more, but these are the parts that matter for my example).
Now, let's say you have a 2 gigahertz processor and you're not happy with its speed. If you replaced it with a 3 gigahertz one, your speed would be 1 gigahertz higher, right? NO. A processor's gigahertz rating shows how many instruction cycles it can perform every second. But a processor is much faster than the RAM, so most of the time it runs well below what it could, by as much as 90%. People who buy only a faster CPU generally won't see much of a speed increase.

Why is that so? Why would a computer work at only 10% of its capacity?
The problem is that the CPU is simply much faster than, for example, the RAM. Typical RAM runs at only 266 megahertz, and the system bus, the connection between the CPU, the RAM and every other device, runs at 400-800 megahertz on newer computers. You can easily see that this bandwidth, the amount of data that is transferred per second, is a lot lower than the maximum the CPU could take. Clock speed is simply the processor's speed, and bandwidth is the maximum data rate, and the slowest of the two ends up determining the PC's effective speed.
Simple answer?

Clock speed: how many cycles there are per second, i.e. how many times a tire rotates while the car is being driven.

Bandwidth: how many cars (and their wheels) can go down a highway at the same time.

A two-lane highway = low bandwidth; a 28-lane highway = abundant bandwidth. After all, a lot more cars can go down a 28-lane highway at once than a two-lane one.
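To put some (entirely made-up) numbers behind that highway picture, here is a minimal sketch: throughput is lanes times the rate per lane, and a chain of stages can only move data as fast as its slowest stage. The lane counts and rates below are illustrative assumptions, not real hardware figures.

# Hypothetical illustration of the highway analogy.
def cars_per_hour(lanes, cars_per_lane_per_hour):
    # "Lanes" play the role of bandwidth (width); the per-lane rate plays the role of clock speed.
    return lanes * cars_per_lane_per_hour

two_lane  = cars_per_hour(2, 1000)   # 2,000 cars/hour
wide_road = cars_per_hour(28, 1000)  # 28,000 cars/hour

# A chain of stages (CPU, system bus, RAM) moves data no faster than its slowest stage.
stage_capacity = {"cpu": 28000, "bus": 4000, "ram": 2000}
effective = min(stage_capacity.values())
print(two_lane, wide_road, effective)  # 2000 28000 2000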
FalconHawk, actually modern CPUs do not work at 10% capacity, thanks to several architectural elements used to "hide" the external memory bottleneck.

It is true that external memory bandwidth and latency do not match the theoretical CPU limits, but in 99.9% of applications they are enough.

BANDWIDTH: Memory bandwidth is the maximum number of bytes per second that can be transferred from external memory to the processor. The CPU processes data at a higher rate, but usually only for very limited amounts of time. Since in real applications the full bandwidth is needed only in short bursts, internal cache memories provide this additional bandwidth in order to sustain those peak requests.
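As a rough sketch of that point (all numbers below are assumptions for illustration, not measurements): if the CPU needs its peak rate only in short bursts, a cache that absorbs those bursts lets a slower external memory keep up on average.

# Hypothetical peak demand vs. what external memory must sustain on average.
peak_demand_gb_s = 20.0     # assumed rate the core could consume during a burst
burst_fraction   = 0.05     # assumed fraction of time the core actually runs at peak
sustained_need   = peak_demand_gb_s * burst_fraction   # 1.0 GB/s on average

external_memory_gb_s = 2.1  # assumed external memory bandwidth
print(sustained_need, sustained_need <= external_memory_gb_s)  # 1.0 True: RAM keeps up, cache absorbs the peaks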

LATENCY: Memory latency is the time between a memory request and the actual availability of the data. Latency matters more and more today because of very long CPU pipelines and very high clock speeds. Latency is an absolute value measured in ns (nanoseconds), independent of the CPU clock, so the higher the CPU clock, the greater the number of clock cycles "lost" waiting for data from memory. Internal cache memories have the primary task of "hiding" the latency of external memory, and there are two levels of cache because the Level-1 cache has to be as fast as possible, while the Level-2 cache has to be as big as possible.
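Since latency is a fixed number of nanoseconds, the same wait costs more clock cycles on a faster CPU. A quick back-of-the-envelope sketch (the latency value is assumed for illustration):

# Cycles lost waiting = latency (ns) * clock rate (GHz), because 1 GHz = 1 cycle per ns.
memory_latency_ns = 60  # assumed external memory latency

for clock_ghz in (1.0, 2.0, 3.0):
    lost_cycles = memory_latency_ns * clock_ghz
    print(f"{clock_ghz:.0f} GHz CPU stalls about {lost_cycles:.0f} cycles per uncached access")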

Moreover,
Athlon 64 CPUs have an integrated memory controller for external memory in order to keep latency as low as possible. The Pentium 4's Hyper-Threading is intended to share execution units across different threads, so that one thread can run while another is waiting for data from memory, and so on.

There are even more sophisticated enhancements, such as out-of-order execution, data prefetching, speculative execution, and so on...
ASKER CERTIFIED SOLUTION

SOLUTION
First of all, you should consider the relation between the clock speed and the instructions executed.

Inside the processor, an instruction is executed in several parts. E.g. if you add two numbers, the separate bits of the numbers are each added in a different element of the processor (each of those is called a carry save adder, CSA). It is very important that these all work perfectly synchronously. For that purpose a signal (the clock) is used to synchronise the elements. Each time the clock ticks, the next step can be taken. The speed at which the clock ticks is expressed in MHz.

Now, the number of instructions executed depends on the clock speed. However, it also depends on the "instruction set": in some processors certain instructions take many more clock cycles than in others. So you can compare two processors of the same type (e.g. two Pentium 4s) by comparing their clock speeds, but you cannot compare two different processors (e.g. a Pentium 4 vs. a Celeron) on the basis of their clock speeds alone. When comparing, be sure there are no other elements which differ: you cannot compare, e.g., a Pentium 4 with 128 kB of cache against one with 256 kB of cache if you only take clock speed into account.

So roughly speaking, clock speed relates to processing speed.
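One way to see why clock speed alone does not settle the comparison is the classic execution-time relation, time = instructions x cycles-per-instruction / clock rate. The instruction count and cycles-per-instruction values below are invented purely for illustration:

# Hypothetical comparison: a lower-clocked CPU can still finish sooner
# if its instruction set needs fewer cycles per instruction.
def run_time_seconds(instructions, cycles_per_instruction, clock_hz):
    return instructions * cycles_per_instruction / clock_hz

program = 1_000_000_000  # assumed instruction count

cpu_a = run_time_seconds(program, 2.0, 3.0e9)  # 3 GHz, 2.0 cycles/instruction -> ~0.67 s
cpu_b = run_time_seconds(program, 1.2, 2.4e9)  # 2.4 GHz, 1.2 cycles/instruction -> ~0.50 s
print(cpu_a, cpu_b)  # the "slower" 2.4 GHz CPU wins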

For data transfers, such as from memory to the processor, there is also a clock (different from the one which regulates the speed of the processor, though the two are synchronised). Per clock cycle a certain number of bits is transferred. The product of the number of bits transferred per cycle and the clock speed (of the bus) is called the bandwidth.
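For example (the bus width and clock below are assumed for illustration, not the specs of a particular system):

# Bus bandwidth = bits per transfer * transfers per second, expressed here in bytes.
bus_width_bits = 64      # assumed bus width
bus_clock_hz   = 400e6   # assumed 400 MHz, one transfer per clock cycle
bandwidth_bytes_per_s = (bus_width_bits / 8) * bus_clock_hz
print(bandwidth_bytes_per_s / 1e9, "GB/s")  # 3.2 GB/s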

Hope this helps
I believe this is what you're looking for; there are pictures (figures) on the page that help you understand it visually. I'd go to the link below and read that page.


CPU clock rates have experienced exponential growth, leaving the rest of the PC components behind. In the resulting high-end systems, the memory bus constitutes probably the most important bottleneck. Ramping up bandwidth only partially solves the problem, since latencies then become the primary bottleneck. Reducing latencies by means of faster strobes is technically difficult and economically not viable, particularly if the bus speed is increased to approach 200 MHz. Alternative solutions encompass combined SRAM-DRAM designs that, at minimal overhead, mask latencies by uncoupling data output from the DRAM array. A simple row cache architecture, employing time-multiplexed internal buses to load entire rows into centrally located 8 kb SRAM cells, can function as an output buffer. Thus, the DRAM array can be precharged ahead of time to avoid page-closing latencies as well as refresh penalties.


DRAM Performance: Latency vs. Bandwidth

Between 1994 and 1997, the discrepancy between the memory bus frequency and the clock speed of the processor was about 3 times with internal clock multipliers of 3 or 3.5 depending on the CPU speed grade. In August 1998, this situation drastically changed with the introduction of the Pentium II and the multiplier-locked versions of the Celeron. The clock multiplier increased to 4-5 times the system bus (front side bus, FSB). The Pentium II bus speed increased to 100 MHz while the Celeron remained at 66 MHz creating high-end and low-end processor families segmented by memory performance. The larger processor clock multiplier is an indication of the growing gap between CPU and memory speed.

Source: http://www.ntsi.com/DDRRam_Explained.htm
duta

ASKER

Wow from duta:

Thank all of you so much for your great, super-prompt response to my question.

I am so impressed as much with the depth of your computer knowledge as with the depth of your heart to help out a computer novice.

I read all of your answers line by line, word by word with my utmost appreciation.

I am at a loss as to how I should split the points promised, because all of your answers are just great.

By the way, using this opportunity, I wonder whether I may ask a question relevant to this thread:

How is a bus best measured? By clock speed or bandwidth?

Many, many thanks to all of you who kindly responded to my question.

Best regards,

duta
Tuesday (May 3, 2005) at 9:03 a.m.
> How is a bus best measured? By clock speed or bandwidth?

Bandwidth. The primary function of a bus is data transfer, so clock speed is not relevant as long as a certain bandwidth is provided.

duta

ASKER

TO: lombardp:

You amazed me again with the speed at which you responded to my question.

Great people like you make this site such a great place for a novice like me to learn.

I will come back to you after reading responses from some more people.

Many thanks to you!

Best regards,

duta
Tuesday, May 3, 2005, at 9:13 a.m.
When you say bus, they also improved those by taking them from 133 MHz or thereabouts up to 400 and even 800 MHz on the P4 systems, didn't they? Correct me if I am wrong. So to some extent surely the MHz clock speed affects it to some degree? Anyway, I am not entirely sure of this, but I figured I would post it just to see what the other experts' input on it was :)
gecko_au2003, you are right.

If the same bus is clocked higher, you obtain a higher bandwidth.
But in general an 800 MHz bus can still have a lower bandwidth than a 400 MHz bus.
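A quick hypothetical comparison shows why the clock alone does not decide it; the bus widths here are made up purely for illustration:

# Bandwidth = width in bytes * clock (assuming one transfer per cycle).
narrow_fast = (16 / 8) * 800e6   # hypothetical 16-bit bus at 800 MHz -> 1.6 GB/s
wide_slow   = (64 / 8) * 400e6   # hypothetical 64-bit bus at 400 MHz -> 3.2 GB/s
print(narrow_fast < wide_slow)   # True: the 400 MHz bus has the higher bandwidth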

duta

ASKER

TO: lombardp:

Thank you so much for your kind, prompt responses.

I just wonder whether I may ask one more question relevant to this topic:

Are L1 and L2 cache memories best measured by bandwidth, because they are a type of memory?

Thanks as always!

duta
Tuesday at 10:12 a.m.



Most buses can even run faster if you mod the system a bit. In fact, most PC manufacturers throttle the system down because of compatibility issues. Think of it like a speed limiter on a car: it could go faster, but it is limited. Most BIOSes even have an option to increase the bus speed, and some can also increase the CPU speed. Doing this is really simple: access the BIOS and change the setting. But keep in mind that it can make the system unbootable. No worries though, you can just change it back with no lasting harm.

gecko_au2003  wrote:
Correct me if I am wrong ? So to some extent surely the mhz clock speeds affect it to some degree?

True, it is a two-sided coin. More bandwidth means more data can be transferred every cycle, and more megahertz means those transfers happen more often. Look at it like this: I can move 1 kilo per trip with a trip every 100 minutes, or 2 kilos per trip with a trip every 200 minutes, and the overall rate is the same. The "weight per trip" is the bandwidth, the "minutes per trip" relates to the clock speed. This is just an EXAMPLE, so please don't go messaging me with all the little and big details I left out.

duta wrote:
I should be at a loss how I should split the points promised because all of your answers are just great.

Simply select an accepted answer and assisted answers; the points will then be divided among them.
Oh, and a little off topic now: glad to help ya ;) It's kinda rare an asker says thank you in such an... abundant way as you did in that post ;). (NO, this isn't fishing for points, just something I really wanted to say, since it makes me kinda happy ;) )
Are L1 and L2 cache memories best measured by bandwidth, because they are a type of memory?

L1 and L2 caches are special types of memory, created to make the PC more efficient. The L1 cache is a very small piece (a few KB only) of high-speed memory. This memory costs a lot, so it is not used everywhere. When a processor has finished its current instructions, it normally has to wait before the bus can deliver the next ones, and that causes a LOT of waiting. The L1 cache is a lot faster than the bus, so the instructions and data that are ready are kept there until the rest of the components can catch up. In the meantime the CPU can keep on calculating, so it is much more efficient.

L2 cache is also high-speed memory, faster than normal RAM but slower than the L1 cache. It is usually around a MB in size.
The L2 follows the 80/20 rule, a programmer's rule of thumb: 20% of the code runs 80% of the time. This is because programs normally contain loops, recurring event patterns like checking which key is pressed or showing that key on the screen. That is about 20% of the total program; the other 80% is used only at startup or rarely. That 20% is kept in the L2 cache, which is a LOT faster than regular memory, so the most-used things can be executed faster.

Now: bandwidth or clock speed? Bandwidth, since it is indeed memory. And even though the cache is faster, the CPU is STILL a lot faster.
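A rough way to see why the cache hierarchy pays off is the average-memory-access-time calculation. The hit rates and latencies below are assumptions for illustration, not figures for any real CPU:

# Average memory access time with two cache levels:
# AMAT = L1_hit_time + L1_miss_rate * (L2_hit_time + L2_miss_rate * RAM_latency)
l1_hit_ns, l1_miss_rate = 1, 0.10   # assumed L1 hit time and miss rate
l2_hit_ns, l2_miss_rate = 5, 0.25   # assumed L2 hit time and miss rate (of L1 misses)
ram_ns = 60                         # assumed external memory latency

amat = l1_hit_ns + l1_miss_rate * (l2_hit_ns + l2_miss_rate * ram_ns)
print(amat, "ns on average")        # 3.0 ns, versus 60 ns with no caches at all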
Go to the link below; it has a great explanation. Falcon has a good description too, but it could confuse some people. The info here explains L1 and L2 in a little more detail, and why L2 is bigger than L1.


Most processors come with two levels of cache. The first level is integrated within the processor core itself and is thus the fastest. Data stored there can be used by the processor at no cost in clock cycles at all. The next level of cache, called the L2 or secondary cache, is situated outside the core and usually runs at a lower clock speed than the core, though it may also run as fast as the core itself. However, even if it runs at the same clock speed as the core, it will be slower than the L1 or primary cache because it is not part of the core itself.

In this article, we will be looking at the L2 or secondary cache. It's situated outside the core, either within the same package or in a separate package. So, its throughput will be lower than the L1 or primary cache's. However, the size of the L2 cache is always much larger than the L1 cache and if it runs fast enough, its throughput can come close to that of the L1 cache. So, the L2 cache plays a very important role in maintaining the high memory throughput to the processor.

Source:
http://www.adriansrojakpot.com/Speed_Demonz/L2_Cache_Latency/L2_Cache_Latency_02.htm
Actually, all modern CPUs have the L2 cache integrated within the CPU core together with the L1 cache.
The latest Intel Xeons also have 1 MByte of on-die L3 cache.

duta

ASKER

TO: All :
FROM: duta

I am totally amazed at the depth of your computer knowledge and of your heart (as shown in your enthusiasm for teaching a computer novice like me). I wish I had a million points at my disposal so that I could properly show my appreciation.

By the way, since I am meeting a lot of computer experts here, I just wonder whether I may ask a question which is relevant to a computer but not relevant to this topic:

My question is: What is the definition of a "data file (or file data) server" in comparison to a "file server"?
I have already posted this question in a different thread under Windows 2003 server(https://www.experts-exchange.com/questions/21410444/What-is-data-file-server-or-file-data-server.html), but I haven't gotten a clearcut answer.

Please forgive me if, by asking one question after another here, I have violated (?) any site rule or policy.

Finally, I am going to accept an answer to my original question in this topic very soon (I know that many of you are answering questions on this site not for points, but to share your knowledge with others in a great spirit of self-sacrifice and of educating the less educated).

Many thanks to all of you as always!

duta
Wednesday at 11:40 a.m.
I love working on this EE forum, a great place to learn and help others out :) Points are no biggie to me; as you can see on my profile, I have participated in a lot more questions than I have actually answered, and it is the answered ones that accumulate my points :)

If that makes any sense lol
duta

ASKER

TO: lombardp:

I chose your answer as the accepted solution (worth 300 points), and split the remaining 200 points equally between FalconHawk and gecko_au2003, whom I chose as assisted solutions.

But for some reason I cannot comprehend, your reply is shown as "Assisted Answer". I would like to do everything I can to correct this mistake, for which I think Experts Exchange may be responsible.

For this mistake I would like to extend my sincere apology, and I hope you will generously accept it.

Finally, I am very grateful to all of you for so generously sharing your knowledge with a novice like me.

God bless you!

duta
Thursday (May 5, 2005) at 12:20 a.m.

TO: duta

No problem! It's ok.
There's no need to correct the mistake.

Thank you for your offer.

Thanks for the points, and thanks also to the experts who helped here ;). This will undoubtedly make a great solution for anyone who searches for one. Keep up the good work, everyone ;)
FHawk
Thanks falconhawk, any chance of a gold star lol :D *grins* J/K.

Anyway hopefully there will be more questions like this on EE :)
duta

ASKER

TO: All
FROM: duta
DATE: May 5, 2005, at 11:28 am

Thank all of you for taking time out of your busy schedules to share your expert knowledge with a novice like me.
God bless you!