The development of cache memory systems

Posted on 2003-11-23
Last Modified: 2010-04-26
I would like to know about the development of cache over the last 35 years
Question by:reededdie
LVL 18

Expert Comment

ID: 9806437
Did your instructor ask this question or is there a reference in the textbook? Which textbook?
LVL 13

Expert Comment

ID: 9806525
Cache memory was always left up to motherboard manufacturers until Intel developed the 486 CPU.  You would only find it on motherboards, not on the chip itself.  With the advent of the 486, Intel started putting 8 KB of SRAM cache into the chip itself, which we would later call "L1" cache, with "L2" cache being the cache on the board.

Later, L1 and L2 cache would both ship with the processor itself, starting with the Pentium Pro and Pentium II.  Up until then, all 586-class boards (for the Pentium and AMD K5/K6) offered L2 cache on board; usually 512 KB of 'pipeline-burst' cache was offered on the 586-class boards.

Now you can get L1, L2 and even L3 cache on-chip.  The Intel P4 'EE' 3.2 GHz offers 2 MB of L3 cache on-chip, maximizing its potential as a CPU.  Basically, the P4 'EE' 3.2 GHz is like an 800 MHz FSB Xeon processor, imho.

If you're interested, here's a brief history of the microprocessor I added to a question a few days ago:

"Intel's first chip for the microcomputer was the 8086.  That was in 1978.  It had a 16-bit data bus and an amazing (for the time) clock speed of 5 MHz.

In 1979 they released the 8088.  PCs were still in the experimental stage, so to speak, and most of the hardware available was 8-bit.  As such, Intel reduced the data bus in the 8088 to 8 bits, but other than that it was exactly the same as the 8086.

The 8088 was also capable of addressing 1 MB of memory.  Woo Hoo, feel the power.

IBM chose the 8088 for its first PC, and outfitted it with DOS.  Millions were sold, and personal computing was born.

Just as IBM's Personal Computer was hitting the markets, Intel was developing a new processor, the 80286.   (Sound familiar?)  The 286 ran at speeds from 6 MHz to 20 MHz and had a 16-bit data bus and a 24-bit address bus, which allowed it to address up to 16 MB of memory.  More Power!

The 386 processor was a powerhouse compared to the 8086, 8088 and 286.  It had a 32-bit wide data bus and address bus, with the ability to address up to 4 GB of memory (although nobody EVER seemed to manufacture boards that could handle it).  Clock speeds ran from 16 to 33 MHz.  The motherboards had a slot for an optional math coprocessor.

AMD and Cyrix were also producing 386s, and up until this time Intel didn't seem to mind sharing the technology.  After this third generation, though, that would change.

Later, Intel came out with a 16-bit data bus version of the 386, and called it the 386SX.  It was less expensive, and because of the new 'SX' name, the original was renamed 386DX.   Intel marketed the 386SX as an entry-level system.

The fourth generation of Intel's processors (the 486) was aimed at improving the performance of what they had.  They maintained the 32-bit address and data bus, and the speed of the system bus remained at 25 and 33 MHz.  There were still 3 operating modes (which carried over from the 386, BTW): real mode, protected mode and virtual 8086 mode.   Intel also integrated a math co-processor into the 486 chip.

Another thing they decided to incorporate into the chip was a small amount of very fast SRAM (8 KB) for an internal cache.  Up to this point, cache installation and support had been left up to the motherboard manufacturers.  Just having this small amount of internal (L1) cache alone, without having to travel the system bus for access, would speed things up!  But Intel learned that by "pipelining" instructions through the stages of the CPU that dealt with them, the processor could be working on more than one instruction at a time.

This allowed the 486 to complete up to an entire instruction in a single clock cycle.  Before this, processors averaged considerably less than one instruction per clock cycle.

As with all things Intel, the 486 split into two versions, the 486SX and 486DX.  The 486SX was essentially the same, but without the math co-processor.  Later development of the chip allowed for dissociation from the system clock: an internal multiplier inside the CPU could increase the internal operating speed of the processor by 2X and 3X.  The new versions were named the 486DX2 and the 486DX4 (although it was called the DX4, the internal operating speed was only increased by 3X).  Now there was a whole family of fourth-generation CPUs from Intel.  With system bus speeds of 25 and 33 MHz, the processors ran at 25, 33, 50, 66, 75, and 100 MHz."
LVL 13

Expert Comment

ID: 9806550
FYI also: it was in 1998 that Intel's low-cost Celeron processor, code-named Mendocino, as well as AMD's upcoming K6-3 chip and future devices from Cyrix, all embedded L2 cache memory on-die, boosting logic-to-memory bandwidth and increasing performance.

So by 1998, new processors being sold were embedding both L1 and L2 cache into the chips.

This, btw, put a bit of a 'hurt on' the SRAM manufacturers who up until then could rely on motherboard manufacturers needing to use their SRAM for cache in their product.

There's a good article on all that here, from 1998:

LVL 13

Expert Comment

ID: 9806556
More for you, basically definitions:

Cache Memory

A small fast memory holding recently accessed data, designed to speed up subsequent access to the same data. Most often applied to processor-memory access but also used for a local copy of data accessible over a network etc.

When data is read from, or written to, main memory a copy is also saved in the cache, along with the associated main memory address. The cache monitors addresses of subsequent reads to see if the required data is already in the cache. If it is (a cache hit) then it is returned immediately and the main memory read is aborted (or not started). If the data is not cached (a cache miss) then it is fetched from main memory and also saved in the cache. The proportion of accesses that result in a hit is known as the hit rate. This depends on the cache design but mostly on its size relative to the main memory. The size is limited by the cost of fast memory chips.

The hit rate also depends on the access pattern of the particular program being run (the sequence of addresses being read and written). Caches rely on two properties of the access patterns of most programs: temporal locality - if something is accessed once, it is likely to be accessed again soon, and spatial locality - if one memory location is accessed then nearby memory locations are also likely to be accessed. In order to exploit spatial locality, caches often operate on several words at a time, a "cache line" or "cache block". Main memory reads and writes are whole cache lines.
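To make the hit/miss and cache-line mechanics above concrete, here is a minimal sketch of a direct-mapped cache with multi-word lines. It is an illustrative toy model, not any real CPU's cache; the sizes are made-up round numbers.

```python
# Toy direct-mapped cache: each line holds LINE_WORDS consecutive words,
# and each memory line maps to exactly one slot (index) in the cache.

LINE_WORDS = 4    # words per cache line (exploits spatial locality)
NUM_LINES = 8     # total lines in the cache

class DirectMappedCache:
    def __init__(self):
        # Each slot stores the tag of the line currently cached, or None.
        self.tags = [None] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        line_no = addr // LINE_WORDS      # which memory line the word is in
        index = line_no % NUM_LINES       # slot this line must occupy
        tag = line_no // NUM_LINES        # distinguishes lines sharing a slot
        if self.tags[index] == tag:
            self.hits += 1                # cache hit: no main memory access
        else:
            self.misses += 1              # miss: fetch the whole line
            self.tags[index] = tag

cache = DirectMappedCache()
for addr in range(32):                    # sequential scan: spatial locality
    cache.access(addr)
print(cache.hits, cache.misses)           # prints "24 8"
```

Because whole lines are fetched on a miss, a sequential scan hits on 3 of every 4 word accesses here, which is exactly the spatial-locality payoff the definition describes.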

When the processor wants to write to main memory, the data is first written to the cache on the assumption that the processor will probably read it again soon. Various different policies are used. In a write-through cache, data is written to main memory at the same time as it is cached. In a write-back cache it is only written to main memory when it is forced out of the cache.

If all accesses were writes then, with a write-through policy, every write to the cache would necessitate a main memory write, thus slowing the system down to main memory speed. However, statistically, most accesses are reads and most of these will be satisfied from the cache. Write-through is simpler than write-back because an entry that is to be replaced can just be overwritten in the cache as it will already have been copied to main memory whereas write-back requires the cache to initiate a main memory write of the flushed entry followed (for a processor read) by a main memory read. However, write-back is more efficient because an entry may be written many times in the cache without a main memory access.
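The write-traffic trade-off between the two policies can be sketched with a toy single-entry cache that tracks a dirty bit (a hypothetical model for illustration, not real hardware):

```python
# Compare main-memory write traffic under write-through vs write-back
# when the same cached word is written many times before eviction.

class CacheEntry:
    def __init__(self, policy):
        self.policy = policy        # "write-through" or "write-back"
        self.dirty = False
        self.mem_writes = 0         # writes that reached main memory

    def write(self, value):
        self.value = value          # always written to the cache
        if self.policy == "write-through":
            self.mem_writes += 1    # mirrored to main memory immediately
        else:
            self.dirty = True       # deferred until the entry is flushed

    def evict(self):
        if self.policy == "write-back" and self.dirty:
            self.mem_writes += 1    # one flush covers all earlier writes
            self.dirty = False

wt = CacheEntry("write-through")
wb = CacheEntry("write-back")
for v in range(100):                # 100 writes to the same entry
    wt.write(v)
    wb.write(v)
wt.evict(); wb.evict()
print(wt.mem_writes, wb.mem_writes)   # prints "100 1"
```

This is the efficiency point in the paragraph above: write-back pays one memory write per flushed entry, write-through pays one per store.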

When the cache is full and it is desired to cache another line of data then a cache entry is selected to be written back to main memory or "flushed". The new line is then put in its place. Which entry is chosen to be flushed is determined by a "replacement algorithm".
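One common replacement algorithm is LRU (least recently used): flush the entry that has gone longest without being accessed. A minimal sketch in Python, using the standard library's OrderedDict to track recency:

```python
from collections import OrderedDict

# Toy fully-associative cache with LRU replacement.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()        # line address -> cached data

    def access(self, line, data=None):
        if line in self.lines:
            self.lines.move_to_end(line)  # mark as most recently used
            return True                   # hit
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # flush least recently used line
        self.lines[line] = data
        return False                      # miss

c = LRUCache(2)
c.access("A"); c.access("B")
c.access("A")              # "A" becomes most recently used
c.access("C")              # cache full: "B" (the LRU entry) is flushed
print("B" in c.lines)      # prints "False"
```

Real caches use cheaper approximations (pseudo-LRU, random, FIFO), but the selection principle is the same.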

Some processors have separate instruction and data caches. Both can be active at the same time, allowing an instruction fetch to overlap with a data read or write. This separation also avoids the possibility of bad cache conflict between say the instructions in a loop and some data in an array which is accessed by that loop.

Primary cache (L1 cache, level one cache)
A small, fast cache memory inside or close to the CPU chip. For example, an Intel 80486 has an eight-kilobyte on-chip cache, and most Pentiums have a 16-KB on-chip level one cache that consists of an 8-KB instruction cache and an 8-KB data cache.

Secondary cache ("Second level cache", "level two cache", "L2 cache")
A larger, slower cache between the primary cache and main memory. Whereas the primary cache is usually on the same integrated circuit as the central processing unit (CPU), a secondary cache has historically been connected to the CPU via its external bus, though newer designs bring it on-chip as well.
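The point of stacking a small fast L1 in front of a larger slower L2 is the average memory access time (AMAT). A back-of-envelope calculation, with made-up but plausible round-number latencies and hit rates:

```python
# AMAT for a two-level cache hierarchy (illustrative numbers only):
# AMAT = L1_hit_time + L1_miss_rate * (L2_hit_time + L2_miss_rate * mem_time)

L1_HIT = 1        # cycles to hit in L1
L2_HIT = 10       # cycles to hit in L2
MEM = 100         # cycles to reach main memory
L1_HIT_RATE = 0.95
L2_HIT_RATE = 0.90    # of the accesses that miss L1

amat = L1_HIT + (1 - L1_HIT_RATE) * (L2_HIT + (1 - L2_HIT_RATE) * MEM)
print(amat)   # prints "2.0"
```

With these numbers the hierarchy delivers an average of 2 cycles per access against 100 cycles for uncached main memory, which is why every generation in the thread above kept pulling cache closer to the CPU.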
LVL 12

Expert Comment

ID: 9806986
>"Intel's first chip for the microcomputer was the 8086.
What about the 8080? That was the first PC processor.
LVL 13

Expert Comment

ID: 9807063
The 8080 was a very early CPU from Intel, yes.  It was released in '74 and ran at 2 MHz, from what I can recall.

Although generally considered to be the first truly usable microprocessor design, and used in some early computers, it mainly formed the basis for machines running CP/M, and wasn't considered to be a "PC" processor.

I guess what I should have specifically said was "for the personal computer", not "the microcomputer".  My mistake.
LVL 14

Expert Comment

ID: 9808343
Gee AlbertaBeef, I wish you had done my homework for me.
LVL 13

Accepted Solution

AlbertaBeef earned 500 total points
ID: 9808352
lol, why is everybody assuming this is homework?  And like I said in another place, I didn't provide anything this user couldn't have found by googling.  I don't assume something's homework unless it's really cut-and-dried, ok?

