CS 3853 Computer Architecture Notes on Appendix B Section 2

Today's News: October 5, 2015
Assignment 2 is available

Read Appendix B.2

B.2: Cache Performance



Example 1
Compare the miss ratios and access times of:
  1. 16KB instruction cache and 64KB data cache
  2. 256KB unified cache
Make reasonable assumptions to solve the problem.
Solution:
Assumptions:
  • Miss rates per 1000 instructions are given in Figure B.6 (on page B-15) as follows:
    16KB instruction: 3.82
    64KB data: 36.9
    256KB unified: 32.9
    These assume that 36% of instructions are loads and stores, as in some SPEC benchmarks.
    Assume a 2-way set associative cache with 64-byte blocks.
  • A hit takes 1 cycle
  • Miss penalty is 50 cycles
  • A load or store takes an extra cycle because of the structural hazard in the case of the unified cache.
  • Ignore stalls due to write-through.
miss ratio_split = (3.82 + 36.9)/(1.36 × 1000) = .02994
miss ratio_unified = 32.9/(1.36 × 1000) = .02419
The unified miss ratio is better!
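
As a quick check, here is a short Python sketch of the same conversion from misses per 1000 instructions to a miss ratio; the Figure B.6 counts and the 1.36 accesses per instruction are the assumptions listed above, and the variable names are just illustrative.

    # Misses per 1000 instructions (Figure B.6) divided by memory accesses
    # per 1000 instructions (1.36 accesses per instruction, as assumed above).
    accesses_per_1000_instructions = 1.36 * 1000

    misses_split = 3.82 + 36.9    # 16KB instruction cache + 64KB data cache
    misses_unified = 32.9         # 256KB unified cache

    miss_ratio_split = misses_split / accesses_per_1000_instructions
    miss_ratio_unified = misses_unified / accesses_per_1000_instructions

    print(f"{miss_ratio_split:.5f}")    # 0.02994
    print(f"{miss_ratio_unified:.5f}")  # 0.02419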

The miss ratio comparison does not take into account the extra stall due to the structural hazard in the unified cache.
To calculate the average memory access time:
average memory access time = hit time + miss ratio × miss penalty
access time_split = 1 + .02994 × 50 = 2.497 cycles.
access time_unified = 1 + .36 + .02419 × 50 = 2.57 cycles.
The split access time is better!
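
A similar sketch for the average memory access times, charging the unified cache an extra .36 cycles for the load/store structural hazard, exactly as in the calculation above:

    # Average memory access time = hit time + miss ratio * miss penalty,
    # with a 1-cycle hit and a 50-cycle miss penalty as assumed above.
    hit_time = 1
    miss_penalty = 50

    access_time_split = hit_time + 0.02994 * miss_penalty
    # The unified cache pays an extra 0.36 cycles for the structural hazard
    # between instruction fetches and loads/stores.
    access_time_unified = hit_time + 0.36 + 0.02419 * miss_penalty

    print(f"{access_time_split:.3f}")   # 2.497
    print(f"{access_time_unified:.2f}") # 2.57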

The next example explores the performance of direct mapped and set associative caches.
For a given cache size, more associativity generally gives a higher hit ratio.
However, more associativity requires additional hardware (and time) to check tags, even on a hit.
This might require increasing the clock cycle time.
Example 2
Which is faster, a direct mapped cache with a cycle time of .4 ns, or
a 2-way set associative cache with a cycle time of .45 ns?
We need some additional assumptions to do this problem:
  1. 1.3 memory accesses per instruction
  2. CPI of 1 with no cache misses
  3. miss penalty of 21 ns
  4. miss rate of direct mapped cache: 2.3%
  5. miss rate of 2-way set associative cache: 2.1%
  6. these are unified caches, but with no structural hazard
Solution
First, we need to know the miss penalty in cycles for each:
miss penalty_direct = 21 ns / .4 ns = 52.5 cycles
miss penalty_2-way = 21 ns / .45 ns = 46.67 cycles
We round up the number of cycles for the miss penalty.
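
In Python, this rounding is just math.ceil applied to the ratio of the 21 ns penalty to each cycle time (a sketch using the cycle times given in the problem):

    # Miss penalty in whole clock cycles, rounded up as in the notes.
    import math

    miss_penalty_ns = 21
    penalty_cycles_direct = math.ceil(miss_penalty_ns / 0.40)  # ceil(52.5)  = 53
    penalty_cycles_2way = math.ceil(miss_penalty_ns / 0.45)    # ceil(46.67) = 47

    print(penalty_cycles_direct, penalty_cycles_2way)  # 53 47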
Second, we calculate the CPI for each:
CPI_direct = 1 + 1.3 × .023 × 53 = 2.5847
CPI_2-way = 1 + 1.3 × .021 × 47 = 2.2831
What we really want is the time per instruction:
Time per instruction_direct = 2.5847 × .4 ns = 1.0339 ns.
Time per instruction_2-way = 2.2831 × .45 ns = 1.0274 ns.
In this case the 2-way cache is better by .6%.
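
The CPI and time-per-instruction arithmetic as a Python sketch, using the assumptions above (the variable names are just illustrative):

    # CPI = base CPI + accesses per instruction * miss rate * miss penalty (cycles);
    # time per instruction = CPI * cycle time.
    accesses_per_instruction = 1.3

    cpi_direct = 1 + accesses_per_instruction * 0.023 * 53
    cpi_2way = 1 + accesses_per_instruction * 0.021 * 47

    time_direct_ns = cpi_direct * 0.40
    time_2way_ns = cpi_2way * 0.45

    print(f"direct mapped: CPI {cpi_direct:.4f}, {time_direct_ns:.4f} ns/instruction")  # 2.5847, 1.0339
    print(f"2-way:         CPI {cpi_2way:.4f}, {time_2way_ns:.4f} ns/instruction")      # 2.2831, 1.0274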

With out-of-order execution, part of the miss penalty can be overlapped with the execution of other instructions.
Example 3
Redo the above problem if 30% of the miss penalty can be overlapped.
Solution:
We just have to reduce the miss penalty by 30% in each case.
CPI_direct = 1 + 1.3 × .023 × 53 × .7 = 2.1093
CPI_2-way = 1 + 1.3 × .021 × 47 × .7 = 1.8982
What we really want is the time per instruction:
Time per instruction_direct = 2.1093 × .4 ns = .8437 ns.
Time per instruction_2-way = 1.8982 × .45 ns = .8542 ns.
In this case the direct mapped cache is faster by 1.25%.
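
And the same sketch with only 70% of the miss penalty exposed to the processor:

    # 30% of each cache miss penalty is hidden by out-of-order execution,
    # so only 70% of it contributes to the CPI.
    exposed_fraction = 0.7

    cpi_direct = 1 + 1.3 * 0.023 * 53 * exposed_fraction
    cpi_2way = 1 + 1.3 * 0.021 * 47 * exposed_fraction

    print(f"direct mapped: {cpi_direct * 0.40:.4f} ns/instruction")  # 0.8437
    print(f"2-way:         {cpi_2way * 0.45:.4f} ns/instruction")    # 0.8542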
