### Cache performance and associativity



### Write through vs write back

Write through: every write to the cache is also written immediately to the next lower level of the hierarchy

Pro:

Con:

Write back: a write updates only the cache; the modified block is written to the lower level of the hierarchy only when it is evicted from the cache

Pro:

Con:
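The difference between the two policies can be made concrete by counting how many writes reach the lower level. The following is a minimal sketch, not a model of any real controller: it assumes a toy direct-mapped cache with one-word blocks and write-allocate, and the function name `count_memory_writes` and its parameters are hypothetical.

```python
def count_memory_writes(stores, policy, num_lines=4):
    """Count writes to the lower level for a toy direct-mapped cache.

    stores: word addresses being stored to (loads are ignored for simplicity).
    policy: "write-through" or "write-back".
    Assumptions: one-word blocks, write-allocate on a store miss.
    """
    lines = {}  # cache index -> (tag, dirty)
    memory_writes = 0
    for addr in stores:
        index, tag = addr % num_lines, addr // num_lines
        if policy == "write-through":
            memory_writes += 1           # every store also goes to the lower level
            lines[index] = (tag, False)
        else:  # write-back
            resident = lines.get(index)
            if resident is not None and resident[0] != tag and resident[1]:
                memory_writes += 1       # a dirty block is evicted and written back
            lines[index] = (tag, True)   # mark dirty; lower level not updated yet
    return memory_writes


stores = [0, 0, 0, 4, 0]  # addresses 0 and 4 collide in a 4-line cache
through = count_memory_writes(stores, "write-through")
back = count_memory_writes(stores, "write-back")
```

In this sketch the write-back count covers only evictions that happen during the trace; a dirty block still resident at the end has not yet been written to the lower level.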

### Cache controller



P&H fig. 5.10


# ????

What does the memory controller do when a cache miss occurs? What does the CPU do?

### Write buffers

A way to hide the cost of writing to the lower level of the hierarchy

Example: the instruction sw t1, 4(a0) in a write-through cache

1. The write to the cache and the write to the write buffer happen simultaneously, in 1 cycle
2. The rest of execution continues while the buffer writes the data to main memory
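The two steps above hide the write latency only while the buffer has a free entry. The sketch below is a simplified timing model under stated assumptions: memory completes one write at a time, an entry stays in the buffer until its write finishes, and the CPU stalls when the buffer is full. The name `store_stall_cycles` and its parameters are hypothetical.

```python
from collections import deque


def store_stall_cycles(store_cycles, buffer_depth, mem_write_latency):
    """Return the cycles the CPU stalls on stores because the write buffer is full.

    store_cycles: sorted cycle numbers at which stores would issue with no stalls.
    buffer_depth: number of entries in the write buffer.
    mem_write_latency: cycles main memory needs to complete one write.
    """
    finish_times = deque()  # completion time of each write still in the buffer
    stall = 0
    last_finish = 0
    for t in store_cycles:
        now = t + stall  # earlier stalls delay every later store
        # Free entries whose write to memory has already completed.
        while finish_times and finish_times[0] <= now:
            finish_times.popleft()
        if len(finish_times) == buffer_depth:
            # Buffer full: wait for the oldest write to finish.
            wait = finish_times.popleft() - now
            stall += wait
            now += wait
        start = max(now, last_finish)  # memory handles one write at a time
        last_finish = start + mem_write_latency
        finish_times.append(last_finish)
    return stall


no_stalls = store_stall_cycles([0, 1, 2, 3], buffer_depth=4, mem_write_latency=10)
stalls = store_stall_cycles([0, 1, 2], buffer_depth=1, mem_write_latency=10)
```

With a four-entry buffer the four back-to-back stores never find the buffer full, while a one-entry buffer forces each later store to wait for the previous write to finish.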

# ????

What happens if data that has been evicted from the cache is waiting in the write buffer and a read instruction for that address executes?



# ????

How many physical bits of space do we need to store our 1KB cache?
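One way to approach this kind of question is to add the tag and valid bit to the data bits of every block. The sketch below assumes a direct-mapped cache, power-of-two sizes, and 32-bit byte addresses; the block size used in the example (4 bytes) is an assumption, since the cache's parameters are not restated here, and `cache_total_bits` is a hypothetical helper.

```python
def cache_total_bits(data_bytes, block_bytes, address_bits=32, dirty_bit=False):
    """Total storage bits for a direct-mapped cache (data + tag + valid [+ dirty]).

    Assumes data_bytes and block_bytes are powers of two and addresses are
    byte addresses of width address_bits.
    """
    num_blocks = data_bytes // block_bytes
    index_bits = num_blocks.bit_length() - 1    # log2(num_blocks)
    offset_bits = block_bytes.bit_length() - 1  # log2(block_bytes)
    tag_bits = address_bits - index_bits - offset_bits
    bits_per_block = block_bytes * 8 + tag_bits + 1 + (1 if dirty_bit else 0)
    return num_blocks * bits_per_block


# 1 KiB of data, assumed 4-byte blocks: 256 blocks x (32 data + 22 tag + 1 valid)
total = cache_total_bits(1024, 4)
```

Under these assumptions the cache needs 14080 bits, noticeably more than the 8192 bits of data it holds.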



# ????

How do we measure the performance of a processor that uses caching?



#### Effect of algorithm on CPU time



#### Formulas from P&H 4.3

$$\text{CPU time} = (\text{CPU execution clock cycles} + \text{Memory-stall clock cycles}) \times \text{Clock cycle time}$$

$$\text{Memory-stall clock cycles} = \text{Read-stall cycles} + \text{Write-stall cycles}$$

$$\text{Read-stall cycles} = \frac{\text{Reads}}{\text{Program}} \times \text{Read miss rate} \times \text{Read miss penalty}$$

$$\text{Write-stall cycles} = \left(\frac{\text{Writes}}{\text{Program}} \times \text{Write miss rate} \times \text{Write miss penalty}\right) + \text{Write buffer stalls}$$

$$\text{Memory-stall clock cycles} = \frac{\text{Memory accesses}}{\text{Program}} \times \text{Miss rate} \times \text{Miss penalty}$$
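As a worked illustration of the combined formula and the CPU time formula, here is a short sketch. All numbers in the example (access count, miss rate, miss penalty, cycle counts, clock period) are made up for illustration, and the function names are hypothetical.

```python
import math


def memory_stall_cycles(memory_accesses, miss_rate, miss_penalty):
    """Memory-stall clock cycles = memory accesses x miss rate x miss penalty."""
    return memory_accesses * miss_rate * miss_penalty


def cpu_time(cpu_cycles, stall_cycles, clock_cycle_time):
    """CPU time = (CPU execution cycles + memory-stall cycles) x clock cycle time."""
    return (cpu_cycles + stall_cycles) * clock_cycle_time


# Illustrative numbers only: 1,000,000 accesses, 2% miss rate, 100-cycle penalty.
stalls = memory_stall_cycles(1_000_000, 0.02, 100)
# 3,000,000 execution cycles on an assumed 1 ns clock.
seconds = cpu_time(3_000_000, stalls, 1e-9)

assert math.isclose(stalls, 2_000_000)
assert math.isclose(seconds, 0.005)
```

In this made-up case the stalls add two million cycles to three million execution cycles, so 40% of the CPU time is spent waiting on memory.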



#### What causes cache misses? 3 Cs

**Compulsory** – the first access to a block that has never been in the cache ("warming up" the cache)

**Capacity** – the cache is not big enough to hold all of the blocks the program needs

**Conflict** – blocks that map to the same cache location repeatedly evict one another (cache collisions)
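The three categories can be told apart in a small simulation. The sketch below uses one common classification convention, stated here as an assumption: a miss is compulsory if the block has never been referenced before, capacity if a fully associative LRU cache of the same size would also miss, and conflict otherwise. The cache being classified is assumed to be direct-mapped, and `classify_misses` is a hypothetical helper.

```python
from collections import OrderedDict


def classify_misses(block_addrs, num_blocks):
    """Classify each miss of a direct-mapped cache as compulsory, capacity, or conflict."""
    direct = {}            # index -> block address currently stored (direct-mapped)
    full = OrderedDict()   # fully associative LRU reference cache of the same size
    seen = set()           # every block referenced so far
    counts = {"compulsory": 0, "capacity": 0, "conflict": 0}
    for b in block_addrs:
        # Update the fully associative reference cache.
        full_hit = b in full
        if full_hit:
            full.move_to_end(b)            # mark as most recently used
        else:
            if len(full) == num_blocks:
                full.popitem(last=False)   # evict the least recently used block
            full[b] = True
        # Update the direct-mapped cache being classified.
        index = b % num_blocks
        dm_hit = direct.get(index) == b
        direct[index] = b
        if not dm_hit:
            if b not in seen:
                counts["compulsory"] += 1
            elif not full_hit:
                counts["capacity"] += 1
            else:
                counts["conflict"] += 1
        seen.add(b)
    return counts


ping_pong = classify_misses([0, 4, 0, 4], num_blocks=4)
too_big = classify_misses([0, 1, 2, 3, 4, 0], num_blocks=4)
```

In the first trace, blocks 0 and 4 share an index and keep evicting each other even though the cache is mostly empty, which gives conflict misses. In the second trace, five distinct blocks do not fit in four entries, so the final access to block 0 is a capacity miss.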

Can we decrease compulsory misses?

#### Increasing block size has limited effects





### Set-associative caches

