## The speculative algorithm



### The speculative CPU with ROB



#### Four stages of execution

Issue

If reservation station and ROB available, issue to both; update control entries Execute

If operands available, execute instr; otherwise wait (for stores: this stage only computes effective address)

#### Write

When result is available, send on CDB (update ROB, reservation stations) Commit

If normal commit or store: update register/memory and remove instr from ROB

If incorrectly predicted branch: flush ROB and reservation stations, fetch correct instr



### Example 4: speculative branch



J



# ???

### What other hazard can result from the store operation?

| Status                         | Wait until                                                                          | Action or bookkeeping                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|--------------------------------|-------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Issue<br>all<br>instructions   | Reservation<br>station (r)<br>and<br>pop (t)                                        | $ \begin{array}{ll} \text{if (RegisterStat[rs].Busy)/*in-flight instr. writes rs*/ \\ \{h \leftarrow \text{RegisterStat[rs].Reorder;} \\ \text{if (RoB[h].Ready)/* Instr completed already */ \\ \{RS[r].V_i \leftarrow \text{ROB[h].Value;}, RS[r].0_i \leftarrow 0_i\} \\ \text{else (RS[r].0_i \leftarrow h_i) /* wait for instruction */ \\ \} else (RS[r].V_i \leftarrow \text{RegS[rs]:} RS[r].0_i \leftarrow 0_i\}; \\ RS[r].Busy \leftarrow \text{yes;} RS[r].Dest \leftarrow b; \\ ROB[b].Instruction \leftarrow \text{opcode;} ROB[b].Dest \leftarrow rd;ROB[b].Ready \leftarrow no; \\ \end{array} $ |
| FP<br>operations<br>and stores | ROB (b)<br>both available                                                           | $ \begin{array}{ll} \mbox{if (RegisterStat[rt].Busy) /*in-flight instr writes rt*/ \\ \{h \leftarrow RegisterStat[rt].Reorder; \\ \mbox{if (RoB[h].Ready)/* Instr completed already */ \\ \{RS[r].Vk \leftarrow ROB[h].Value; RS[r].0k \leftarrow 0; \} \\ \mbox{else } \{RS[r].0k \leftarrow h; /* wait for instruction */ \\ \} else \{RS[r].Vk \leftarrow Regs[rt]; RS[rl.0k \leftarrow 0; \}; \\ \end{array} $                                                                                                                                                                                              |
| FP operation                   | S                                                                                   | RegisterStat[rd].Reorder $\leftarrow$ b; RegisterStat[rd].Busy $\leftarrow$ yes; ROB[b].Dest $\leftarrow$ rd;                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| Loads                          |                                                                                     | RS[r].A $\leftarrow$ imm; RegisterStat[rt].Reorder $\leftarrow$ b;<br>RegisterStat[rt].Busy $\leftarrow$ yes; ROB[b].Dest $\leftarrow$ rt;                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| Stores                         |                                                                                     | $RS[r].A \leftarrow imm;$                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| Execute<br>FP op               | (RS[r].Qj == 0) and<br>(RS[r].Qk == 0)                                              | Compute results-operands are in Vj and Vk                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| Load step 1                    | (RS[r].Qj == 0) and<br>there are no stores<br>earlier in the queue                  | $RS[r].A \leftarrow RS[r].Vj + RS[r].A;$                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| Load step 2                    | Load step 1 done and<br>all stores earlier in<br>ROB have different<br>address      | Read from Mem[RS[r].A]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| Store                          | (RS[r].Qj == 0) and<br>store at queue head                                          | ROB[h].Address ← RS[r].Vj + RS[r].A;                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                                | Execution done at r and CDB available                                               | $\begin{array}{l} b \leftarrow RS[r].Dest; RS[r].Busy \leftarrow no; \\ \forall x(if (RS[x].0j==b) (RS[x].vj \leftarrow result; RS[x].0j \leftarrow 0\}); \\ \forall x(if (RS[x].0k==b) (RS[x].Vk \leftarrow result; RS[x].0k \leftarrow 0\}); \\ RD(B).Value \leftarrow result; ROB[b].Ready \leftarrow yes; \end{array}$                                                                                                                                                                                                                                                                                      |
| Store                          | Execution done at r<br>and (RS[r].Qk == 0)                                          | $ROB[h].Value \leftarrow RS[r].Vk;$                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| Commit                         | Instruction is at the<br>head of the ROB<br>(entry h) and<br>ROB[h].ready ==<br>yes | <pre>d ← ROB[h].Dest; /* register dest, if exists */ if (ROB(h].Instruction==Branch) {if (branch is mispredicted) {clear ROB[h].RegisterStat; fetch branch dest;}; else if (ROB[h].Instruction==Store) {Mem[ROB[h].Destination] ← ROB[h].Value;} else /* put the result in the register destination */ {Regs[d] ← ROB[h].Value;}; ROB[h].Busy ← no; /* free up ROB entry */ /* free up dest register if no one else writing it */ if (RegisterStat[d].Reorder==h) (RegisterStat[d].Busy ← no;};</pre>                                                                                                           |

Figure 3.14 Steps in the algorithm and what is required for each step. For the issuing instruction, rd is the destination, rs and rt are the sources, r is the reservation station allocated, b is the assigned ROB entry, and h is the head entry of the ROB. RS is the reservation station data structure. The value returned by a reservation station is called the result. RegisterStat is the register data structure, Regs represents the actual registers, and ROB is the reorder buffer data structure.



•



# ???

### Can we do better than "predict branch not taken?"

#### **Backwards and forwards branches**

while(guard) {

· · · {

Backwards branches are very frequently taken (~90% by some counts) Forwards branches are a coin flip without any other information loop: bne s0 s1 end

- • •
- j loop

end: ...

j start

. . .

loop: ...

start: beq s0 s1 loop

