CS 3843 Computer Organization Notes on Chapter 4, Section 4.3

Today's News: April 14

Don't forget the course evaluations

Section 4.3: Sequential Y86 Implementation

Section 4.3.1: Organizing Processing Steps into Stages

Recall the Y86 instruction encoding, available here.

Fetch
- Read the instruction from memory using the address in the PC.
- The first half of the first byte is the icode or instruction code and it determines which type of instruction it is.
- The second half of the first byte is the ifun which in some cases specifies the subtype.
- The icode determines whether there is a register specifier byte, in which case rA and rB are set.
- The icode also determines whether a 4-byte valC will be fetched.
- Finally, valP is set to the old PC plus the length of the instruction in bytes.
Decode
- If necessary read values from the register file and set valA and valB.
- The registers are specified by rA and rB except for push and pop which use %esp in place of rB.
Execute
- What it does depends on the icode.
- Some instructions feed values into the ALU to obtain a valE and possibly set the condition codes.
  e.g. OP1, rmmovl, mrmovl
- Some instructions (e.g. jXX) check the condition codes to decide whether the PC will be set to valP or to a jump target.
Memory
- May read from or write to memory
Write back
- May write up to two values to the register file
  pop will update both the stack pointer and the register popped into.
PC update
- PC is set to valP
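
The Fetch step above can be sketched in Python. This is only an illustration, not code from the course: the function name fetch and its parameters are made up here, and the two flags that say whether a register byte and a valC are present are passed in by the caller rather than derived from icode.

```python
def fetch(mem, pc, need_regids, need_valc):
    """Split the instruction bytes starting at mem[pc] into Y86 fields.

    mem is a bytes object standing in for instruction memory (an assumption
    of this sketch). need_regids and need_valc are supplied by the caller.
    """
    byte0 = mem[pc]
    icode = byte0 >> 4          # first (high-order) half of the first byte
    ifun = byte0 & 0xF          # second (low-order) half of the first byte
    pos = pc + 1

    rA = rB = None
    if need_regids:             # register specifier byte: rA then rB
        rA = mem[pos] >> 4
        rB = mem[pos] & 0xF
        pos += 1

    valC = None
    if need_valc:               # 4-byte constant, stored little-endian
        valC = int.from_bytes(mem[pos:pos + 4], "little")
        pos += 4

    valP = pos                  # old PC plus the length of the instruction
    return icode, ifun, rA, rB, valC, valP
```

For example, the six bytes 40 42 45 23 01 00 (an rmmovl with rA = 4 and rB = 2) give icode 4, ifun 0, valC = 0x12345 and valP = PC + 6.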

Note: You will need a password to get to the figures on this web page.

Look at Figure 4.22 and Figure 4.23

Today's News: April 16

New course added to fall schedule:
CS 4413 Web Technologies

We assume that everything in Figure 4.23 is combinational logic except for:

The PC
The Register File
The CC register
The data memory

All of the above are controlled by a single clock and store values only when that clock goes high.
Each time the clock goes high one instruction is executed.

Basic idea:

When the clock goes high, a new value is stored in the PC.
After some propagation delay, the outputs of the instruction memory are available.
These go through additional logic to provide the inputs to the ALU and the read inputs of the register file.
After an additional delay, the read inputs to the Data Memory are correct.
We assume that reading from the data memory is combinational, so after an additional propagation delay, the outputs of the data memory are valid.
At this point the write-value and write-address inputs to the register file and the data memory are valid.
On the next rising edge of the clock, values are stored in the PC, and if necessary in the Register File, Data Memory, and condition code register.

The clock speed is determined by the longest data path through the combinational logic for the most complicated instruction.

This is the basic idea behind RISC design.
In a single-cycle implementation, even the simplest instruction runs only as fast as the most complicated instruction, so keeping every instruction simple keeps the clock fast.
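
The clock-period argument can be illustrated with a small calculation. Everything numeric below is invented for illustration: the delays are hypothetical, and treating each instruction's path as a simple sum of per-stage delays is a simplifying assumption, not a model of the real Y86 hardware.

```python
# Hypothetical combinational delays in picoseconds (made-up numbers).
STAGE_DELAY_PS = {"fetch": 200, "decode": 100, "execute": 150, "memory": 250}

# Which blocks each instruction's signals pass through (simplified assumption).
PATHS = {
    "OPl": ["fetch", "decode", "execute"],
    "jXX": ["fetch", "execute"],
    "mrmovl": ["fetch", "decode", "execute", "memory"],
}


def path_delay_ps(instr):
    """Total delay along the path used by one instruction."""
    return sum(STAGE_DELAY_PS[stage] for stage in PATHS[instr])


def clock_period_ps():
    """The clock must be slow enough for the slowest instruction."""
    return max(path_delay_ps(instr) for instr in PATHS)
```

With these made-up numbers, jXX needs only 350 ps of logic but still takes a full 700 ps cycle, because mrmovl sets the clock period.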

An example: Tracing OP1 rA, rB

Fetch
Look at Figure 4.27
6 bytes are read from the instruction memory (even though only 2 will be used for this instruction).
The first byte is split into the iCode and iFun values.
iCode is 6 in this case and iFun is one of 0, 1, 2, or 3.
The second byte of the instruction is divided into rA and rB.
Note that for all instructions, rA and rB are always in the second byte.
iCode is used to generate the control lines: Need ValC and Need regids.
These in turn generate the number of bytes needed for this instruction: 1, 2, 5, or 6.
This gets fed into the incrementer (which is really an adder, which adds 1, 2, 5, or 6 to the PC).
Decode
Look at Figure 4.28
valA and valB are available from the register file.
Execute
Look at Figure 4.29
iFun determines which ALU function is used.
In this case iCode selects valA and valB as the ALU inputs.
Condition codes are also set if necessary.
Memory: nothing for this instruction
Write Back
The result from the ALU is stored in rB.
Look at Figure 4.28.
rB is fed into dstE and the output of the ALU is valE.
PC update
See Figure 4.23.
Because this is an OP1 instruction, iCode selects the valP input to become newPC.

Note: Each instruction is executed in a single cycle.
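
The OP1 trace above can be written out as a short Python sketch. The function name step_opl and the dictionary-based register file are assumptions made for illustration. Only the ZF and SF condition codes are modeled, and OF is left out to keep the sketch short.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

# iFun selects the ALU function; the ALU computes "valB OP valA".
ALU_OPS = {
    0: lambda a, b: b + a,   # add
    1: lambda a, b: b - a,   # subtract
    2: lambda a, b: b & a,   # and
    3: lambda a, b: b ^ a,   # xor
}


def step_opl(pc, regs, ifun, rA, rB):
    """Run one OP1 rA, rB instruction; return (new PC, new registers, CC)."""
    # Fetch: the instruction is 2 bytes long.
    valP = pc + 2
    # Decode: read both operands from the register file.
    valA, valB = regs[rA], regs[rB]
    # Execute: the ALU produces valE and the condition codes are set.
    valE = ALU_OPS[ifun](valA, valB) & MASK
    cc = {"ZF": valE == 0, "SF": bool(valE >> 31)}
    # Memory: nothing for this instruction.
    # Write back: valE goes to rB (dstE = rB).
    new_regs = dict(regs)
    new_regs[rB] = valE
    # PC update: newPC is valP.
    return valP, new_regs, cc
```

For instance, a subtract with rA holding 9 and rB holding 21 at PC 0x00c leaves 12 in rB, clears ZF and SF, and sets the PC to 0x00e.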

Today's News: April 18

Question:

Design the circuit that calculates the value to be added to the PC based on the values of Need ValC and Need regids

Answer:
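One possible answer, offered as a sketch rather than the original solution: the increment is 1 + (1 if Need regids) + (4 if Need ValC), so the only values are 1, 2, 5, and 6. Written in binary (001, 010, 101, 110), the three output bits need just two wires and one inverter. The Python function names below are made up for illustration.

```python
def pc_increment_bits(need_valc, need_regids):
    """Return (bit2, bit1, bit0) of the value added to the PC."""
    bit2 = need_valc           # plain wire from Need ValC
    bit1 = need_regids         # plain wire from Need regids
    bit0 = not need_regids     # one inverter on Need regids
    return int(bit2), int(bit1), int(bit0)


def pc_increment(need_valc, need_regids):
    """Convert the three output bits back to a number: 1, 2, 5, or 6."""
    bit2, bit1, bit0 = pc_increment_bits(need_valc, need_regids)
    return 4 * bit2 + 2 * bit1 + bit0
```

Checking all four input combinations against 1 + Need regids + 4 * Need ValC confirms that the gate-level version matches the arithmetic one.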

Question:

Design the circuit that calculates Need ValC and Need regids from the value of iCode.

Answer:
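One possible answer, again as a sketch and not the original solution: each signal is true exactly when iCode belongs to a fixed set of instruction codes, so in hardware each is an OR of 4-bit comparisons against those codes. The constant names below follow the usual I-prefixed convention but are assumptions of this sketch. The codes for halt and nop are left out because neither has a register byte or a valC.

```python
# Instruction codes for the 32-bit Y86 (first half of the first byte).
IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL = 0x2, 0x3, 0x4, 0x5
IOPL, IJXX, ICALL, IRET = 0x6, 0x7, 0x8, 0x9
IPUSHL, IPOPL = 0xA, 0xB


def need_regids(icode):
    """True when the instruction has a register specifier byte."""
    return icode in {IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL, IOPL, IPUSHL, IPOPL}


def need_valc(icode):
    """True when the instruction carries a 4-byte constant valC."""
    return icode in {IIRMOVL, IRMMOVL, IMRMOVL, IJXX, ICALL}
```

These sets reproduce the instruction lengths used elsewhere in these notes: OP1 is 2 bytes, jXX is 5 bytes, and rmmovl is 6 bytes.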

Interrupts
This material is not in the book.
External hardware devices need a way to interrupt the normal execution of the instruction cycles.
Some examples of devices that need to do this:

hardware timers: used for keeping track of the time
keyboard: notify the CPU that a key has been pushed
mouse
disk drive: requested data is available
MMU: invalid memory reference or page fault

How this works:

The CPU chip has one or more pins for interrupts.
At the end of each instruction cycle, the pins are checked.
If interrupts are enabled and one of these pins is active an interrupt occurs.
The registers are saved and an interrupt handler is called.
After the handler completes, the registers are restored and the program resumes from where it left off.

The Instruction Cycle

The traditional instruction cycle:

Fetch instruction
Increment PC
Decode instruction
Execute instruction
Handle Interrupts

Note that the fetch, increment PC, and decode may be implemented in several steps:

Fetch the opcode
Increment the PC
Decode the opcode and determine the number of arguments
Fetch the arguments, incrementing the PC as we go
Decode the arguments
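
The traditional cycle, including the multi-step fetch and the interrupt check at the end of each instruction, can be sketched as a loop. The machine below is entirely hypothetical: the three opcodes, the single accumulator register, and the interrupt_at parameter (the set of cycles on which an interrupt pin is active) are invented for illustration, and the "handler" only records that it ran.

```python
# Hypothetical opcodes: 0 = halt, 1 = inc, 2 = addi n (one argument).
ARG_COUNT = {0: 0, 1: 0, 2: 1}


def run(program, interrupt_at=()):
    """Run a toy program; return (accumulator, list of interrupted cycles)."""
    pc, acc = 0, 0
    handled = []
    cycle = 0
    while True:
        opcode = program[pc]            # fetch the opcode
        pc += 1                         # increment the PC
        nargs = ARG_COUNT[opcode]       # decode: how many arguments?
        args = []
        for _ in range(nargs):          # fetch arguments, incrementing the PC
            args.append(program[pc])
            pc += 1
        if opcode == 0:                 # execute
            return acc, handled
        elif opcode == 1:
            acc += 1
        elif opcode == 2:
            acc += args[0]
        if cycle in interrupt_at:       # handle interrupts
            saved = (pc, acc)           # save the registers
            handled.append(cycle)       # stand-in for the handler body
            pc, acc = saved             # restore and resume
        cycle += 1
```

Running the program [1, 2, 5, 1, 0] gives the same accumulator value whether or not an interrupt arrives, which is the point of saving and restoring the registers.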

Today's News: April 21

Today is the last day for the course evaluations

How the traditional instruction cycle relates to the Y86 instruction cycle:

Fetch: We always fetch 6 bytes, even if the instruction is 1, 2, or 5 bytes
Instead of incrementing the PC, we set valP to the new value of the PC based on the number of bytes of the instruction.
Decode: traditional
Execute: The traditional execute step becomes 3 stages in the Y86 (Execute, Memory, and Write back). In Execute, new values are calculated.
Memory: Values are read from or written to memory
Write back: The register file is updated
PC update: The PC is updated using valP

Why is it done this way in the Y86?

The Y86 executes one instruction per cycle.
Each cycle can have only one change to sequential logic, since all use a positive edge trigger.
Instruction memory is combinational: a change to the PC changes the instruction coming out,
so we must fetch all of the bytes the instruction might use at once, before we know how many bytes it is.
Everything is combinational logic except for:
- Condition codes
- Data Memory
- Register File
- PC register
Fetch, Decode, Execute, and Memory read are all done together with combinational logic
Memory write, Write Back, and PC update are all done at the same time at the beginning of the next clock cycle.

Example: Tracing jxx Dest

Fetch
Look at Figure 4.27
6 bytes are read from the instruction memory (even though only 5 will be used for this instruction).
The first byte is split into the iCode and iFun values.
iCode is 7 in this case and iFun determines the type of jump.
The next 4 bytes are the value of valC which is the destination address if the jump is taken.
In this case Need ValC is true and Need regids is false, which will feed 5 into the PC incrementer.
Decode
Nothing needs to be done here.
Execute
Look at Figure 4.29
Cnd is generated from the values of CC and iFun and determines whether the jump will take place.
Memory: nothing for this instruction
Write Back: nothing to do for this instruction
PC update
If Cnd is true, PC is set to valC, otherwise to valP.
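
The jxx trace can be sketched in Python as follows. The function names cond and step_jxx are made up for illustration, and the three condition codes are passed as separate booleans. The mapping from iFun to jump type follows the standard Y86 encoding (0 = jmp through 6 = jg).

```python
def cond(ifun, zf, sf, of):
    """Compute Cnd from the condition codes and iFun."""
    less = sf != of                     # signed "less than" is SF xor OF
    return {
        0: True,                        # jmp: always taken
        1: less or zf,                  # jle
        2: less,                        # jl
        3: zf,                          # je
        4: not zf,                      # jne
        5: not less,                    # jge
        6: (not less) and (not zf),     # jg
    }[ifun]


def step_jxx(pc, ifun, valC, zf, sf, of):
    """Run one jxx Dest instruction and return the new PC."""
    # Fetch: 5-byte instruction, so valP = PC + 5; valC is the destination.
    valP = pc + 5
    # Decode, Memory, Write back: nothing to do.
    # Execute: decide whether the jump is taken.
    cnd = cond(ifun, zf, sf, of)
    # PC update: valC if the jump is taken, otherwise valP.
    return valC if cnd else valP
```

For example, a je to 0x018 located at 0x00e falls through to 0x013 when ZF is clear and goes to 0x018 when ZF is set.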

Today's News: April 23

Thank you for filling out the evaluation forms

Example: Tracing rmmovl rA, D(rB)
A more detailed version of this trace can be found here.

Fetch
Look at Figure 4.27
6 bytes are read from the instruction memory (in this case all are used).
iCode is 4 and iFun is not used.
valC is set as well as valP = PC+6.
Decode is like OP1:
Look at Figure 4.28
valA and valB are set.
Execute
Look at Figure 4.29
iCode determines which ALU function is used (add).
valE = valB + valC
Memory: Look at Figure 4.30
valE is the memory address and valA is the data.
Write Back: nothing to do for this instruction
PC update
PC is set to valP
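
The rmmovl trace can be sketched the same way. The function name step_rmmovl is hypothetical, and the data memory is modeled as a dictionary from byte address to byte value, which is an assumption of this sketch. The 4-byte value is stored little-endian.

```python
def step_rmmovl(pc, regs, mem, rA, rB, valC):
    """Run one rmmovl rA, D(rB) instruction; return (new PC, new memory)."""
    # Fetch: 6-byte instruction, so valP = PC + 6; valC is the displacement D.
    valP = pc + 6
    # Decode: read both registers.
    valA, valB = regs[rA], regs[rB]
    # Execute: the ALU adds to form the effective address.
    valE = (valB + valC) & 0xFFFFFFFF
    # Memory: write the 4 bytes of valA starting at address valE.
    new_mem = dict(mem)
    for i, byte in enumerate(valA.to_bytes(4, "little")):
        new_mem[valE + i] = byte
    # Write back: nothing to do.
    # PC update: PC is set to valP.
    return valP, new_mem
```

With rA holding 128, rB holding 12, and D = 100 at PC 0x014, the value 128 is written at address 112 and the PC becomes 0x01a.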

Figure 4.28

Figure 4.29

Figure 4.30

Today's News: April 25

Figure 4.18 OP1, rrmovl, irmovl
Figure 4.19 rmmovl, mrmovl
Figure 4.20 pushl, popl
Figure 4.21 jXX, call, ret