CS 3843 Computer Organization
Notes on Chapter 4: Section 4.2


Today's News: March 28
Second exam on Wednesday of next week.
Today is the last day to turn in Assignment 3.


This section deals with the design of hardware that makes up the basic building blocks of a computer.

Section 4.2.1: Logic Gates

The simplest building block we will discuss in this course is the gate.
A gate has a small number of inputs (usually 1 or 2) and one output.
Logically, gates perform a Boolean function such as AND or OR.
Basic gate symbols:
basic gates

In the old days, logic designers would draw diagrams based on these and similar gates to design a circuit.

Today, circuits are designed using a Hardware Description Language (HDL).

We will use a simple form of this called HCL (Hardware Control Language).
This looks very much like C code.
For example, an AND gate is represented by a && b.


Sections 4.2.2 - 4.2.4: Combinational Circuits

Example: bit equal
bool eq = (a && b) || (!a && !b);
This is equivalent to the following circuit:
bitequal

Here is an alternate way of showing the same circuit:
bitequal1

Here is a block diagram:
biteq

We can make a multi-bit equal out of 1-bit equals:
workeq
And here is a block diagram:
wordeqblock
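
The bit-equal and word-equal circuits can be sketched in Python to check the logic. This is only a software model, not HCL; the helper to_bits and the little-endian bit order are assumptions made for the example.

```python
def bit_eq(a: bool, b: bool) -> bool:
    # Same gate structure as the HCL expression: (a && b) || (!a && !b)
    return (a and b) or ((not a) and (not b))


def word_eq(a_bits, b_bits) -> bool:
    # One bit-equal circuit per bit position, with all outputs ANDed together.
    return all(bit_eq(x, y) for x, y in zip(a_bits, b_bits))


def to_bits(value: int, width: int):
    # Hypothetical helper: list of booleans, low-order bit first.
    return [bool((value >> i) & 1) for i in range(width)]


if __name__ == "__main__":
    print(word_eq(to_bits(5, 4), to_bits(5, 4)))  # True
    print(word_eq(to_bits(5, 4), to_bits(4, 4)))  # False
```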

Equals Circuit


Example: 1-bit multiplexer
It allows you to select one of two one-bit inputs and is described by
bool out = (s && a) || (!s && b);
muxonebit
Here is a block diagram:
muxbit

We can make a multi-bit (word level) mux out of 1-bit muxes:
muxword
Here is a block diagram for this:
muxwordblock


Here is the HCL description of the mux:
int Out = [
   s: A;
   1: B;
];

The [...] is like a select: the cases are checked in order, and if s is true, the result is A.
Otherwise, we check the next case. 1 is always true, so we select B.
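
One way to picture the [...] expression is as a list of (condition, value) pairs scanned from top to bottom. The Python sketch below models that reading; the function names hcl_case and mux2 are made up for this example and are not part of HCL.

```python
def hcl_case(cases):
    # cases: list of (condition, value) pairs, checked in order like HCL's [ ... ].
    for condition, value in cases:
        if condition:
            return value
    raise ValueError("no case matched")


def mux2(s, A, B):
    # Models: int Out = [ s: A; 1: B; ];
    return hcl_case([(s, A), (True, B)])
```

Because the last condition is always true, hcl_case never reaches the error for mux2.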

Today's News: March 31
Second exam on Wednesday of this week.


Question:
In the 2-input mux described above, which input is selected when s=0?
Answer:


You can also make a mux that selects from more than 2 words.
Here is a 4 word mux, called a 4-way mux:
muxwordblock4
Here is the HCL description of the 4 word mux:
int Out4 = [
   !s1 && !s0: A;
   !s1       : B;
   !s0       : C;
   1         : D;
];
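
The 4-way mux relies on the cases being tested in order, which is why the later conditions can be shorter than the first. A minimal Python sketch of the same first-match rule (mux4 is a name chosen for this example):

```python
def mux4(s1, s0, A, B, C, D):
    # Cases are tested top to bottom; the first true condition wins.
    cases = [
        (not s1 and not s0, A),
        (not s1, B),  # reached only if the first case failed
        (not s0, C),  # reached only if the first two cases failed
        (True, D),    # default case
    ]
    for condition, value in cases:
        if condition:
            return value
```

Calling mux4 with each of the four (s1, s0) combinations is one way to check an answer to the question that follows.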


Question:
In the 4-input mux described above, if s1s0 is considered a 2-bit binary number with s0 as the low order bit, which input is selected by each of the s-values: 0, 1, 2, and 3?
Answer:


The diagrams and HCL for the 2-input and 4-input muxes were taken from the textbook.
Note the inconsistency as to the order of the inputs.
In one of the diagrams, the smallest value of s selects the top input and in the other the smallest value of s selects the bottom input.

Multiplexers 1


Other Gates and Basic Building Blocks

This is not in the book.

Another common gate is the XOR:
xor
which is the same as
xorcircuit

Any circuit that can be made from AND, OR, and NOT gates can also be made using just AND and NOT because:
a || b = !(!a && !b);
Similarly, you can use just OR and NOT:
a && b = !(!a || !b);
These pairs (AND, NOT) and (OR, NOT) are called functionally complete sets.
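
Both identities can be checked by trying all four input combinations. A short Python sketch (the function names are chosen for this example):

```python
from itertools import product


def or_from_and_not(a: bool, b: bool) -> bool:
    # a || b built only from AND and NOT
    return not ((not a) and (not b))


def and_from_or_not(a: bool, b: bool) -> bool:
    # a && b built only from OR and NOT
    return not ((not a) or (not b))


def identities_hold() -> bool:
    # Exhaustive check over the truth table of two inputs.
    return all(
        or_from_and_not(a, b) == (a or b) and and_from_or_not(a, b) == (a and b)
        for a, b in product([False, True], repeat=2)
    )
```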

Problem:
Show that AND alone is not a functionally complete set.

Are there any single gates that can be used as a functionally complete set? Yes!

The NAND Gate:
nand

The NOR Gate:
nor
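
To see why NAND by itself is enough, here is a sketch that builds NOT, AND, and OR from NAND only and checks them against Python's own operators. The trailing underscores in the names just avoid Python keywords; the same construction works for NOR with the roles of AND and OR swapped.

```python
from itertools import product


def nand(a: bool, b: bool) -> bool:
    return not (a and b)


def not_(a: bool) -> bool:
    # Tie both NAND inputs together.
    return nand(a, a)


def and_(a: bool, b: bool) -> bool:
    # NAND followed by a NAND-based inverter.
    return not_(nand(a, b))


def or_(a: bool, b: bool) -> bool:
    # De Morgan: a || b = !(!a && !b)
    return nand(not_(a), not_(b))


def nand_is_complete() -> bool:
    return all(
        not_(a) == (not a) and and_(a, b) == (a and b) and or_(a, b) == (a or b)
        for a, b in product([False, True], repeat=2)
    )
```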
Two input gates

Adders

This is not in the book. Some diagrams are from Wikipedia.

It is easy to design a 1 bit adder that has 2 inputs and 2 outputs:
180px-Half_adder
This is called a half adder.

Today's News: April 4
Exam 2 returned today.


We could make a 2-bit adder from this, but it would be more convenient to make it from full adders, which add 3 bits.
400px-Full_Adder
Here is a block diagram of a 1-bit full adder:
215px-1-bit_full-adder


You can use this to make a 4-bit full adder which takes as input two 4-bit numbers and a carry coming in and produces 5 bits of output.
500px-4-bit_ripple_carry_adder
This is called a ripple carry adder. Each successive adder cannot produce the correct output until the previous carry is produced.
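
The half adder, full adder, and ripple carry adder can be modeled in a few lines of Python. This sketch assumes the full adder is built from two half adders and an OR gate, and it stores numbers as lists of 0/1 with the low-order bit first; to_bits and from_bits are helpers invented for the example.

```python
def half_adder(a, b):
    # Returns (sum, carry) for two 1-bit inputs.
    return a ^ b, a & b


def full_adder(a, b, cin):
    # Two half adders plus an OR gate for the carry out.
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2


def ripple_carry_add(a_bits, b_bits, cin=0):
    # Each stage has to wait for the carry from the stage before it.
    sums = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    sums, cout = ripple_carry_add(to_bits(9, 4), to_bits(8, 4))
    print(from_bits(sums), cout)  # 1 1, that is 9 + 8 = 17 = 16 + 1
```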

How could you make a 16-bit adder from four 4-bit adders?

Problem: Design a word level subtracter.

Adder Gate Count


Today's News: April 7

Propagation Delay

This is not in the book - diagrams are from Wikipedia.

Here is a full adder with the propagation delays shown:
400px-Full-Adder_Propagation_Delay


There are three gate delays to produce the carry, and only 2 to produce the sum.
However, there are only 2 gate delays from the carry in to the carry out.
If we have a 4-bit ripple carry adder, the first carry requires 3 gate delays and each additional level is 2 gate delays, for a total of 9 gate delays.
Note: We are assuming (incorrectly) that the propagation delay is the same for all gates and we are ignoring the propagation delay of the wires.
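
The count above follows the pattern 3 + 2(n - 1) for an n-bit ripple carry adder. A small sketch of that arithmetic, under the same simplifying assumptions (equal delay for every gate, no wire delay); the function name is made up for this example:

```python
def ripple_carry_delay(n_bits, first_carry=3, carry_to_carry=2):
    # First stage: 3 gate delays to its carry out.
    # Every later stage: 2 more gate delays from carry in to carry out.
    return first_carry + carry_to_carry * (n_bits - 1)


if __name__ == "__main__":
    print(ripple_carry_delay(4))   # 9
    print(ripple_carry_delay(64))  # 129
```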

Question:
A processor runs at 3 GHz and requires that all propagation delays are less than 1/3 of a nanosecond.
Estimate the length of a wire that would produce a 1/3 nanosecond delay.
Answer:


Important fact: in theory, any Boolean function can be created using only 2 gate delays (not counting inverters).
This can be done with a 2-level AND-OR network or a 2-level OR-AND network.

Explanation 1:
The rules of Boolean algebra allow you to write any Boolean function of 2 variables, a and b in the form:
    f(a,b) = c0a'b' + c1a'b + c2ab' + c3ab
Where the c's are either 0 or 1 and a' = !a, and b' = !b, and we use + for OR and multiplication for AND.

Similarly, any Boolean function of 3 variables can be written as a combination of a'b'c', a'b'c, ab'c', etc.

Example:
s = f(a,b,c) = (a ^ b) ^ c
Using X ^ Y = X'Y + XY', we get
s = f(a,b,c) = (a'b + ab')'c + (a'b + ab')c'
             = ((a'b)'(ab')')c + a'bc' + ab'c'
             = (a + b')(a' + b)c + a'bc' + ab'c'
             = (aa' + ab + b'a' + b'b)c + a'bc' + ab'c'
             = abc + b'a'c + a'bc' + ab'c'


This is just the calculation of the sum for the 1-bit full adder.
Note that all it says is that the sum is true if the number of true inputs is 1 or 3.
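
The result of the derivation can be checked by brute force over the 8 input combinations. This Python sketch compares the XOR form, the two-level form, and the "1 or 3 true inputs" description; the function names are chosen for the example.

```python
from itertools import product


def sum_xor(a, b, c):
    return (a ^ b) ^ c


def sum_two_level(a, b, c):
    # abc + a'b'c + a'bc' + ab'c', using 1 - x for the complement of a bit x
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (a & b & c) | (na & nb & c) | (na & b & nc) | (a & nb & nc)


def forms_agree():
    return all(
        sum_xor(a, b, c) == sum_two_level(a, b, c) == int((a + b + c) in (1, 3))
        for a, b, c in product((0, 1), repeat=3)
    )
```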

Explanation 2:
Any Boolean function can be described by a truth table. Let us look at the truth table for the calculation of the sum of a one-bit full adder:
a   b   cin s
0   0   0   0
0   0   1   1
0   1   0   1
0   1   1   0
1   0   0   1
1   0   1   0
1   1   0   0
1   1   1   1
Take all of the lines in which the output is 1, and form a Boolean expression:
   s = a'b'c + a'bc' + ab'c' + abc
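
The same recipe works for any truth table: collect one AND term for each row whose output is 1, then OR the terms together. A sketch in Python (sum_of_products is a name invented for this example, and c stands for cin):

```python
from itertools import product


def sum_of_products(func, names):
    # Build one product term per truth-table row where func returns 1.
    terms = []
    for values in product((0, 1), repeat=len(names)):
        if func(*values):
            # A variable appears plain if it is 1 in this row, primed if it is 0.
            terms.append("".join(n if v else n + "'" for n, v in zip(names, values)))
    return " + ".join(terms)


if __name__ == "__main__":
    # Prints: a'b'c + a'bc' + ab'c' + abc
    print(sum_of_products(lambda a, b, c: a ^ b ^ c, ["a", "b", "c"]))
```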


Here is the corresponding 2-level AND-OR network:
addersum


We can do something similar with the carry out of a 1-bit full adder, but it is simpler to get a formula by noting that there is a carry if at least two of the three inputs are 1:
   cout = ab + ac + bc
and the circuit looks like this:
addercarryout
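
A quick check that the two-level carry formula really means "at least two of the three inputs are 1". The sketch also compares it with the XOR-based carry (a AND b) OR ((a XOR b) AND c), which is assumed here to be the form used in the earlier full adder diagram.

```python
from itertools import product


def carry_two_level(a, b, c):
    # cout = ab + ac + bc
    return (a & b) | (a & c) | (b & c)


def carry_xor_based(a, b, c):
    # Assumed form of the carry in a full adder built from two half adders.
    return (a & b) | ((a ^ b) & c)


def carries_agree():
    return all(
        carry_two_level(a, b, c) == carry_xor_based(a, b, c) == int(a + b + c >= 2)
        for a, b, c in product((0, 1), repeat=3)
    )
```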

Two Level Adder


It is possible to produce similar formulas for the other 3 carries of a 4-bit full adder and put them into a circuit that will create all carries with 2 gate delays.
If all carries are produced with 2 gate delays, the sums are produced 2 gate delays later. Total gate delays for the 4-bit adder: 4 instead of 9. It is more than twice as fast!
For a 64-bit adder, a ripple carry adder would have 129 gate delays. If all carries are produced with 2 gate delays, the sums are produced 2 gate delays later. Total gate delays for a 64-bit adder: 4 instead of 129. It is more than 30 times faster.

In practice, this is not feasible for 32-bit or 64-bit adders.
There are limitations based on the number of inputs a gate can have and on how many inputs can be driven by a single output.
A fast adder might look like the diagram below for a 4-bit carry lookahead adder:
500px-4-bit_carry_lookahead_adder
The p and g outputs of the full adder are independent of the carry and are available after 2 gate delays.
All of the carries are generated 2 gate delays later.
The PG and PP outputs are available at about the same time and allow nesting of the logic so that C8, C12, C16 can be available about 2 gate delays later.
It allows for a design of an n-bit adder with a propagation delay proportional to log n instead of n.
Wikipedia has a good discussion of the details.
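
Here is a rough Python sketch of the lookahead idea for a single 4-bit block. It assumes the common definitions g = a AND b (generate) and p = a XOR b (propagate), which these notes do not spell out, and it computes each carry directly from the g and p values and the carry in, not from the previous carry. Bits are stored low-order first.

```python
def lookahead_add(a_bits, b_bits, c0=0):
    g = [a & b for a, b in zip(a_bits, b_bits)]  # stage generates a carry
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # stage propagates a carry
    carries = [c0]
    for i in range(len(g)):
        # c[i+1] = g[i] + p[i]g[i-1] + ... + p[i]...p[1]g[0] + p[i]...p[0]c0
        terms = [g[j] & all(p[j + 1 : i + 1]) for j in range(i + 1)]
        terms.append(c0 & all(p[: i + 1]))
        carries.append(int(any(terms)))
    # Each sum bit needs only its own p and the carry into that stage.
    sums = [p[i] ^ carries[i] for i in range(len(p))]
    return sums, carries[-1]


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))
```

Every carry is an OR of AND terms, so in hardware each one is a 2-level network once g and p are available. The number of terms grows with the bit position, which is the fan-in problem mentioned above.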

The ALU

An Arithmetic/Logic Unit is a circuit that can produce one of several arithmetic (add, subtract, etc.) or logical (and, or, etc.) functions.

The Y86 will need an ALU that can add, subtract, and, and xor.

Basically, this ALU would be designed like this:
aludesign


Here is a block diagram of this ALU:
alublock
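
A behavioral sketch of such an ALU in Python: compute all four results, then let a function code select one, as a mux would. The function codes 0 to 3, the 32-bit word size, and the operand order (B op A, so subtract gives B - A) are assumptions made for this example.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit word

# Assumed function codes for this sketch.
ALU_ADD, ALU_SUB, ALU_AND, ALU_XOR = 0, 1, 2, 3


def alu(fn, a, b):
    # All four functions are computed; fn picks one, like the select input of a mux.
    results = [b + a, b - a, b & a, b ^ a]
    return results[fn] & MASK  # keep only the low 32 bits


if __name__ == "__main__":
    print(alu(ALU_SUB, 3, 10))          # 7
    print(alu(ALU_ADD, 0xFFFFFFFF, 1))  # 0, the sum wraps around
```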

Today's News: April 9

Section 4.2.5: Memory and Clocking

Note: Most of this is not in the text.

So far we have talked about combinational circuits.
These must follow certain rules. One of them is that the network must be acyclic: no path through the gates may form a loop. This requirement ensures that any output is determined by the inputs of the network.
If we relax this constraint, we can get something like the diagram below which is called an SR latch.
It is a basic building block of a memory circuit.
The SR Latch
200px-RS_Flip-flop_(NOR)

Note that the output of a NOR gate is only 1 if both inputs are 0.
Another way of saying this is that if either input is 1, the output is 0.
Let us try to make a truth table for this network:
Three cases are easy:
Now the hard part: if R and S are both 0, there are 2 cases.
Both of these are possible.

How we use the SR Latch:
This can be used to store a single bit.

The SR Latch is an example of a sequential circuit.
Such a circuit can have outputs that depend on the past history of its inputs.
Sequential circuits are said to have state or memory.
Combinational circuits have outputs that depend only on the current values of the inputs.
Combinational circuits do not have any memory.
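
The SR latch can be simulated by evaluating the two NOR gates repeatedly until the outputs stop changing. This Python sketch assumes the usual wiring, with R feeding the gate that produces Q and S feeding the gate that produces the complementary output; the function names are chosen for the example.

```python
def nor(a, b):
    return int(not (a or b))


def sr_latch(s, r, q, q_bar):
    # q and q_bar are the outputs before the inputs are applied.
    # Re-evaluate the cross-coupled gates until they settle.
    for _ in range(4):
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


if __name__ == "__main__":
    state = sr_latch(1, 0, 0, 1)  # S=1, R=0
    print(state)                  # (1, 0)
    state = sr_latch(0, 0, *state)  # S=0, R=0: outputs keep their values
    print(state)                    # (1, 0)
```

With S = R = 0 the result depends on the starting values of q and q_bar, which is what it means for the circuit to have state.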

A more useful device is the D Latch.
It adds a few gates to the front end of the SR latch so it cannot be in the state with R and S both 1:
The D Latch
300px-D-type_Transparent_Latch_(NOR)
Note that R and S cannot both be true.
When E is true, S=D and R=D', so Q = D.
When E is false, R=S=0, and the SR latch stores its value: changes to D do not affect Q.
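
A sketch of the D latch in Python, built on a simulated NOR-based SR latch. It assumes the front-end gates compute S = D AND E and R = (NOT D) AND E, which matches the behavior described above; the class and function names are made up for this example.

```python
def nor(a, b):
    return int(not (a or b))


def sr_latch(s, r, q, q_bar):
    # Re-evaluate the cross-coupled NOR gates until the outputs settle.
    for _ in range(4):
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


class DLatch:
    def __init__(self):
        self.q, self.q_bar = 0, 1  # arbitrary starting state

    def update(self, d, e):
        s = d & e        # S and R can never both be 1
        r = (1 - d) & e
        self.q, self.q_bar = sr_latch(s, r, self.q, self.q_bar)
        return self.q
```

For example, update(1, 1) returns 1, and a following update(0, 0) still returns 1 because E is false.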

Question:
Why are the inputs called D and E?
Answer:


Here is the block diagram for the D Latch:
100px-Transparent_Latch_Symbol

D Latch


Notice the pattern we are using to do our hardware design:

Today's News: April 11


The D latch is a device with an enable input.
The idea of an enable input is that the data in the latch can only change when the enable input is true.
What we really want is a clocked circuit.
This stores its input only when the clock input changes from low to high (or vice versa).
It can be made by putting two latches together.

One way to do this is to allow the first latch to change only when the clock is high, and the second to change only when it is low.
Here is a diagram of a D flip-flop that is clocked when the clock input goes from high to low (negative edge triggered).
The D Flip Flop
220px-Negative-edge_triggered_master_slave_D_flip_flop
When the clock input becomes high, the first latch (the master) stores the value of D.
The second latch cannot change because its E input is low.
When the clock becomes low, the slave latch stores the value from the master.
As long as the clock stays low, the master cannot change, so the slave also cannot change (even though it is enabled).

By changing the position of the inverter, we can make the outputs change when the clock becomes high.
Here is a block diagram for the positive edge triggered D flip flop.
100px-D_Type_Flip-flop
When the clock input becomes high (positive edge triggered), the value of D is stored in the flip flop.
This is the building block for a multi-bit register.
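
The master/slave arrangement can be sketched with two behavioral D latches. This models the negative edge triggered version described above; the class names are invented for the example, and each latch is modeled simply as "copy D while E is true".

```python
class DLatch:
    def __init__(self):
        self.q = 0

    def update(self, d, e):
        if e:  # transparent only while E is true
            self.q = d
        return self.q


class NegEdgeDFlipFlop:
    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d, clk):
        m = self.master.update(d, clk)        # master follows D while the clock is high
        return self.slave.update(m, 1 - clk)  # slave follows the master while the clock is low


if __name__ == "__main__":
    ff = NegEdgeDFlipFlop()
    print(ff.update(1, 1))  # 0: clock high, output has not changed yet
    print(ff.update(1, 0))  # 1: falling edge, the stored 1 appears
    print(ff.update(0, 0))  # 1: D changes while the clock is low, no effect
```

Swapping which latch sees the inverted clock gives the positive edge triggered version.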

D Flip Flop


The Register File

Putting several multi-bit registers together with a mux, we can make a register file.
Here is the block diagram of a register file:

registerfile
When the clock goes high, the valW value is stored in the register with address dstW.

At any time we can read the values of two of the registers in the register file.
The srcA input determines which register is available at the valA outputs.
The srcB input determines which register is available at the valB outputs.
These are independent of the clock.
If you change the read inputs, the corresponding outputs will change after some propagation delay.
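
A behavioral sketch of the register file in Python. Reads are plain lookups that can happen at any time, while a write happens only when the rising clock edge is modeled by calling clock_rising_edge. The register count of 8 and the method names are assumptions made for this example.

```python
class RegisterFile:
    def __init__(self, num_registers=8):  # assumed register count
        self.regs = [0] * num_registers

    def read(self, srcA, srcB):
        # Combinational read ports: valA and valB follow srcA and srcB.
        return self.regs[srcA], self.regs[srcB]

    def clock_rising_edge(self, dstW, valW):
        # The write port only takes effect when the clock goes high.
        self.regs[dstW] = valW


if __name__ == "__main__":
    rf = RegisterFile()
    rf.clock_rising_edge(dstW=3, valW=42)
    print(rf.read(srcA=3, srcB=0))  # (42, 0)
```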

Register File Design


Random access memory can be built in a similar way, but it is usually built with a different technology, since a flip-flop design like this is expensive and uses a lot of power.