
Lectures 18

Designing a Central Processor Unit: The Controller: State Sequencing and Output Logic

DOC 112: Computer Hardware Lecture 18

Slide 1

Last lecture we defined the data paths:


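[Figure: data path diagram. Labelled elements: internal 32-bit bus; registers R0 to R6 with clocks c0 to c6 and register select s0 s1 s2; registers A (c7) and B (c8); ALU with Cin, Cout and function select f2 f1 f0; carry register C (c9) and carry control f5; shifter with select f4 f3; PC (c10) with its multiplexer (s3) and +1 incrementer; MAR (c11); IR (c12) with mask; memory with address, data in and data out; MDR (c14); control line c13; internal bus multiplexer with select s4 s5 s6. The controller generates c0 to c14, f0 to f5 and s0 to s6.]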


The instructions were also defined:

Bits:   31-24    23-20    19-0
Field:  Opcode   Rdest    Address


The next job is designing the controller


The controller's state sequence looks simple enough, but there is a problem: what should the input signal(s) be?

[Figure: state sequence diagram with the fetch states F1, F2, F3 and the execute states E1, E2, E3, E4.]


Determining the state sequences


The state sequencing depends on the instruction we are executing. For example, if we are executing a STORE instruction we will branch from E2 to F1. If we are executing a LOADINDIRECT instruction we will go all the way to E4 before returning to F1. This suggests that some complex sequencing logic is required.

The controller - State diagram


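We can try to get round this by designing a combinatorial circuit with one output, C, that tells us whether we continue to the next execution cycle or fetch another instruction.

[Figure: state diagram in which the transitions out of F3, E1, E2 and E3 are labelled with the value of C: C = 1 continues to the next state and C = 0 returns to F1. The combinatorial circuit producing C takes the IR bits (IR31, IR30, ...) and the state bits Q2, Q1, Q0 as inputs.]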

De-multiplexers to the rescue


A demultiplexer can be used to decode the top eight bits of the IR and give us an output line for each instruction. Only one output line is 1 at any time, indicating the instruction being executed.
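
As a rough software illustration of the decoding step, the sketch below extracts the top eight bits of a 32-bit IR value and produces one line per instruction, exactly one of which is 1. The opcode numbers in OPCODES are hypothetical (the slides do not give the real encoding), as is the function name.

```python
# Hypothetical opcode values: the real encoding is not given in these slides.
OPCODES = {"NOP": 0x00, "LOAD": 0x01, "STORE": 0x02, "ADD": 0x03}

def decode_instruction(ir):
    """Model the demultiplexer: one output line per instruction (one-hot)."""
    opcode = (ir >> 24) & 0xFF  # top eight bits of the 32-bit IR
    return {name: int(opcode == value) for name, value in OPCODES.items()}

# Opcode field 0x02 corresponds to STORE in the assumed table above.
lines = decode_instruction(0x02300010)
```

Each entry of the returned dictionary plays the role of one hardware output line, so it can be used directly as a Boolean variable in later equations.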


Instructions with equivalent sequences


There are several instructions that need the same sequence of register transfers (even though the function bits may differ). We can simply implement these from the output lines of the instruction decoder:

ADDS = ADD + SUBTRACT + AND + OR + XOR
SHIFTS = ASL + ASR + ROR

We can now do our state assignments


De-multiplexers to the rescue


We can now use a 3-to-8 demultiplexer to decode our states. We now have hardware lines that tell us both the state and the instruction or group of instructions. We can use these as Boolean variables in our hardware design!

The C input to the finite state machine


We can now simply write Boolean equations to define when the finite state machine needs to return to fetch a new instruction. For example, we can go through our register transfer tables and find all the instructions that need exactly 2 execution cycles. C must be 0 when one of these instructions is in E2, which gives the term:

(E2.(RETURN + SHIFTS + MOVE + JUMPINDIRECT))'

The C input to the finite state machine


If we proceed in the same way for all the states where we may branch back to F1, we get the following Boolean equation for C:

C = (F3.NOP)' . (E1.(SKIP + CLEAR + JUMP))' . (E2.(RETURN + SHIFTS + MOVE + JUMPINDIRECT))' . (E3.(COMP + DEC + INC + COMPARE + ADDS + STOREINDIRECT + LOAD))'

which we can easily implement with gates.
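
The equation for C can be tried out in software. The sketch below is only an illustration: it assumes that every bracketed term of the equation is complemented (so C is 0 exactly when the current state and instruction match one of the terms), and the function name and argument conventions are our own.

```python
def continue_signal(state, active):
    """Return C: 1 means continue to the next state, 0 means return to F1.

    state  -- name of the decoded state line that is 1, e.g. "E2"
    active -- set of decoded instruction (or group) lines that are 1
    """
    return_terms = {
        "F3": {"NOP"},
        "E1": {"SKIP", "CLEAR", "JUMP"},
        "E2": {"RETURN", "SHIFTS", "MOVE", "JUMPINDIRECT"},
        "E3": {"COMP", "DEC", "INC", "COMPARE", "ADDS",
               "STOREINDIRECT", "LOAD"},
    }
    # Only one state line is 1 at a time, so the product of complemented
    # terms reduces to the complement of the term for the current state.
    must_return = bool(return_terms.get(state, set()) & set(active))
    return int(not must_return)
```

For example, continue_signal("E2", {"MOVE"}) gives 0 (fetch the next instruction), while continue_signal("E2", {"LOAD"}) gives 1 (carry on to E3).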


We continue using our standard method


Giving us the following Karnaugh maps

D2 = C.Q2.Q1' + C.Q1.Q0'
D1 = C.Q1' + C.Q2.Q0' + Q2'.Q1'.Q0'
D0 = Q2'.Q1'.Q0' + Q2'.Q1.Q0 + C.Q2.Q1.Q0'

Further simplification
We can use the EOR simplification rule:

D0 = Q2'.Q1'.Q0' + Q2'.Q1.Q0 + C.Q2.Q1.Q0'
D0 = Q2'.(Q1 ⊕ Q0)' + C.Q2.Q1.Q0'

But, since we have already decoded the states, we will not bother with this.


Further simplification
Instead we can simplify the equations using the decoded states:

D2 = C.Q2.Q1' + C.Q1.Q0'
D1 = C.Q1' + C.Q2.Q0' + F1
D0 = F1 + F2 + C.E3
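
The next-state equations can be checked by simulation. The sketch below rests on two assumptions that are not confirmed by the surviving text: the state assignment in STATES, and the placement of the complements (') in the D2 and D1 terms. Both were chosen because they make the equations step through F1, F2, F3, E1, E2, E3, E4 in order. The function names are ours.

```python
# Assumed state assignment (Q2, Q1, Q0); the original table is not
# reproduced in the text, so treat these codes as an illustration.
STATES = {"F1": (0, 0, 0), "F2": (0, 1, 1), "F3": (0, 0, 1),
          "E1": (0, 1, 0), "E2": (1, 0, 0), "E3": (1, 1, 0),
          "E4": (1, 1, 1)}
NAMES = {bits: name for name, bits in STATES.items()}

def next_state(q, c):
    """Apply the D flip-flop input equations to state bits q = (Q2, Q1, Q0)."""
    q2, q1, q0 = q
    n2, n1, n0 = 1 - q2, 1 - q1, 1 - q0   # complemented outputs
    f1 = n2 & n1 & n0                      # decoded state lines
    f2 = n2 & q1 & q0
    e3 = q2 & q1 & n0
    d2 = (c & q2 & n1) | (c & q1 & n0)     # assumed complement placement
    d1 = (c & n1) | (c & q2 & n0) | f1     # assumed complement placement
    d0 = f1 | f2 | (c & e3)
    return (d2, d1, d0)

def run(c_values, start="F1"):
    """Clock the machine once per entry of c_values; return the state names."""
    q = STATES[start]
    trace = [start]
    for c in c_values:
        q = next_state(q, c)
        trace.append(NAMES.get(q, "unused"))
    return trace
```

With C held at 1, run([1] * 7) visits every state once and comes back to F1; dropping C to 0 in E2, as in run([1, 1, 1, 1, 0]), returns to F1 early. Under the same assumptions the one unused code, (1, 0, 1), leads to E3 when C is 1.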


The final circuit is simpler than expected!


Start Up
We did not check whether the circuit will be safe at start up, but it is. We will need to add extra hardware to make the processor do something particular at start up, (and maybe also on a signal from a reset button), so the design will be safe in any case.


The output Logic


We have now successfully designed the state sequencing logic, and all that remains is to design the output logic. Recall that the Moore machine had no connection between the inputs and the output logic. This is a safer design methodology. However, for the processor we use the Mealy machine (the inputs go to the output logic).

The output logic of the controller


The output logic is a huge combinatorial design problem. The inputs are the states (F1, F2 etc.) and the instructions (LOAD, STORE etc.), which we have already decoded. The outputs are the clock controls (c0, c1, c2 etc.), the arithmetic function select lines (f0, f1 etc.) and the multiplexer select lines (s0, s1 etc.).

Clock Gates
The clock gate signals c0 to c8 determine which register is loaded at each cycle. The MAR will use this typical gating circuit:


Gating The MAR


To determine when the MAR should be loaded we need to look through all the register transfer tables. This gives us an equation for CMAR:

CMAR = F1 + E1.(LOAD + STORE) + E2.(LOADINDIRECT + STOREINDIRECT)

Using don't care states


However, the only time we need the MAR to be correct is before we load the MDR. At other times we can load it without disturbing the execution. Thus we can simplify the equation:

CMAR = F1 + E1 + E2
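
A quick way to convince ourselves that the simplification is harmless is to compare the two equations over every state and instruction. The sketch below does this for a small, illustrative instruction list (not the full instruction set); the function names are ours.

```python
def c_mar_full(state, instr):
    """MAR clock taken directly from the register transfer tables."""
    return int(state == "F1"
               or (state == "E1" and instr in ("LOAD", "STORE"))
               or (state == "E2" and instr in ("LOADINDIRECT", "STOREINDIRECT")))

def c_mar_simple(state, instr):
    """Simplified MAR clock: load in F1, E1 and E2 whatever the instruction."""
    return int(state in ("F1", "E1", "E2"))

STATE_NAMES = ["F1", "F2", "F3", "E1", "E2", "E3", "E4"]
# Illustrative subset of instructions only.
INSTRUCTIONS = ["LOAD", "STORE", "LOADINDIRECT", "STOREINDIRECT", "MOVE", "NOP"]

# Every load demanded by the full equation is still present in the simple one.
covered = all(c_mar_simple(s, i) >= c_mar_full(s, i)
              for s in STATE_NAMES for i in INSTRUCTIONS)
```

The simplified version only adds extra loads (for example in E1 of MOVE), which is exactly the don't care behaviour described above.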


The MDR Clock


The same procedure is followed for all the other register clocks. From the register transfers we find:

CMDR = F2 + E2.LOAD + E3.LOADINDIRECT

The MDR (loaded in F2) is needed in cycle 3 by the CALL instruction, but only LOADINDIRECT uses it after E3, so we can simplify the equation to:

CMDR = F2 + E2.LOAD + E3

The Register Clocks


Bits:   31-24    23-20    19-16    15-0
Field:  Opcode   Rdest    Rsrc     Unused

The register to be clocked is recorded in the IR bits 20-22. The condition for any register (Rdest) to receive a clock edge is:

CRdest = E4 + E3.(LOAD + ADD + INC + DEC + COMP) + E2.(ASL + MOVE + CALL + CALLINDIRECT) + E1.CLEAR

It cannot be simplified further.

The Register clocks


A decoder is required to determine which register is clocked. A four bit decoder is required if we expand the design to 16 registers.
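
As a sketch of how this decoder behaves, the function below takes the IR and the CRdest condition and returns the individual register clock enables. It assumes a three bit decoder on IR bits 20-22, as stated for the register clocks; the function name is ours, and the eighth output is unused because the design has only R0 to R6.

```python
def register_clocks(ir, c_rdest):
    """Return eight clock enable lines; at most one of them is 1."""
    index = (ir >> 20) & 0b111  # IR bits 22..20 select the register
    return [int(bool(c_rdest) and n == index) for n in range(8)]

# Bits 22..20 of this IR value are 011, so R3 is the destination.
enables = register_clocks(0x00300000, 1)
```

When CRdest is 0 every enable is 0, so no register receives a clock edge.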


The Shifter Function


The shifter function is defined as follows.

The control bits are defined by the equations:

f4 = ASR + ROR
f3 = ASL + ROR

00 is the default function.
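
The two equations imply the codes f4 f3 = 01 for ASL, 10 for ASR and 11 for ROR. The sketch below models a shifter driven by those codes. It assumes 32-bit values, shifts of one bit position, and that the default code 00 passes the value through unchanged; none of these details is confirmed by the text, and the function names are ours.

```python
MASK32 = 0xFFFFFFFF

def shifter_select(instr):
    """Control bits from the decoded instruction: f4 = ASR+ROR, f3 = ASL+ROR."""
    f4 = int(instr in ("ASR", "ROR"))
    f3 = int(instr in ("ASL", "ROR"))
    return f4, f3

def shifter(value, f4, f3):
    """One-bit shifter model (assumed behaviour for each code)."""
    value &= MASK32
    if (f4, f3) == (0, 0):                       # default: pass through
        return value
    if (f4, f3) == (0, 1):                       # ASL: shift left, drop bit 31
        return (value << 1) & MASK32
    if (f4, f3) == (1, 0):                       # ASR: keep the sign bit
        return (value >> 1) | (value & 0x80000000)
    return (value >> 1) | ((value & 1) << 31)    # ROR: bit 0 wraps to bit 31
```

For instance, shifter(0x80000001, *shifter_select("ASL")) drops the top bit and returns 0x00000002.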

The ALU Function

f2 = E3.(COMP + OR + AND) + E2.(COMP + DEC)
f1 = E3.(SUBTRACT + COMPARE + DEC + INC + ADD + AND) + E2.(COMP + DEC)
f0 = E3.(DEC + INC + ADD + OR) + E2.(COMP + DEC)

The carry in bit


The default will be 0. The only place that a carry in of 1 is required is in E3 of INC. Thus:

f5 = INC.E3

The multiplexer selection bits


The multiplexer selections are defined as follows:


The internal bus selector: s6 s5 s4


First we need to look at the register transfer tables to determine when the different paths are selected. Using, for example, SPC to mean the condition when the PC is selected, we find:

SPC = E2.(CALL + CALLINDIRECT)
SALU = E1.CLEAR + (E2 + E3).(INC + DEC + COMP) + TWO.E3
SMask = E1.(LOAD + JUMP + STORE) + E3.CALL
SMDR = LOAD.E3 + E4

The internal bus selector: s6 s5 s4


Using the unallocated selections as don't cares we can write:

s4 = SALU + SMDR
s5 = SPC
s6 = SMask + SMDR
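
To see which selection code each path ends up with, the three equations can be evaluated directly. In the sketch below the function name and keyword arguments are ours, and the resulting codes are simply what the equations imply; they are not taken from the selection table itself.

```python
def bus_select(s_pc=0, s_alu=0, s_mask=0, s_mdr=0):
    """Return (s6, s5, s4) from the path conditions SPC, SALU, SMask, SMDR."""
    s4 = s_alu | s_mdr
    s5 = s_pc
    s6 = s_mask | s_mdr
    return (s6, s5, s4)

codes = {
    "ALU": bus_select(s_alu=1),
    "PC": bus_select(s_pc=1),
    "Mask": bus_select(s_mask=1),
    "MDR": bus_select(s_mdr=1),
    "none": bus_select(),
}
```

Only the MDR sets two select lines at once; when none of the four conditions holds, all three lines are 0 and the multiplexer falls back to its remaining default input.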


The register selector


We can find the conditions defining the register selector from places in the register transfer tables where A or B are loaded. Sometimes the register to be selected is the source (Rsrc: bits 19-16), sometimes it is the destination (Rdest: bits 23-20), and sometimes it is the internal bus.

SRsrc = E1.(INDIRECT + TWO)
SBus = E2.ONE

We will use SRdest = (SRsrc + SBus)'

(INDIRECT, ONE, and TWO are Boolean variables indicating the instruction type.)

The register selector


The selection is done by a multiplexer, with an additional set of gates to impose the SBus condition.

The PC selector
Last but not least, we can get the conditions for the PC selector from the register transfer tables:

s3 = F1 + E1.(CALL + CALLINDIRECT)

How did we do?


We can now make a wiring list, buy the components from Maplin and test it. The components will cost 200-300 (over twice the price of an Intel Core 2). The clock could be set at about 10 kHz (a bit faster if we fabricate it on a single chip). So it looks as if we had better consider the Mark 2 version straight away.

Improvements
All instructions are 32 bit, but mostly the bottom 16 bits are empty. This means that we are wasting memory space and doing many more fetch cycles than we need. We could pack up the instructions on byte boundaries and introduce some multiplexing hardware to load the IR correctly.


More Arithmetic hardware


We have three unused inputs on the multiplexer that selects the internal bus. Additional arithmetic hardware could include:
A sixteen bit multiplier (multiply the bottom 16 bits of A and B to obtain a 32 bit result)
An incrementer
A decrementer

Other functionality
A circuit to test if the result (or internal bus) was zero would enable us to provide a SKIP_EQUAL instruction. (The software department would be very keen to have this). This would require a 32 bit OR gate and a single bit register.
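
A minimal sketch of such a zero test is given below. It assumes that the flag is the complement of the 32 bit OR of the bus lines, so that it is 1 when the value is zero; the function name is hypothetical, and the single bit register that would hold the flag is not modelled.

```python
def zero_flag(bus_value):
    """Return 1 if all 32 bus bits are 0, otherwise 0."""
    any_bit_set = 0
    for bit in range(32):
        any_bit_set |= (bus_value >> bit) & 1  # one input of the 32 bit OR gate
    return 1 - any_bit_set                     # assumed inversion of the OR output
```

A SKIP_EQUAL instruction could then test this flag, for example after a subtraction of the two values being compared.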


More Multiplexers
Additional multiplexers could help us to reduce the instruction cycles of many instructions. For instance a multiplexer to select the input to B independently of A would reduce many three cycle instructions to two cycles.


More Data Paths


A data path from the registers to the internal bus would reduce some instructions by one cycle. This would require an additional input on the bus selector multiplexer, and so might be considered an alternative to the additional arithmetic functions already discussed.


Optimised Combinational logic


This is the hard part. We want to have the minimum time delays in all our combinational logic. This is partly a question of path length, but it also requires looking at low level transistor models to calculate the times accurately.

And that is the end of the course



- well nearly!
Coursework 2 will be the first lab exercise of next term. There will be a revision session before the exam at the start of the summer term. Watch the web page for the time and venue.


In the meantime
Have a great Christmas!
