The Processor
1. An overview of the implementation of instructions.
The memory-reference instructions: load word (lw) and store
(sw).
The arithmetic-logical instructions: add, sub, and, or, slt,
etc.
The branch instructions: branch equal (beq) and jump (j).
For every instruction, the first two steps are identical:
- Send the program counter (PC) to the memory that contains the code and fetch the instruction from that memory.
- Read one or two registers, using fields of the instruction to select the registers to read. For the load word instruction, we need to read only one register, but most other instructions require that we read two registers.
After these two steps, the actions required to complete the
instruction depend on the instruction class.
For example:
I. All instruction classes, except jump, use the arithmetic-logical unit (ALU) after reading the registers.
II. The memory-reference instructions use the ALU for an address calculation, the arithmetic-logical instructions for the operation execution, and branches for comparison.
III. A memory-reference instruction needs to access the memory, either to read data for a load or to write data for a store.
IV. An arithmetic-logical instruction must write the data from the ALU back into a register.
V. For a branch instruction, we may need to change the next instruction address based on the comparison.
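The per-class behaviour listed above can be summarised as a small lookup table. This Python sketch is illustrative only: the names `INSTRUCTION_CLASSES` and `uses_alu` are made up for this example, and the entries simply restate the list.

```python
# Hypothetical summary table: what each instruction class does after
# the two common steps (instruction fetch and register read).
INSTRUCTION_CLASSES = {
    "memory-reference": {"alu_role": "address calculation", "accesses_memory": True},
    "arithmetic-logical": {"alu_role": "operation execution", "accesses_memory": False},
    "branch": {"alu_role": "comparison", "accesses_memory": False},
    "jump": {"alu_role": None, "accesses_memory": False},
}


def uses_alu(instruction_class):
    """Return True if the class uses the ALU after reading the registers."""
    return INSTRUCTION_CLASSES[instruction_class]["alu_role"] is not None
```

For instance, `uses_alu("jump")` is false while `uses_alu("branch")` is true, matching point I.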
An abstract view of the implementation of the MIPS subset, showing the major functional units and the major connections between them.
In several places, a particular unit has connections coming from two different sources. To resolve this, a multiplexor (MUX), also known as a data selector, is used to select from among several inputs based on the setting of its control lines.
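TOP MULTIPLEXOR: Controls what value replaces the PC (PC + 4 or the branch destination address).
MIDDLE MULTIPLEXOR: Selects whether the ALU output (for an arithmetic-logical instruction) or the data memory output (for a load) is written into the register file.
BOTTOM MULTIPLEXOR: Determines whether the second ALU input comes from the registers (for an arithmetic-logical instruction or a branch) or from the offset field of the instruction (for a load or store).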
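As a rough illustration of how a multiplexor selects between two sources, the sketch below models the top multiplexor in Python. The function names `mux` and `next_pc` are hypothetical, and the sketch assumes that a beq is taken when the ALU comparison reports a zero result.

```python
def mux(select, input0, input1):
    """Two-input multiplexor: pass input1 when select is true, else input0."""
    return input1 if select else input0


def next_pc(pc, branch_target, is_branch, alu_zero):
    """Top multiplexor: choose between PC + 4 and the branch target.

    Assumption: the branch is taken only when the instruction is a branch
    AND the ALU reports a zero result (the two registers were equal).
    """
    take_branch = is_branch and alu_zero
    return mux(take_branch, pc + 4, branch_target)
```

With `pc = 100` and `branch_target = 200`, a taken branch yields 200, while a branch that is not taken, or any non-branch instruction, yields 104.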
2. Logic Design Conventions
There are two different types of logic elements:
a. Elements that operate on data values: Combinational Elements.
b. Elements that contain state: State Elements.
· Combinational Elements
Given the same input, a combinational element always produces the same output.
· State Elements
A state element has at least two inputs and one output. The required inputs are the data value to be written into the element and the clock, which determines when the data is written. The output provides the value that was written in an earlier clock cycle.
· Clocking Methodology
A clocking methodology defines when signals
can be read and when they can be written.
i) Edge-triggered clocking methodology
Any values stored in a sequential logic element are updated only on a clock edge. This allows us to read the contents of a register, send the value through some combinational logic, and write that register in the same clock cycle.
ii) Control Signal
We do not show a write control signal when a state element is written on every active clock edge. If a state element is not updated on every clock, then an explicit write control signal is required.
Nearly all of these state and logic elements will have inputs and outputs that are 32 bits wide, since that is the width of most of the data handled by the processor.
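The difference between the two kinds of elements, and the effect of edge-triggered clocking, can be sketched in a few lines of Python. This is a simplified software model, not hardware: the class `Register`, its methods, and the function `increment` are names invented for this example.

```python
MASK32 = 0xFFFFFFFF  # values are kept to 32 bits, the width used in the text


def increment(x):
    """Combinational element: the same input always gives the same output."""
    return (x + 1) & MASK32


class Register:
    """Edge-triggered state element with an explicit write control signal."""

    def __init__(self, value=0):
        self.value = value & MASK32  # output: value written on an earlier edge
        self._next = self.value      # data input waiting for the next edge
        self._write = False          # write control signal

    def set_input(self, data, write=True):
        """Present a data value; the output does not change yet."""
        self._next = data & MASK32
        self._write = write

    def clock_edge(self):
        """On the active clock edge, store the input if write is asserted."""
        if self._write:
            self.value = self._next
        self._write = False
```

Reading a register holding 5, passing it through `increment`, and calling `set_input` leaves the output at 5 until `clock_edge()` is called, after which it reads 6. If `write=False` is passed, the edge leaves the stored value unchanged.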
3. Building A Datapath
Let’s start off by looking at the datapath elements:
· State Elements:
The Program Counter (PC) holds the address of the current instruction.
The memory unit stores the instructions of a program and supplies instructions given an address.
· Adder:
This adder is combinational and can be built from the ALU simply by wiring the control lines so that the control always specifies an add operation. It is used to increment the PC to the address of the next instruction.
To execute any instruction, we must start by fetching the
instruction from memory. To prepare for executing the next instruction, we must
also increment the program counter so that it points at the next instruction, 4
bytes later.
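The fetch step can be sketched as follows. This is a minimal model in which the instruction memory is a Python dictionary keyed by byte address; the function name `fetch` and the sample program are assumptions made for illustration.

```python
def fetch(instruction_memory, pc):
    """Return the instruction at address pc together with the incremented PC."""
    instruction = instruction_memory[pc]  # memory supplies the instruction for an address
    next_pc = pc + 4                      # adder: the next instruction is 4 bytes later
    return instruction, next_pc


# Hypothetical three-instruction program: byte address -> instruction text.
program = {
    0: "lw $t0, 0($s0)",
    4: "add $t1, $t0, $t0",
    8: "sw $t1, 4($s0)",
}
```

Calling `fetch(program, 0)` returns the `lw` instruction and a new PC of 4, which can then be used to fetch the `add`.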
· Register File:
A register file is a collection of registers in which any register can be read or written by specifying the number of the register in the file. The register file contains the register state of the machine. In addition, we will need an ALU to operate on the values read from the registers.
Because the R-format instructions have three register operands, we will need to read two data words from the register file and write one data word into the register file for each instruction.
For each data word read from the registers, we need an input to the register file that specifies the register number to be read and an output from the register file that will carry the value that has been read.
· ALU:
The ALU takes two 32-bit inputs and produces a 32-bit result, as well as a 1-bit output that indicates whether the result is zero. The operation to be performed by the ALU is controlled with the ALU operation signal.
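To make the register file and ALU descriptions concrete, here is a small Python sketch of an R-format instruction reading two registers, running the ALU, and writing one result back. All names (`RegisterFile`, `alu`, `execute_r_format`, `reg_write`) are hypothetical. The operation is passed as a string for readability, whereas real hardware uses encoded control lines, and slt is left out to keep the example short.

```python
MASK32 = 0xFFFFFFFF  # keep all values 32 bits wide


class RegisterFile:
    """32 registers with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        """Return the two data words selected by the two register numbers."""
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, write_reg, data, reg_write=True):
        """Write one data word if the (assumed) reg_write control is asserted."""
        if reg_write and write_reg != 0:  # MIPS register 0 always reads as zero
            self.regs[write_reg] = data & MASK32


def alu(operation, a, b):
    """Return (32-bit result, zero flag) for the named operation."""
    if operation == "add":
        result = a + b
    elif operation == "sub":
        result = a - b
    elif operation == "and":
        result = a & b
    elif operation == "or":
        result = a | b
    else:
        raise ValueError(f"unsupported operation: {operation}")
    result &= MASK32
    return result, result == 0


def execute_r_format(rf, operation, rs, rt, rd):
    """Read registers rs and rt, run the ALU, and write the result to rd."""
    a, b = rf.read(rs, rt)
    result, zero = alu(operation, a, b)
    rf.write(rd, result)
    return result, zero
```

If registers 8 and 9 both hold 7, then `execute_r_format(rf, "sub", 8, 9, 10)` returns `(0, True)`: the result is zero, so the 1-bit zero output is set, which is what a branch comparison relies on.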
4. Pipelining
Pipelining is an
implementation technique in which multiple instructions are overlapped in
execution.
We use a laundry analogy for pipelining. In the first picture, one load of laundry takes 1.5 hours to complete. Completing all four loads one after another takes 1.5 x 4 = 6 hours, which is too long. Therefore, pipelining is needed.
The second picture shows the process and the advantage of pipelining: each load starts as soon as the previous one has moved on to the next step, so the four loads now take only 3 hours, half of the original time.
We can turn the pipelining speedup discussion above into a
formula. If the stages are perfectly balanced, then the time between
instructions on the pipelined processor is:
Time between instructions (pipelined) = Time between instructions (nonpipelined) / Number of pipe stages
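The formula and the laundry numbers can be checked with a short Python sketch. It assumes three perfectly balanced 30-minute stages, which is what the 1.5-hour load and the 3-hour pipelined total imply; the function names are invented for this example, and times are in minutes.

```python
def time_between_pipelined(time_nonpipelined, stages):
    """Time between instructions (or loads) when stages are perfectly balanced."""
    return time_nonpipelined / stages


def total_time_sequential(n_tasks, time_per_task):
    """Total time when each task finishes before the next one starts."""
    return n_tasks * time_per_task


def total_time_pipelined(n_tasks, time_per_task, stages):
    """Total time when a new task enters the pipeline every stage time.

    The first task needs all stages; each later task finishes one
    stage time after the previous one.
    """
    stage_time = time_between_pipelined(time_per_task, stages)
    return (stages + n_tasks - 1) * stage_time
```

With a 90-minute load and 3 stages, a new load completes every 30 minutes. Four loads take 360 minutes (6 hours) sequentially and 180 minutes (3 hours) pipelined. The speedup here is 2 rather than 3 because four loads are too few to keep the pipeline full for long.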
· Pipeline Hazards
There are situations in pipelining when the next instruction cannot execute in the following clock cycle. These events are called hazards.
1. Structural Hazards
A structural hazard occurs when the hardware cannot support the combination of instructions that we want to execute in the same clock cycle.
2. Data Hazards
Data hazards occur when the pipeline must be stalled because one step must wait for another to complete. They arise from the dependence of one instruction on an earlier one that is still in the pipeline.
To solve this, forwarding (also called bypassing) is used. This is a method of resolving a data hazard by retrieving the missing data element from internal buffers rather than waiting for it to arrive from programmer-visible registers or memory.
3. Control Hazards
Control hazards arise from the need to make a decision based on the results of one instruction while others are executing.
There are two solutions to this:
· Stall
Operate sequentially: wait until the outcome of the branch is known before fetching the next instruction. This conservative option certainly works, but it is slow.
· Predict
Always predict that branches will be untaken. When you are right, the pipeline proceeds at full speed. Only when branches are taken does the pipeline stall.
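The cost difference between the two control-hazard options can be illustrated with a simple count of stall cycles. This Python sketch is a simplification under stated assumptions: each stalled branch costs a fixed `penalty` of one cycle by default, and the function name `branch_stall_cycles` and its policy strings are made up for this example.

```python
def branch_stall_cycles(branch_outcomes, policy, penalty=1):
    """Count the stall cycles caused by a sequence of branches.

    branch_outcomes: list of booleans, True meaning the branch was taken
    policy: "stall" or "predict-not-taken"
    penalty: assumed number of cycles lost each time the pipeline stalls
    """
    if policy == "stall":
        # Every branch waits for its outcome, whether taken or not.
        return penalty * len(branch_outcomes)
    if policy == "predict-not-taken":
        # Only taken branches (wrong predictions) cost cycles.
        return penalty * sum(1 for taken in branch_outcomes if taken)
    raise ValueError(f"unknown policy: {policy}")
```

For five branches of which two are taken, stalling on every branch costs 5 cycles, while predicting untaken costs only 2.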