UTeM student computer organization and architecture : THE PROCESSOR

The Processor

1. An overview of the implementation of instructions.

The memory-reference instructions: load word (lw) and store word (sw).

The arithmetic-logical instructions: add, sub, and, or, slt, etc.

The branch instructions: branch on equal (beq) and jump (j).

For every instruction, the first two steps are identical:

Send the program counter (PC) to the memory that contains the code and fetch the instruction from that memory.
Read one or two registers, using fields of the instruction to select the registers to read. For the load word instruction, we need to read only one register, but most other instructions require that we read two registers.
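The second step depends on pulling register numbers out of the instruction word. A minimal sketch of that field extraction, assuming the standard 32-bit MIPS field layout; the function name `decode` and the dictionary keys are illustrative choices, not part of the original notes:

```python
def decode(instruction: int) -> dict:
    """Split a 32-bit MIPS instruction word into its standard fields."""
    return {
        "opcode": (instruction >> 26) & 0x3F,  # bits 31..26: instruction class
        "rs": (instruction >> 21) & 0x1F,      # bits 25..21: first register to read
        "rt": (instruction >> 16) & 0x1F,      # bits 20..16: second register
        "rd": (instruction >> 11) & 0x1F,      # bits 15..11: destination (R-format)
        "funct": instruction & 0x3F,           # bits 5..0: operation (R-format)
        "imm": instruction & 0xFFFF,           # bits 15..0: offset (lw, sw, beq)
    }


# add $t0, $s1, $s2 reads two registers ($s1 = 17, $s2 = 18).
add_fields = decode(0x02324020)

# lw $t0, 32($s3) reads only the base register ($s3 = 19).
lw_fields = decode(0x8E680020)
```

For the add, `rs` and `rt` name the two registers to read. For the lw, only `rs` is read as a source, and `rt` names the register that receives the loaded word.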

After these two steps, the actions required to complete the instruction depend on the instruction class.

For example:

I. All instruction classes, except jump, use the arithmetic-logical unit (ALU) after reading the registers.

II. The memory-reference instructions use the ALU for an address calculation, the arithmetic-logical instructions for the operation execution, and branches for comparison.

III. A memory-reference instruction will need to access the memory, either to read data for a load or to write data for a store.

IV. An arithmetic-logical instruction must write the data from the ALU back into a register.

V. For branch instructions, we may need to change the next instruction address based on the comparison.

An abstract view of the implementation of the MIPS subset showing the major functional units and the major connections between them.

In several places, data going to a particular unit comes from two different sources. To solve this problem, a multiplexor (MUX), also known as a data selector, is used to select from among several inputs based on the setting of its control lines.

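TOP MULTIPLEXOR: Controls what value replaces the PC: PC + 4 or the branch target address. Its select line is the AND of the ALU's Zero output and a control signal that indicates the instruction is a branch.

MIDDLE MULTIPLEXOR: Steers either the ALU output (for an arithmetic-logical instruction) or the data memory output (for a load) to the register file for writing.

BOTTOM MULTIPLEXOR: Determines whether the second ALU input comes from the register file (for an arithmetic-logical instruction or a branch) or from the offset field of the instruction (for a load or store).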
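As a rough illustration of what a multiplexor does, the sketch below models a two-input MUX and uses it for the PC selection. It assumes the usual MIPS arrangement in which the select line is the AND of a branch control signal and the ALU Zero output; the names `mux`, `next_pc`, `is_branch`, and `alu_zero` are hypothetical.

```python
def mux(select: int, input0: int, input1: int) -> int:
    """Two-input multiplexor: pass input1 when select is 1, otherwise input0."""
    return input1 if select else input0


def next_pc(pc: int, branch_target: int, is_branch: int, alu_zero: int) -> int:
    """Choose the value that replaces the PC (assumed top-multiplexor behavior)."""
    pc_plus_4 = pc + 4
    take_branch = is_branch & alu_zero  # AND gate feeding the select line
    return mux(take_branch, pc_plus_4, branch_target)
```

With `pc = 100` and `branch_target = 200`, a beq whose comparison succeeds selects 200, while any other case selects 104.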

2. Logic Design Conventions

There are two different types of logic elements:

a. Elements that operate on data values: Combinational Elements.

b. Elements that contain state: State Elements.

· Combinational Elements

Given the same input, a combinational element always produces the same output.

· State Elements

A state element has at least two inputs and one output. The required inputs are the data value to be written into the element and the clock, which determines when the data is written. The output provides the value that was written in an earlier clock cycle.

· Clocking Methodology

A clocking methodology defines when signals can be read and when they can be written.

i) Edge-triggered clocking methodology

Any values stored in a sequential logic element are updated only on a clock edge. This allows us to read the contents of a register, send the value through some combinational logic, and write that register in the same clock cycle.

ii) Control Signal

We do not show a write control signal when a state element is written on every active clock edge. If a state element is not updated on every clock, then an explicit write control signal is required.
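The clocking rules above can be sketched in software. This is a simplified model, not real hardware timing: the class name `Register`, the `data_in` attribute, and the `write_enable` argument are assumptions made for illustration.

```python
class Register:
    """Edge-triggered state element: the stored value changes only on a clock edge."""

    def __init__(self, value: int = 0):
        self.value = value    # output: the value written on an earlier clock edge
        self.data_in = value  # input: the value waiting to be written

    def clock_edge(self, write_enable: bool = True) -> None:
        """Model one active clock edge; update only if the write control is asserted."""
        if write_enable:
            self.value = self.data_in


# Read the register, send the value through combinational logic (+1),
# and write the same register in the same clock cycle.
counter = Register(0)
counter.data_in = counter.value + 1  # output is still 0 until the edge arrives
counter.clock_edge()                 # now the stored value becomes 1
```

Calling `clock_edge(write_enable=False)` leaves the stored value unchanged, which corresponds to a state element that needs an explicit write control signal.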

Nearly all of these state and logic elements will have inputs and outputs that are 32 bits wide, since that is the width of most of the data handled by the processor.

3. Building A Datapath

Let’s start off by looking at the datapath elements:

· State Element:

The Program Counter (PC) holds the address of the current instruction.

The instruction memory unit stores the instructions of a program and supplies an instruction given an address.

· Adder:

An adder is needed to increment the PC to the address of the next instruction. This adder is combinational and can be built from the ALU simply by wiring the control lines so that the control always specifies an add operation.

To execute any instruction, we must start by fetching the instruction from memory. To prepare for executing the next instruction, we must also increment the program counter so that it points at the next instruction, 4 bytes later.
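The fetch step can be sketched as follows. The dictionary standing in for instruction memory and the function name `fetch` are illustrative assumptions; the two instruction words are example encodings of `add $t0, $s1, $s2` and `lw $t0, 32($s3)`.

```python
def fetch(instruction_memory: dict, pc: int) -> tuple:
    """Fetch the instruction at address pc and compute the incremented PC."""
    instruction = instruction_memory[pc]  # instruction memory read
    next_pc = pc + 4                      # adder: next instruction is 4 bytes later
    return instruction, next_pc


# Hypothetical two-instruction program, keyed by byte address.
program = {0: 0x02324020, 4: 0x8E680020}

instruction, pc = fetch(program, 0)
```

After this call, `pc` holds 4, the address of the second instruction.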

· Register File:

A register file is a collection of registers in which any register can be read or written by specifying the number of the register in the file. The register file contains the register state of the machine. In addition, we will need an ALU to operate on the values read from the registers.

Because the R-format instructions have three register operands, we will need to read two data words from the register file and write one data word into the register file for each instruction.

For each data word read from the registers, we need an input to the register file that specifies the register number to be read and an output from the register file that will carry the value that has been read from the registers.
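A minimal software model of such a register file, with two read ports and one write port, might look like this. The class and parameter names are hypothetical, and the treatment of register 0 as always zero is an assumption taken from the MIPS convention rather than from the notes above.

```python
class RegisterFile:
    """32 registers of 32 bits, with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1: int, read_reg2: int) -> tuple:
        """Return the two data words selected by the two register numbers."""
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, write_reg: int, write_data: int, reg_write: bool) -> None:
        """Write one data word when the write control signal is asserted."""
        if reg_write and write_reg != 0:  # assumption: register 0 stays 0 (MIPS $zero)
            self.regs[write_reg] = write_data & 0xFFFFFFFF  # keep 32 bits


rf = RegisterFile()
rf.write(17, 5, reg_write=True)  # $s1 = 5
rf.write(18, 7, reg_write=True)  # $s2 = 7
operands = rf.read(17, 18)       # the two data words an R-format instruction reads
```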

· ALU:

The ALU takes two 32-bit inputs and produces a 32-bit result, plus a 1-bit Zero output that is asserted when the result is 0. The operation to be performed by the ALU is selected by the ALU operation control signal.
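The behavior described above can be sketched as a function that returns both the 32-bit result and the Zero output. The operation is named with a string here to avoid committing to a particular control-line encoding; the function names and the set of operations are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def alu(operation: str, a: int, b: int) -> tuple:
    """Return (32-bit result, zero flag) for the selected operation."""
    if operation == "and":
        result = a & b
    elif operation == "or":
        result = a | b
    elif operation == "add":
        result = (a + b) & MASK32
    elif operation == "sub":
        result = (a - b) & MASK32
    elif operation == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError(f"unsupported ALU operation: {operation}")
    return result, int(result == 0)
```

A beq comparison can be modeled as `alu("sub", x, y)`: the Zero output is 1 exactly when the two operands are equal.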

4. Pipelining

Pipelining is an implementation technique in which multiple instructions are overlapped in execution.

Using the laundry analogy for pipelining, the picture above shows that one complete load of laundry takes 1.5 hours. Doing all four loads one after another takes 1.5 x 4 = 6 hours, which is too long. Pipelining shortens this.

This picture shows the same four loads with pipelining. The advantage is that the total time required is now only 3 hours, half of the original.
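The 6-hour and 3-hour totals can be checked with a short calculation. This sketch assumes each load passes through three stages of 0.5 hours each, which is consistent with the 1.5 hours per load stated above but is not spelled out in the notes; the function names are illustrative.

```python
def sequential_hours(loads: int, stages: int, stage_hours: float) -> float:
    """Each load finishes all of its stages before the next load starts."""
    return loads * stages * stage_hours


def pipelined_hours(loads: int, stages: int, stage_hours: float) -> float:
    """The first load takes all stages; each later load finishes one stage-time after."""
    return (stages + loads - 1) * stage_hours


# Assumed: 4 loads, 3 stages, 0.5 hours per stage.
without_pipelining = sequential_hours(4, 3, 0.5)  # 6.0 hours
with_pipelining = pipelined_hours(4, 3, 0.5)      # 3.0 hours
```

The gain grows with the number of loads: with many loads, the pipelined total approaches one stage-time per load.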

We can turn the pipelining speedup discussion above into a formula. If the stages are perfectly balanced, then the time between instructions on the pipelined processor is:

$$\text{Time between instructions}_{\text{pipelined}} = \frac{\text{Time between instructions}_{\text{nonpipelined}}}{\text{Number of pipe stages}}$$
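A quick numeric check of the formula, under the same perfectly balanced assumption. The 800 ps, five-stage figures are hypothetical values chosen for illustration, and the function name is not from the original notes.

```python
def pipelined_time_between_instructions(nonpipelined_time: float, pipe_stages: int) -> float:
    """Ideal time between instructions when all stages are perfectly balanced."""
    if pipe_stages <= 0:
        raise ValueError("number of pipe stages must be positive")
    return nonpipelined_time / pipe_stages


# Hypothetical processor: 800 ps per instruction without pipelining, 5 stages.
ideal_ps = pipelined_time_between_instructions(800, 5)  # 160.0 ps

# Laundry analogy: 1.5 hours per load, assumed 3 stages.
ideal_hours = pipelined_time_between_instructions(1.5, 3)  # 0.5 hours
```

Real pipelines rarely reach this ideal, because the stages are not perfectly balanced and the clock must accommodate the slowest stage.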

· Pipeline Hazards

There are situations in pipelining when the next instruction cannot execute in the following clock cycle. These situations are called hazards.

1. Structural Hazards

A structural hazard occurs when the hardware cannot support the combination of instructions that we want to execute in the same clock cycle.

2. Data Hazards

A data hazard occurs when the pipeline must be stalled because one step must wait for another to complete. Data hazards arise from the dependence of one instruction on an earlier one that is still in the pipeline.

To solve this, forwarding or bypassing is needed. This is a method of resolving a data hazard by retrieving the missing data element from internal buffers rather than waiting for it to arrive from programmer-visible registers or memory.
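To make the dependence concrete, the sketch below checks whether one instruction reads a register that the instruction just before it writes, which is the situation forwarding is meant to handle. The `Instr` record and the function name are hypothetical, and the check is deliberately simplified: it only compares register names and ignores pipeline timing.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    text: str                 # assembly text, for readability only
    dest: str                 # register written by the instruction
    sources: Tuple[str, ...]  # registers read by the instruction


def depends_on(producer: Instr, consumer: Instr) -> bool:
    """True if consumer reads the register that producer writes."""
    return producer.dest in consumer.sources


add = Instr("add $s0, $t0, $t1", dest="$s0", sources=("$t0", "$t1"))
sub = Instr("sub $t2, $s0, $t3", dest="$t2", sources=("$s0", "$t3"))

hazard = depends_on(add, sub)  # sub needs $s0 before add has written it back
```

Here the sub depends on the add through `$s0`, so without forwarding the pipeline would have to stall until the add writes its result to the register file.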

3. Control Hazards

A control hazard arises from the need to make a decision based on the results of one instruction while others are executing.

There are two solutions to that:

· Stall

Operate sequentially: after fetching a branch, hold the following instruction until the branch outcome is known, then continue with the correct instruction. This conservative option certainly works, but it is slow.

· Predict

A simple approach is to always predict that branches will be untaken. When the prediction is right, the pipeline proceeds at full speed. Only when branches are taken does the pipeline stall.
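The difference between the two options can be sketched by counting lost cycles. The one-cycle penalty and the list of branch outcomes are hypothetical values for illustration; real penalties depend on the pipeline design.

```python
def stall_cycles_always_stall(branch_outcomes: list, penalty: int = 1) -> int:
    """Stalling on every branch costs the penalty regardless of the outcome."""
    return len(branch_outcomes) * penalty


def stall_cycles_predict_untaken(branch_outcomes: list, penalty: int = 1) -> int:
    """Predicting untaken costs the penalty only when the branch is actually taken."""
    return sum(penalty for taken in branch_outcomes if taken)


# Hypothetical run of five branches: True means the branch was taken.
outcomes = [True, False, False, True, False]

lost_when_stalling = stall_cycles_always_stall(outcomes)       # 5 cycles
lost_when_predicting = stall_cycles_predict_untaken(outcomes)  # 2 cycles
```

Prediction never does worse than stalling in this model, and it does better whenever some branches are untaken.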


Thursday, December 12, 2013
