Lookahead Adders
Joel D. Wigton, Member, IEEE, Brian M. Werst
Abstract: This paper presents a method for performing equality detection on both input sources and zero detection on one of
the input sources of an existing adder, which allows reduction of circuitry. The method takes advantage of existing circuitry in
any carry look-ahead adder (CLA), and may add only a few gates off the critical path to support equality detection. Typical
implementations of equality and zero detection require dedicated NOR or XNOR/AND trees operating in parallel to the adder.
The presented method is desirable in that we can implement these detections with negligible impact on area, power, and timing
while eliminating the aforementioned NOR and AND trees completely.
Index Terms: Branch Predicate Logic, Control Design, Carry Lookahead Adders, Computational Logic, Equality Detection,
Zero Detection
INTRODUCTION
Most instruction set architectures provide instructions for detecting equality on two register entries
as well as detecting when a register entry equals
zero. These are used as qualifiers for branch instructions,
whether done directly on a Reduced Instruction Set
Computer (RISC) machine or indirectly in an architecture
which supports instruction predication.
In this paper, we outline a simple method to perform
zero detection or equality detection on hardware that
exists in nearly all modern processors: the carry
lookahead adder (CLA) [1]. The methods presented here
differ from other methods in that they do not
require large additional tree circuitry, and they rely
on fundamental principles of binary arithmetic.
BACKGROUND
A typical RISC architecture [2] might define:
beq $rs, $rt, LABEL;
# branch if $rs == $rt
and
beqz $rs, LABEL;
# branch if $rs == 0
PRINCIPLE OF OPERATION
a. Equality Detection
Consider the n-bit CLA with input sources A
(src0) and B (src1). We refer to input source
An:A0 simply as A, and similarly for B. We can detect whether the
input sources are equal using the following two steps:
1.
2.
1.
2.
3.
  100101 (A)
+ 011010 (~B)
  -------
  111111 (P)
Fig. 1. Equal case. Adder inputs and bit propagate signals shown.
Propagate (Pn:0)   Carry Out      State
P = 0              carryout = 0   Not Equal
P = 0              carryout = 1   Not Equal
P = 1              carryout = 0   Equal
P = 1              carryout = 1   Not Equal
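The equality check described in this section can be modelled in a few lines of Python. This is a behavioural sketch rather than the authors' circuit: the function name `is_equal_via_adder`, the `width` parameter, and the use of integer addition in place of a real carry tree are assumptions made for illustration. The per-bit propagate is taken here as A OR ~B, which is also an assumption, since the text does not state which propagate definition the CLA uses.

```python
def is_equal_via_adder(a: int, b: int, width: int = 6) -> bool:
    """Behavioural model: equality from the propagate tree and carry out of A + ~B."""
    mask = (1 << width) - 1
    a &= mask
    not_b = ~b & mask                 # inverted src1
    propagate = a | not_b             # per-bit propagate (OR form, assumed)
    total = a + not_b                 # carryin = 0
    carry_out = (total >> width) & 1
    p_tree = propagate == mask        # AND of all propagate bits
    return p_tree and carry_out == 0  # isequal = ~carryout AND P


# Operands from Fig. 1: A = B = 100101
print(is_equal_via_adder(0b100101, 0b100101))  # True
# Differ in the lowest bit: P is all ones but a carry is generated
print(is_equal_via_adder(0b100101, 0b100100))  # False
```

The second call shows why both conditions matter under the OR form of propagate: every propagate bit is 1, yet the generated carry ripples to carryout and the inputs are reported as not equal.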
b. Zero Detection
Zero detection is simpler than equality detection. Using the same CLA, we have defined our iszero instruction
to look for all zeroes on src1 (B). When we configure our
input controls in a particular way, we shall see that iszero
= carryout, and we get this result without needing a large NOR
tree.
For example:
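One configuration that gives iszero = carryout can be sketched in Python. It is an assumption for illustration and not necessarily the control setting the authors use: src0 is forced to zero, src1 is inverted, and carryin is set to 1, so the adder computes 0 + ~B + 1. The carry out of that sum is 1 only when ~B is all ones, that is, when B is zero. The name `is_zero_via_adder` and the `width` parameter are hypothetical.

```python
def is_zero_via_adder(b: int, width: int = 6) -> bool:
    """Behavioural model: zero detection on src1 read directly from carry out."""
    mask = (1 << width) - 1
    not_b = ~b & mask                 # inverted src1
    total = 0 + not_b + 1             # src0 forced to 0, carryin = 1 (assumed controls)
    carry_out = (total >> width) & 1
    return carry_out == 1             # iszero = carryout


print(is_zero_via_adder(0b000000))  # True
print(is_zero_via_adder(0b000100))  # False
```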
PROOF OF OPERATION
The isequal operation can be proven by straightforward
logical deduction. First we examine the case when the
two input sources are equal. Take A, ~B, and carryin = 0
as inputs to the CLA. If A = B, then the generate, Gi, at every
bit position is 0 and the propagate, Pi, at every bit position is 1, and thus the propagate tree P0·P1·...·Pn = 1.
With carryin = 0 and no carry generated, since Gi
is 0 at every bit position, the carry out is 0. The
propagate tree P = 1 and carryout = 0 are the two conditions
from the table above that are needed to detect equality.
  100101 (A)
+ 011010 (~B)
  -------
  111111 (P)

  100101 (A)
+ 011010 (~B)
  -------
  000000 (G)
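The equal-case argument can be checked bit by bit with the standard carry recurrence c[i+1] = G[i] OR (P[i] AND c[i]). The sketch below is an illustration under assumptions: the helper name `carry_chain` is hypothetical, the propagate is taken in its OR form, and a simple ripple loop stands in for the lookahead tree, which computes the same carries.

```python
def carry_chain(a: int, b_inv: int, width: int, carry_in: int = 0):
    """Return (generate, propagate, carry_out) for the inputs a and b_inv."""
    g = a & b_inv                     # per-bit generate
    p = a | b_inv                     # per-bit propagate (OR form, assumed)
    c = carry_in
    for i in range(width):
        g_i = (g >> i) & 1
        p_i = (p >> i) & 1
        c = g_i | (p_i & c)           # c[i+1] = G[i] OR (P[i] AND c[i])
    return g, p, c


WIDTH = 6
MASK = (1 << WIDTH) - 1

# Equal inputs: feed A and ~A, and confirm the three facts used in the proof.
for a in range(1 << WIDTH):
    g, p, c_out = carry_chain(a, ~a & MASK, WIDTH)
    assert g == 0 and p == MASK and c_out == 0

print("equal inputs: G = 0, P all ones, carryout = 0 for every 6-bit A")
```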
Fig. 5. State of the art for equality detection, XNOR then AND tree
speedpath for 64 bits
In contrast, our method allows almost all of the circuitry we need to come from the existing CLA.
For a CLA implementation using conditional-sum outputs, the only propagate terms not already included in
the naturally occurring AND tree are the high-order bits
above the last conditional-sum carry. Depending on how
the carry tree was constructed, some additional ANDing may be required to complete the propagate tree.
These bits can be ANDed separately and off the critical
path. The number of gates depends on the size of the
conditional-sum blocks, but in general only a few
extra gates are needed. Finally, the isequal = ~carryout · Pn:0 logic can
be generated using a single two-input gate for an addi-
Fig. 6. State of the art for zero detection, NOR tree speedpath for 64
bits
CONCLUSION
The method presented in this paper allows us to use
less area, generate less leakage power, and reduce CLA
input loading, which leads to a more efficient overall design that does not compromise timing. In addition, we
get these benefits practically for free, by configuring
control logic properly and adding only a handful of gates.
Operation   Gates (trees)   Gates (CLA)
isequal     2n - 1          ~6
iszero      n               None extra
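The tree gate counts in the table can be reproduced with a short calculation. The counting model below is an assumption: it uses two-input gates only, with n XNORs feeding an (n - 1)-gate AND reduction for equality, and an (n - 1)-gate OR reduction plus one final inversion for zero detection. The function names are hypothetical.

```python
def equality_tree_gates(n: int) -> int:
    """n two-input XNORs plus an (n - 1)-gate two-input AND reduction."""
    return n + (n - 1)


def zero_tree_gates(n: int) -> int:
    """(n - 1) two-input ORs plus one inversion, counted as n gates (assumed model)."""
    return (n - 1) + 1


for n in (32, 64):
    print(n, equality_tree_gates(n), zero_tree_gates(n))
# 32 63 32
# 64 127 64
```

For a 64-bit datapath this model gives 127 dedicated gates for equality and 64 for zero detection, against the handful of extra gates listed in the CLA column.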
REFERENCES
[1]
[2]
[3]
[4]