B441/E315, Fall 2018                                                        Version 2018.3

# Overview

In this lab you will design and implement a 4-bit carry-lookahead adder.  You will then combine two of these in ripple-carry fashion to create an 8-bit hybrid adder.

# Background

Adders are one of the fundamental building blocks of processor designs.  However, straightforward adder designs are not capable of high clock frequencies due to their chained latency. Therefore, we will need to explore more parallel adder implementations.

The half adder is the most basic building block of an adder circuit.  It is capable of adding two 1-bit numbers together to form both a sum (S) and carry (C) bit.  The schematic and truth table are shown below:

Here S is the sum of the 1-bit addition between A and B.  As two one bit numbers can produce a 2-bit output (ie 1 + 1 = 10 in binary), C is used to capture the “carry” or high-order bit (1), while S captures the low-order bit (0).  Carry bits get their name because of how the additional bit is “carried” to the next digit.  One example of multi-bit binary addition with carries is shown below.

Half adders generate a carry-out value, but do not have a way to handle carry-in input.  Therefore, we cannot string multiple half-adders together to build a multi-bit adder.  The Full Adder is an adaptation of the half adder that allows an additional carry-in value.  It is capable of adding 2 1-bit numbers plus a carry-in together to form a sum and a carry out.  The truth table and two possible implementations (logically equivalent) are shown below.

Note that a full adder can be built with two half adders (shown as grey boxes) plus an OR gate.

But it doesn’t have to be.  Another correct implementation looks like this:

There are several other (logically equivalent) full adder circuits.

Whatever full adder implementation you choose, we can chain multiple full adders together to perform multi-bit addition.  As an example, if we chain 4 full adders together, we can create a 4-bit adder:

Here A and B are 4-bit input values.  We first add A[0] and B[0], assuming a carry in of 0, in the right-most full adder.  We then pipe the carry-out of that full adder as the carry-in of the next adder, which adds A[1] and B[1].  This process repeats until all full adders complete.  Each one generates one sum bit that are combined to form the full 4-bit sum.  The last carry-out is used to indicate an overflow condition, ie in binary 1111 + 1111 = 11110.

The problem with ripple-carry adders is that each full adder must wait until the previous adder has generated a carry-out bit.  For small adders this delay is minor.  However, for 64-bit adders, this chain of carry-in dependencies can cause slowdowns, ie a 64-bit ripple-carry adder can’t run at 3 GHz. Carry-lookahead adders break this dependence chain by “looking-ahead” or pre-computing the carry bits.  They first compute all the carry bits, then run each full adder independently in parallel.  Conceptually, it would work like this:

However, this turns out to be easier in practice if we split apart the full adder and build a slightly different basic circuit, called a Carry Lookahead Cell (CLC):

(in other words, replace the use of a full adder with a CLC!!!)

Here we have 3 outputs:

• s: or “sum”.  This is the same as the original Full Adder example.
• g: or “generate”.  This is true if we “generate” a new carry output
• p: or “propagate”.  This is true if we “propagate” an existing carry bit through.

Generation occurs when both a and b are 1.  Thus, in binary, 1 + 1 + cin = 10 or 11 depending on if cin == 1 or cin == 0.  Either value of cin generates a carry out of 1.  Thus the addition “generates” a carry out.

Propagation occurs when either a or b is 1 (but not both).  Thus, in binary, 1 + 0 + cin = 10 or 01 depending if cin == 1 or cin == 0.  Thus, the carry out is either 1 when cin == 1, or 0 when cin == 0.  Thus the carry bit is said to “propagate” or pass through.

Then we can create a multi-bit adder as follows.  Note this diagram flows from left to right, rather than the previous right to left.

Now each CLC block computes a single sum bit, and also computes a generate and propagate bit.  Now all we need to do is implement the “Carry Generation Logic” block at the bottom.

With some math, it is possible to show that the carry out for a given bit is equivalent to:

Or, this says the carry in bit for the next CLC block is equal to the “generation” bit OR’ed with the “propagation” bit then AND’ed with the current CLC block’s carry in.  Or it’s equivalent to generated carries OR’ed with any propagated carries.  Thus we can express each carry bit as a combination of these three signals:

However, here we still have the same dependence issue for the carry bits as the ripple carry adder.  HOWEVER, we note that C2 is a product of C1, C3 is a product of C2, etc.  Thus by substitution, we can rewrite everything in terms of only C0 as follows:

Now nothing depends on a series of carry bits (Cs) to be calculated, but instead relies only on c0 (carry-in) and propagation and generation bits.  Thus all the carry bits can be computed in parallel.  Hooray!  Notice that each successive carry bit essentially recomputes the previous carry-bits in parallel.  This requires more logic gates, but allows for faster operations.

## Verilog Arrays

One other final piece of information for this lab is the use of Verilog arrays.  Just like C, verilog lets you bundle multiple signals together to form an array.  To express an array of signals in verilog, it looks something like this:

ValueType [mostSignificantBit:leastSignificantBit] ArrayName;

For example, an 16-bit array of signals named x would be written like this:

wire[15:0] x;

You can use the standard array indices notation to access an individual signal, or use a ‘:’ to access a range of signals:

x[2]   // access wire 2 _   ____

x[5:2] //access wires 5 through 2

To group multiple signals into a single array, use the concatenation operators ‘{‘ and ‘}’. ( Here we are assuming y and z are already defined as wires.)

x[5:2]= {1,0,y,z};

More verilog array operations are available at:

http://asicguru.com/verilog/tutorial/operators/57/

# Assignment Description

The first part of this assignment, create a Verilog file named FourBitCLAdder.v which defines a module as follows:

input                 c_in,

input        [3:0]        a,

input        [3:0]        b,

output        [3:0]        sum,

output                 c_out

);

You task is to generate the correct sum and c_out values given a, b, and c_in.

NOTE: For this file, you are not allowed to use the ascii letter ‘+’ anywhere in this file.

Your next task is to create a Verilog file named top.v defined as follows:

module top(

input                 c_in,

input        [7:0]        a,

input        [7:0]        b,

output        [7:0]        sum,

output         c_out,

);

This module should instantiate two FourBitCLAdder submodules and route the low-order 4 bits of the input signals to the first adder, the high-order 4 bits to the second adder, and connect the carry between them in a ripple fashion.  Thus you will have build a hybrid adder consisting of 4 carry-lookahead sections connected through a ripple carry bit.

NOTE: For this file, you are not allowed to use the ascii letter ‘+’ anywhere in this file.

## Testbenches

For this project, you will need to create two testbenches, one for a 4-bit carry-lookahead adder, and one for the 8-bit adder.

The first testbench should be named FourBitCLAdder_tb.v and should test only the 4-bit carry-lookahead adder.  For this file, you are allowed to use the built-in addition operator (‘+’).

The second testbench should be named top_tb.v and should test your 8-bit hybrid adder. For this file, you are allowed to use the built-in addition operator (‘+’).

Remember to select “System Verilog” from the “File Type” drop-down menu.

## Constraints

You will also need to reconfigure your constraints file so use the following assignments of switches and LEDs:

Verilog Name                Pin Name                PACKAGE_PIN

a[0]                      sw[0]                        V17

a[1]                      sw[1]                        V16

a[2]                   sw[2]                        W16

a[3]                   sw[3]                        W17

a[4]                   sw[4]                        V15

a[5]                   sw[5]                        W15

a[6]                   sw[6]                        W14

a[7]                   sw[7]                        W13

b[0]                   sw[8]                        V2

b[1]                   sw[9]                        T3

b[2]                   sw[10]                T2

b[3]                   sw[11]                R3

b[4]                   sw[12]                W2

b[5]                   sw[13]                U1

b[6]                   sw[14]                T1

b[7]                   sw[15]                R2

c_in                        btnC                        U18

sum[0]                         LED[0]                U16

sum[1]                    LED[1]                E19

sum[2]                     LED[2]                U19

sum[3]                     LED[3]                V19

sum[4]                     LED[4]                W18

sum[5]                    LED[5]                U15

sum[6]                     LED[6]                U14

sum[7]                     LED[7]                V14

c_out                        LED[15]                L1

# Evaluation

The evaluation will have two steps, first submission of your source code and testbench to the autograder.  Second, you will need to synthesize your design, download it to the FPGa and do a demonstration for the TA.

Log on to https://autograder.sice.indiana.edu and submit your code as per Project 1.