
INDIAN INSTITUTE OF TECHNOLOGY

KHARAGPUR
Department of Electronics and Electrical Communication
2012-2013

VLSI CAD LAB Report


On

Synthesis of Discrete Cosine Transformer using Xilinx

Submitted to: Prof. Swapana Banerjee
Prof. Shantanu Chattopadhyay

Submitted by: Spandana Vaidyula (09EC3214)


Group 3

Objective: Synthesize and fit a Discrete Cosine Transform (DCT) algorithm to an FPGA using Xilinx.
Theory:
A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a
sum of cosine functions oscillating at different frequencies. Like the discrete Fourier
transform (DFT), a DCT operates on a function at a finite number of discrete data points. The
difference between a DCT and a DFT is that the former uses only cosine functions, while the
latter uses both cosines and sines (in the form of complex exponentials).
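The discrete cosine transform of a data sequence {x(n), n = 0, 1, . . . , N-1} is defined by

$$y(m) = \frac{2}{N}\, k_m \sum_{n=0}^{N-1} x(n) \cos\left\{ \frac{(2n+1)\, m\, \pi}{2N} \right\}; \qquad m = 0, \ldots, N-1$$

where k_m = 1/\sqrt{2} for m = 0, and k_m = 1 for m = 1, 2, . . . , N-1.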
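The definition above can be evaluated directly in software as a reference. The following is a minimal Python sketch, assuming the form y(m) = (2/N) k_m Σ x(n) cos((2n+1)mπ/2N) with k_0 = 1/√2; the function name `dct_direct` is hypothetical and not part of the report.

```python
import math

def dct_direct(x):
    """Direct O(N^2) evaluation of the DCT definition (assumed scaling 2/N * k_m)."""
    n_points = len(x)
    out = []
    for m in range(n_points):
        k_m = 1 / math.sqrt(2) if m == 0 else 1.0   # scale factor k_m
        acc = sum(x[n] * math.cos((2 * n + 1) * m * math.pi / (2 * n_points))
                  for n in range(n_points))
        out.append(2.0 / n_points * k_m * acc)
    return out

if __name__ == "__main__":
    # A constant sequence should have energy only in the m = 0 term.
    print([round(v, 4) for v in dct_direct([1.0] * 8)])
```

A direct evaluation like this needs N multiplications per output, which is the cost the fast algorithm used in this report is meant to reduce.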

Many different approaches can be used to map this DCT into hardware, including:
1. Fast-algorithm-based design
2. Systolic-array-based design using matrix-vector multiplication
3. Distributed arithmetic
4. ROM-based designs
We used the fast-algorithm-based design to map the DCT into hardware. An 8-point DCT can be
implemented using the following structure:

where a circle denotes addition/subtraction and an arrow denotes multiplication.

Mapping Algorithm to Hardware:

The above structure was synthesized as follows:

1. A butterfly module with three inputs and two outputs was implemented first.
2. The butterfly module was then instantiated repeatedly in the main DCT module, since
the full structure above can be divided into butterflies with different inputs and
outputs.
The inputs and outputs are 16-bit signed numbers. To represent real numbers, the 8 least
significant bits (bits -8 to -1) hold the fractional part, the next 7 bits towards the MSB
(bits 0 to 6) hold the integer part, and the MSB (bit 7) is the sign bit.
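The number format described above can be illustrated with a small conversion helper. This is a sketch only: it assumes two's-complement encoding (which is what Verilog's `signed` type uses), and the names `to_q8_8` and `from_q8_8` are hypothetical.

```python
FRAC_BITS = 8    # bits -8..-1 hold the fraction
WIDTH = 16       # total word length, two's complement

def to_q8_8(value):
    """Quantize a real number to a 16-bit word, returned as a bit string."""
    raw = int(round(value * (1 << FRAC_BITS)))
    if not -(1 << (WIDTH - 1)) <= raw < (1 << (WIDTH - 1)):
        raise OverflowError("value does not fit in a 16-bit word")
    return format(raw & 0xFFFF, "016b")

def from_q8_8(bits):
    """Interpret a 16-character bit string as a signed fixed-point number."""
    raw = int(bits, 2)
    if raw & 0x8000:          # sign bit (bit no. 7) set: negative value
        raw -= 1 << WIDTH
    return raw / (1 << FRAC_BITS)

if __name__ == "__main__":
    print(to_q8_8(3.0))                     # 0000001100000000
    print(from_q8_8("0000000010000101"))    # 0.51953125
    print(from_q8_8("1111110101110000"))    # -2.5625
```

With 8 fractional bits the resolution is 1/256, so every quantized value is within about 0.002 of the real number it represents.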
The design was to be fitted to the Spartan-3E FPGA kit, which limits the total number of
input and output pins to 232. An 8-point DCT with 16-bit inputs and outputs requires 256
pins, which cannot be fitted in this kit. We therefore removed the inputs from the main
module, so that only 128 output pins are required. The input was hard-coded in the main
module, so no input stimulus is needed in the test bench for simulation.
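The pin budget above is simple arithmetic and can be checked as follows. This is an illustrative sketch; the constant names are hypothetical, and the limit of 232 pins is the figure quoted in the text.

```python
POINTS = 8          # 8-point DCT
WORD_BITS = 16      # each sample is a 16-bit word
PIN_LIMIT = 232     # I/O pin limit quoted in the text

pins_with_inputs = 2 * POINTS * WORD_BITS   # eight inputs plus eight outputs
pins_outputs_only = POINTS * WORD_BITS      # inputs hard-coded, outputs only

if __name__ == "__main__":
    print(pins_with_inputs, pins_with_inputs <= PIN_LIMIT)     # 256 False
    print(pins_outputs_only, pins_outputs_only <= PIN_LIMIT)   # 128 True
    print(round(100 * pins_outputs_only / PIN_LIMIT))          # 55
```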
The butterfly module code is given below:
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date: 15:36:09 01/22/2013
// Design Name:
// Module Name: butterfly
// Project Name:
// Target Devices:
// Tool versions:
// Description:
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
//////////////////////////////////////////////////////////////////////////////////
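module butterfly(
input signed [7:-8] a,    // first operand
input signed [7:-8] b,    // second operand
input signed [7:-8] c,    // multiplication factor
output signed [7:-8] p,   // p = a + b
output signed [7:-8] q    // q = (a - b) * c, truncated to 16 bits
);
wire signed [15:-16] m;   // full 32-bit product
wire signed [7:-8] n;     // difference a - b
assign p=a+b;
assign n=a-b;
assign m=n*c;
assign q[7]=m[15];        // keep the sign bit of the product
assign q[6:-8]=m[6:-8];   // keep 7 integer bits and 8 fraction bits
endmodule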
The butterfly has three inputs: two are the numbers to be operated on, and the third is a
multiplication factor that differs from one butterfly to another. The 32-bit product is
truncated to 16 bits by removing the 8 least significant bits and the 8 bits just below
the sign bit, so that the sign bit of the product is preserved as the MSB of the result.
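The truncation described above can be modelled at bit level in software. The sketch below is an assumed Python model of the butterfly operating on raw 16-bit integers (real value times 256); the helper names are hypothetical and the model is only meant to mirror the bit selection `q[7] = m[15]`, `q[6:-8] = m[6:-8]`.

```python
def to_signed(value, bits):
    """Wrap an integer to a signed two's-complement value of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def butterfly(a, b, c):
    """Model of the butterfly on raw 16-bit fixed-point integers; returns (p, q)."""
    p = to_signed(a + b, 16)              # sum, wrapped to 16 bits
    n = to_signed(a - b, 16)              # difference, wrapped to 16 bits
    m = (n * c) & 0xFFFFFFFF              # 32-bit product with 16 fraction bits
    sign = (m >> 31) & 1                  # m[15]: sign bit of the product
    body = (m >> 8) & 0x7FFF              # m[6:-8]: 7 integer + 8 fraction bits
    q = to_signed((sign << 15) | body, 16)
    return p, q

if __name__ == "__main__":
    # a = 3.0, b = 1.0, c = -2.5625  ->  p = 4.0, q = -5.125
    p, q = butterfly(768, 256, -656)
    print(p / 256, q / 256)
```

Selecting bits of the product in this way rounds towards minus infinity, so each butterfly can contribute an error of up to one least significant bit (1/256).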
The DCT module was written structurally; it is given below:
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date: 14:39:37 01/29/2013
// Design Name:
// Module Name: DCT
// Project Name:
// Target Devices:
// Tool versions:
// Description:
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
//////////////////////////////////////////////////////////////////////////////////
module DCT(
// The eight 16-bit inputs x0..x7 are not ports: they are declared as
// wires and hard-coded below to keep the pin count within the limit.
output signed [7:-8] y0,
// eight 16 bit outputs
output signed [7:-8] y1,
output signed [7:-8] y2,
output signed [7:-8] y3,
output signed [7:-8] y4,
output signed [7:-8] y5,
output signed [7:-8] y6,
output signed [7:-8] y7

);
wire signed [7:-8] a0,a1,a2,a3,a4,a5,a6,a7;
// connecting wires between first and second stage
wire signed [7:-8] b0,b1,b2,b3,b4,b5,b6,b7;
// connecting wires between second and third stage
wire signed [7:-8] d0,d1,d2,d3,d4,d5,d6,d7;
// connecting wires between third and fourth stage
wire signed [7:-8] c01,c02,c04,c05,c09,c10,c13;
// Coefficients for multiplication in the DCT
wire signed [7:-8] x0,x1,x2,x3,x4,x5,x6,x7;
// Inputs are taken as wires
// One set of inputs is hard-coded
assign x0 = 16'b0000000100000000; // Input x0 = 1
assign x1 = 16'b0000001000000000; // Input x1 = 2
assign x2 = 16'b0000001100000000; // Input x2 = 3
assign x3 = 16'b0000000100000000; // Input x3 = 1
assign x4 = 16'b0000001100000000; // Input x4 = 3
assign x5 = 16'b0000001000000000; // Input x5 = 2
assign x6 = 16'b0000000100000000; // Input x6 = 1
assign x7 = 16'b0000000000000000; // Input x7 = 0

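// Multiplication factors are assigned values
// (the comment gives the decimal value of each 16-bit constant)
assign c01=16'b0000000010000101; //  133/256 =  0.5195
assign c02=16'b0000000010001010; //  138/256 =  0.5391
assign c04=16'b0000000010110111; //  183/256 =  0.7148
assign c05=16'b0000000011100110; //  230/256 =  0.8984
assign c09=16'b1111110101110000; // -656/256 = -2.5625
assign c10=16'b1111111010110010; // -334/256 = -1.3047
assign c13=16'b1111111101100111; // -153/256 = -0.5977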

// First Stage
butterfly bf0 (x0,x7,c01,a0,a4);
butterfly bf1 (x2,x5,c05,a1,a5);
butterfly bf2 (x4,x3,c09,a2,a6);
butterfly bf3 (x6,x1,c13,a3,a7);

//Second Stage
butterfly bf00 (a0,a2,c02,b0,b2);
butterfly bf01 (a1,a3,c10,b1,b3);
butterfly bf02 (a4,a6,c02,b4,b6);
butterfly bf03 (a5,a7,c10,b5,b7);

Figure: Second butterfly Stage

// Third Stage
butterfly bf000 (b0,b1,c04,d0,d1);
butterfly bf001 (b2,b3,c04,d2,d3);
butterfly bf002 (b4,b5,c04,d4,d5);
butterfly bf003 (b6,b7,c04,d6,d7);
Figure: Third butterfly Stage

// Fourth Stage
assign y0=d0;
assign y1=(d4+d6+d7);
assign y2=(d2+d3);
assign y3=d5+d6+d7;
assign y4=d1;
assign y5=d5+d7;
assign y6=d3;
assign y7=d7;
endmodule

Figure: Fourth butterfly Stage

The above module produces eight 16-bit outputs, which are the DCT of the hard-coded input signal.
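As a side check on the hard-coded multiplication factors, the sketch below decodes each constant and prints it beside 1/(2 cos(kπ/16)), where k is taken from the digits in the wire name (c01, c02, ...). That the constants were meant to approximate this expression is an assumption drawn from their numerical values; the report itself does not state how they were derived.

```python
import math

# Hard-coded multiplication factors copied from the DCT module.
COEFFS = {
    1:  "0000000010000101",
    2:  "0000000010001010",
    4:  "0000000010110111",
    5:  "0000000011100110",
    9:  "1111110101110000",
    10: "1111111010110010",
    13: "1111111101100111",
}

def decode(bits):
    """Interpret a 16-bit two's-complement string with 8 fraction bits."""
    raw = int(bits, 2)
    if raw & 0x8000:
        raw -= 1 << 16
    return raw / 256.0

def candidate(k):
    """Assumed closed form (not stated in the report): 1 / (2 cos(k*pi/16))."""
    return 1.0 / (2.0 * math.cos(k * math.pi / 16))

if __name__ == "__main__":
    for k, bits in COEFFS.items():
        print(f"c{k:02d}  stored={decode(bits):+.5f}  "
              f"1/(2cos({k}pi/16))={candidate(k):+.5f}")
```

Under this assumption every stored constant lies within about 0.01 of the closed-form value.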
Test Bench:

The test bench DCT_test for the above code is given below. The outputs of the test bench are
shown in the results.
`timescale 1ns / 1ps
////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date: 19:05:30 03/30/2013
// Design Name: DCT
// Module Name: C:/Users/Jay/Documents/Xilinx/DCT_01_old/DCT_test01.v
// Project Name: DCT_01
// Target Device:
// Tool versions:
// Description:
//
// Verilog Test Fixture created by ISE for module: DCT
//
// Dependencies:
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
////////////////////////////////////////////////////////////////////////////////
module DCT_test01;
// Outputs
wire [7:-8] y0;
wire [7:-8] y1;
wire [7:-8] y2;
wire [7:-8] y3;
wire [7:-8] y4;
wire [7:-8] y5;
wire [7:-8] y6;
wire [7:-8] y7;
// Instantiate the Unit Under Test (UUT)
DCT uut (
.y0(y0),
.y1(y1),
.y2(y2),
.y3(y3),
.y4(y4),
.y5(y5),
.y6(y6),
.y7(y7)
);
initial begin
// No stimulus is required: the inputs are hard-coded inside the DCT module.
end
endmodule
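Because the test bench applies no stimulus, the simulated outputs depend only on the constants hard-coded in the DCT module, so they can be predicted with a bit-level software model. The sketch below is an assumed Python model (all function names are hypothetical) that follows the same butterfly wiring and truncation as the Verilog listing; for the hard-coded input it yields the same eight 16-bit words as the Xilinx binary outputs listed in the Results section.

```python
def to_signed(value, bits=16):
    """Wrap an integer to a signed two's-complement value of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def butterfly(a, b, c):
    """p = a + b; q = sign bit plus bits 6..-8 of the product (a - b) * c."""
    p = to_signed(a + b)
    m = (to_signed(a - b) * c) & 0xFFFFFFFF
    q = to_signed((((m >> 31) & 1) << 15) | ((m >> 8) & 0x7FFF))
    return p, q

def word(bits):
    """Decode a 16-character bit string into a signed raw integer."""
    return to_signed(int(bits, 2))

# Constants copied from the DCT module.
C01 = word("0000000010000101")
C02 = word("0000000010001010")
C04 = word("0000000010110111")
C05 = word("0000000011100110")
C09 = word("1111110101110000")
C10 = word("1111111010110010")
C13 = word("1111111101100111")
X = [256 * v for v in (1, 2, 3, 1, 3, 2, 1, 0)]   # hard-coded inputs x0..x7

def dct_model(x):
    x0, x1, x2, x3, x4, x5, x6, x7 = x
    # First stage
    a0, a4 = butterfly(x0, x7, C01)
    a1, a5 = butterfly(x2, x5, C05)
    a2, a6 = butterfly(x4, x3, C09)
    a3, a7 = butterfly(x6, x1, C13)
    # Second stage
    b0, b2 = butterfly(a0, a2, C02)
    b1, b3 = butterfly(a1, a3, C10)
    b4, b6 = butterfly(a4, a6, C02)
    b5, b7 = butterfly(a5, a7, C10)
    # Third stage
    d0, d1 = butterfly(b0, b1, C04)
    d2, d3 = butterfly(b2, b3, C04)
    d4, d5 = butterfly(b4, b5, C04)
    d6, d7 = butterfly(b6, b7, C04)
    # Fourth stage: output additions, wrapped to 16 bits
    y = (d0, d4 + d6 + d7, d2 + d3, d5 + d6 + d7, d1, d5 + d7, d3, d7)
    return [to_signed(v) for v in y]

if __name__ == "__main__":
    for i, y in enumerate(dct_model(X)):
        print(f"y{i} = {y & 0xFFFF:016b} = {y / 256:+.3f}")
```

A model of this kind could also be used to try other input vectors before changing the hard-coded values and re-running synthesis.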

Results:

Simulation was done using the ISim simulator and the results were compared with MATLAB. The
percentage deviation of the Xilinx results from the MATLAB results was calculated. The results
are tabulated below:
        Input (x)                     Xilinx Output (y)             MATLAB    Percentage
Index   Binary             Decimal    Binary             Decimal    Output    Error (%)
0       0000000100000000   1          0000110100000000   13         13        0
1       0000001000000000   2          0000000111111101   1.988      1.977     0.556
2       0000001100000000   3          1111110001111011   -3.519     -3.537    0.508
3       0000000100000000   1          0000000010111100   0.734      0.766     -4.177
4       0000001100000000   3          1111110111011011   -2.144     -2.121    -1.080
5       0000001000000000   2          1111111000010111   -1.910     -1.893    0.898
6       0000000100000000   1          0000000010110101   0.707      0.699     1.144
7       0000000000000000   0          0000001001110100   2.453      2.432     0.863
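The comparison in the table can be re-derived in software. The sketch below is illustrative only: it computes the unscaled cosine sum Σ x(n) cos((2n+1)mπ/16), which agrees with the MATLAB column above to about three decimal places (an observation from the numbers; the report does not state which scaling was used), and a deviation defined here as (hardware - reference) / reference, whose sign may differ from the table's convention. The raw hardware words are the Xilinx binary outputs from the table.

```python
import math

X = [1, 2, 3, 1, 3, 2, 1, 0]                           # hard-coded input samples
HW_RAW = [3328, 509, -901, 188, -549, -489, 181, 628]  # Xilinx outputs, raw words

def cosine_sum(x, m):
    """Unscaled sum over n of x(n) cos((2n+1) m pi / 2N), no 2/N or k_m factor."""
    n_points = len(x)
    return sum(x[n] * math.cos((2 * n + 1) * m * math.pi / (2 * n_points))
               for n in range(n_points))

def deviation_percent(hw, ref):
    """Assumed definition: signed relative deviation in percent."""
    return 100.0 * (hw - ref) / ref

if __name__ == "__main__":
    for m, raw in enumerate(HW_RAW):
        hw = raw / 256.0
        ref = cosine_sum(X, m)
        print(f"y{m}: hw={hw:+.3f}  ref={ref:+.3f}  "
              f"deviation={deviation_percent(hw, ref):+.3f}%")
```

The largest relative deviation occurs where the reference value is small, as in the fourth row, because the absolute truncation error is then a larger fraction of the result.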

The HDL synthesis report is given below:


Macro Statistics
# Multipliers                : 12
 16x16-bit multiplier        : 12
# Adders/Subtractors         : 30
 16-bit adder                : 18
 16-bit subtractor           : 12
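These counts can be cross-checked against the structure of the DCT module. The sketch below assumes the synthesizer infers one hardware operator for every `+`, `-` and `*` in the source with no sharing; the variable names are hypothetical.

```python
BUTTERFLIES_PER_STAGE = 4
BUTTERFLY_STAGES = 3
# Additions in the fourth stage: y1 needs 2, y2 needs 1, y3 needs 2, y5 needs 1.
OUTPUT_STAGE_ADDS = 2 + 1 + 2 + 1

butterflies = BUTTERFLIES_PER_STAGE * BUTTERFLY_STAGES
multipliers = butterflies                    # one multiplication per butterfly
subtractors = butterflies                    # one a - b per butterfly
adders = butterflies + OUTPUT_STAGE_ADDS     # one a + b per butterfly, plus outputs

if __name__ == "__main__":
    print(multipliers, adders, subtractors, adders + subtractors)   # 12 18 12 30
```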

The design summary is shown below. The design uses only 2% of the available 4-input LUTs
and slices, and 55% (128 pins) of the available pins.

The output window is shown below. There is a maximum delay of about 25 ns for the
third and fifth outputs.

The RTL schematic of the design is shown below; it shows the internal structure of one butterfly
module and the positions of the other butterfly modules.

Conclusion: The DCT algorithm was mapped to hardware using a fast algorithm, and a maximum
delay of 25.4 ns was obtained as shown above. 55% of the total I/O pins and only 2% of the
available LUTs were used. The design is easy to implement and has a low delay.
