Lecture 11: Xilinx FPGA Memories
ECE 448 – FPGA and ASIC Design with VHDL
Required reading
Chapter 11, Xilinx Spartan-3 Specific Memory
Recommended reading
Google search: XAPP463 (Using Block RAM in Spartan-3 Generation FPGAs)
Google search: XAPP464 (Using Look-Up Tables as Distributed RAM in Spartan-3 Generation FPGAs)
Google search: XST User Guide (PDF)
Google search: ISE In-Depth Tutorial
Memory Types
Memory: RAM or ROM
Memory: single port or dual port
Memory: with asynchronous read or with synchronous read
Memory Types - cont'd
Memory: distributed (MLUT-based) or block RAM-based (BRAM-based)
Memory: inferred or instantiated
Instantiated memory: created manually or using CORE Generator
FPGA Distributed Memory
CLB Slice
[Figure: one CLB slice containing two 4-input look-up tables (inputs G4-G1 and F4-F1), two carry and control logic blocks (CIN, COUT), and two storage elements (D, Q, CK, EC, S, R). Slice-level signals: F5IN, BY, SR, CLK, CE; outputs: Y, YB, X, XB.]
Xilinx Multipurpose LUT (MLUT)
16 x 1 ROM (logic)
The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
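Distributed RAM
[Figure: distributed RAM primitives built from LUTs. One LUT implements a RAM16X1S; two LUTs implement a RAM32X1S, a RAM16X2S, or a RAM16X1D.]
RAM16X1S: inputs D, WE, WCLK, A0-A3; output O
RAM32X1S: inputs D, WE, WCLK, A0-A4; output O
RAM16X2S: inputs D0, D1, WE, WCLK, A0-A3; outputs O0, O1
RAM16X1D: inputs D, WE, WCLK, A0-A3, DPRA0-DPRA3; outputs SPO, DPO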
FPGA Block RAM
Block RAM
[Figure: Spartan-3 dual-port block RAM with Port A and Port B.]
RAM Blocks and Multipliers in Xilinx FPGAs
The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Spartan-3E Block RAM Amounts
Block RAM can have various configurations (port aspect ratios)
Configuration     Address range    Data width (bits)
16k x 1           0 to 16,383      1
8k x 2            0 to 8,191       2
4k x 4            0 to 4,095       4
2k x (8+1)        0 to 2,047       8+1
1024 x (16+2)     0 to 1,023       16+2
Block RAM Port Aspect Ratios
Single-Port Block RAM
Dual-Port Block RAM
[Figure: dual-port block RAM symbol; bus ranges labeled [pA-1:0] for Port A and [pB-1:0] for Port B.]
Inference vs. Instantiation
Generic Inferred ROM
Distributed ROM with asynchronous read
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;

entity rominfr is
  generic ( w : integer := 12;   -- number of bits per ROM word
            r : integer := 3);   -- 2^r = number of words in ROM
  port (addr : in  std_logic_vector(r-1 downto 0);
        dout : out std_logic_vector(w-1 downto 0));
end rominfr;
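Distributed ROM with asynchronous read - cont'd
architecture behavioral of rominfr is
  type rom_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  -- Positional aggregate over a "downto" range: the first entry is
  -- stored at address 2**r-1 and the last entry at address 0.
  constant ROM_array : rom_type :=
    ("000011000100",
     "010011010010",
     "010011011011",
     "011011000010",
     "000011110001",
     "011111010110",
     "010011010000",
     "111110011111");
begin
  -- Asynchronous read: no clock, dout follows addr combinationally
  dout <= ROM_array(conv_integer(unsigned(addr)));
end behavioral;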
Distributed ROM with asynchronous read - hexadecimal initialization
architecture behavioral of rominfr is
  type rom_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  -- Hexadecimal bit-string literals need the X prefix; a plain "0C4"
  -- would be a 3-character string, not a 12-bit vector.
  constant ROM_array : rom_type :=
    (X"0C4",
     X"4D2",
     X"4DB",
     X"6C2",
     X"0F1",
     X"7D6",
     X"4D0",
     X"F9F");
begin
  dout <= ROM_array(conv_integer(unsigned(addr)));
end behavioral;
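The ROM contents above are listed positionally, so the address of each word depends on the direction of the array range. A minimal sketch of the same ROM with named association, which makes the address of every word explicit, might look as follows. The entity name rominfr_named is hypothetical, and the aggregate assumes the default generics (w = 12, r = 3).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical entity name; same interface as the ROM shown on the slides
entity rominfr_named is
  generic ( w : integer := 12;   -- number of bits per ROM word
            r : integer := 3);   -- 2^r = number of words in ROM
  port (addr : in  std_logic_vector(r-1 downto 0);
        dout : out std_logic_vector(w-1 downto 0));
end rominfr_named;

architecture behavioral of rominfr_named is
  type rom_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  -- Named association: each index is the address of the word.
  -- The values match the positional list, whose first entry lands
  -- at address 7 because the range is "7 downto 0".
  constant ROM_array : rom_type :=
    (7 => X"0C4",
     6 => X"4D2",
     5 => X"4DB",
     4 => X"6C2",
     3 => X"0F1",
     2 => X"7D6",
     1 => X"4D0",
     0 => X"F9F");
begin
  -- Asynchronous read
  dout <= ROM_array(conv_integer(unsigned(addr)));
end behavioral;
```

With this form, reading address 0 returns X"F9F" and reading address 7 returns X"0C4".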
Generic Inferred RAM
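Distributed versus Block RAM Inference
Examples:
1. Distributed single-port RAM with asynchronous read
2. Distributed dual-port RAM with asynchronous read
3. Block RAM with synchronous read (Read-First mode)
More RAM coding examples can be found in the XST Coding Guidelines.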
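Besides the coding style, the choice between distributed and block RAM can be steered with a synthesis attribute. The sketch below assumes XST, which documents a ram_style constraint with values such as "auto", "block", and "distributed"; other synthesis tools use different attribute names. The entity name ram_style_demo and its generics are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical example entity: 16 words x 8 bits, synchronous read
entity ram_style_demo is
  generic ( w : integer := 8;    -- number of bits per RAM word
            r : integer := 4);   -- 2^r = number of words in RAM
  port (clk : in  std_logic;
        we  : in  std_logic;
        a   : in  std_logic_vector(r-1 downto 0);
        di  : in  std_logic_vector(w-1 downto 0);
        do  : out std_logic_vector(w-1 downto 0));
end ram_style_demo;

architecture behavioral of ram_style_demo is
  type ram_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  signal RAM : ram_type;
  -- XST-specific constraint (assumption: XST is the synthesis tool).
  -- Change "distributed" to "block" to request a block RAM instead.
  attribute ram_style : string;
  attribute ram_style of RAM : signal is "distributed";
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (we = '1') then
        RAM(conv_integer(unsigned(a))) <= di;
      end if;
      -- Registered (synchronous) read, so either resource can be used
      do <= RAM(conv_integer(unsigned(a)));
    end if;
  end process;
end behavioral;
```

The synthesis report shows which primitive was actually chosen, so it is worth checking after changing the attribute value.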
Distributed RAM with asynchronous read
Distributed single-port RAM with asynchronous read
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;

entity raminfr is
  generic ( w : integer := 32;   -- number of bits per RAM word
            r : integer := 3);   -- 2^r = number of words in RAM
  port (clk : in  std_logic;
        we  : in  std_logic;
        a   : in  std_logic_vector(r-1 downto 0);
        di  : in  std_logic_vector(w-1 downto 0);
        do  : out std_logic_vector(w-1 downto 0));
end raminfr;
Distributed single-port RAM with asynchronous read - cont'd
architecture behavioral of raminfr is
  type ram_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  signal RAM : ram_type;
begin
  -- Synchronous write on the rising clock edge
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (we = '1') then
        RAM(conv_integer(unsigned(a))) <= di;
      end if;
    end if;
  end process;
  -- Asynchronous read: the output follows the address without a clock
  do <= RAM(conv_integer(unsigned(a)));
end behavioral;
Report from Synthesis
Resource Usage Report for raminfr
Mapping to part: xc3s50pq208-5
Cell usage:
GND 1 use
RAM16X4S 8 uses
I/O ports: 69
I/O primitives: 68
IBUF 36 uses
OBUF 32 uses
BUFGP 1 use
I/O Register bits: 0
Register bits not including I/Os: 0 (0%)
RAM/ROM usage summary
Single Port Rams (RAM16X4S): 8
Global Clock Buffers: 1 of 8 (12%)
Mapping Summary:
Total LUTs: 32 (2%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Logic Distribution:
Number of occupied Slices: 16 out of 768 2%
Number of Slices containing only related logic: 16 out of 16 100%
Number of Slices containing unrelated logic: 0 out of 16 0%
*See NOTES below for an explanation of the effects of unrelated logic
Total Number of 4 input LUTs: 32 out of 1,536 2%
Number used as 16x1 RAMs: 32
Number of bonded IOBs: 69 out of 124 55%
Number of GCLKs: 1 out of 8 12%
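The behavior of the single-port distributed RAM (synchronous write, asynchronous read) can be checked in simulation before synthesis. The following is a minimal self-checking testbench sketch. The design is repeated so the example compiles on its own; the testbench name raminfr_tb, the 10 ns clock period, and the test data are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

entity raminfr is
  generic ( w : integer := 32;
            r : integer := 3);
  port (clk : in  std_logic;
        we  : in  std_logic;
        a   : in  std_logic_vector(r-1 downto 0);
        di  : in  std_logic_vector(w-1 downto 0);
        do  : out std_logic_vector(w-1 downto 0));
end raminfr;

architecture behavioral of raminfr is
  type ram_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  signal RAM : ram_type;
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (we = '1') then
        RAM(conv_integer(unsigned(a))) <= di;
      end if;
    end if;
  end process;
  do <= RAM(conv_integer(unsigned(a)));
end behavioral;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench (not part of the original slides)
entity raminfr_tb is
end raminfr_tb;

architecture sim of raminfr_tb is
  signal clk  : std_logic := '0';
  signal we   : std_logic := '0';
  signal a    : std_logic_vector(2 downto 0)  := "000";
  signal di   : std_logic_vector(31 downto 0) := (others => '0');
  signal do   : std_logic_vector(31 downto 0);
  signal done : boolean := false;
begin
  -- 10 ns clock that stops toggling once the test is done
  clk <= not clk after 5 ns when not done else '0';

  uut : entity work.raminfr
    generic map (w => 32, r => 3)
    port map (clk => clk, we => we, a => a, di => di, do => do);

  stimulus : process
  begin
    -- Write to address 2; the write takes effect on the rising edge
    a <= "010"; di <= X"DEADBEEF"; we <= '1';
    wait until rising_edge(clk);
    wait for 1 ns;
    assert do = X"DEADBEEF"
      report "read-back of address 2 failed" severity error;

    -- Write to address 5
    a <= "101"; di <= X"12345678";
    wait until rising_edge(clk);
    wait for 1 ns;
    we <= '0';

    -- Asynchronous read: only the address changes, no clock edge occurs
    a <= "010";
    wait for 1 ns;
    assert do = X"DEADBEEF"
      report "asynchronous read of address 2 failed" severity error;
    a <= "101";
    wait for 1 ns;
    assert do = X"12345678"
      report "asynchronous read of address 5 failed" severity error;

    report "raminfr_tb finished" severity note;
    done <= true;
    wait;
  end process;
end sim;
```

The last two checks happen within 2 ns of each other, well inside one half clock period, which is what distinguishes an asynchronous read from a registered one.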
Distributed dual-port RAM with asynchronous read
Distributed dual-port RAM with asynchronous read
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.std_logic_arith.all;

entity raminfr is
  generic ( w : integer := 32;   -- number of bits per RAM word
            r : integer := 3);   -- 2^r = number of words in RAM
  port (clk  : in  std_logic;
        we   : in  std_logic;
        a    : in  std_logic_vector(r-1 downto 0);
        dpra : in  std_logic_vector(r-1 downto 0);
        di   : in  std_logic_vector(w-1 downto 0);
        spo  : out std_logic_vector(w-1 downto 0);
        dpo  : out std_logic_vector(w-1 downto 0));
end raminfr;
Distributed dual-port RAM with asynchronous read - cont'd
architecture syn of raminfr is
  type ram_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  signal RAM : ram_type;
begin
  -- Single write port, addressed by a
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (we = '1') then
        RAM(conv_integer(unsigned(a))) <= di;
      end if;
    end if;
  end process;
  -- Two asynchronous read ports: spo reads at a, dpo reads at dpra
  spo <= RAM(conv_integer(unsigned(a)));
  dpo <= RAM(conv_integer(unsigned(dpra)));
end syn;
Report from Synthesis
Resource Usage Report for raminfr
Mapping to part: xc3s50pq208-5
Cell usage:
GND 1 use
I/O ports: 104
I/O primitives: 103
IBUF 39 uses
OBUF 64 uses
BUFGP 1 use
I/O Register bits: 0
Register bits not including I/Os: 0 (0%)
RAM/ROM usage summary
Dual Port Rams (RAM16X1D): 32
Global Clock Buffers: 1 of 8 (12%)
Mapping Summary:
Total LUTs: 64 (4%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Logic Distribution:
Number of occupied Slices: 32 out of 768 4%
Number of Slices containing only related logic: 32 out of 32 100%
Number of Slices containing unrelated logic: 0 out of 32 0%
*See NOTES below for an explanation of the effects of unrelated logic
Total Number of 4 input LUTs: 64 out of 1,536 4%
Number used for Dual Port RAMs: 64
(Two LUTs used per Dual Port RAM)
Number of bonded IOBs: 104 out of 124 83%
Number of GCLKs: 1 out of 8 12%
Block RAM with synchronous read in Read-First Mode
[Figure: block diagram with blocks labeled RAM and Register.]
Block RAM Waveforms – READ_FIRST mode
Block RAM with synchronous read in Read-First Mode
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;

entity raminfr is
  generic ( w : integer := 32;   -- number of bits per RAM word
            r : integer := 9);   -- 2^r = number of words in RAM
  port (clk  : in  std_logic;
        we   : in  std_logic;
        en   : in  std_logic;
        addr : in  std_logic_vector(r-1 downto 0);
        di   : in  std_logic_vector(w-1 downto 0);
        do   : out std_logic_vector(w-1 downto 0));
end raminfr;
Block RAM with synchronous read in Read-First Mode - cont'd
architecture behavioral of raminfr is
  type ram_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  signal RAM : ram_type;
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (en = '1') then
        -- Read first: do receives the old contents of the addressed
        -- word, even when the same word is written on this edge
        do <= RAM(conv_integer(unsigned(addr)));
        if (we = '1') then
          RAM(conv_integer(unsigned(addr))) <= di;
        end if;
      end if;
    end if;
  end process;
end behavioral;
Report from Synthesis
Resource Usage Report for raminfr
Mapping to part: xc3s50pq208-5
Cell usage:
GND 1 use
RAMB16_S36 1 use
VCC 1 use
I/O ports: 69
I/O primitives: 68
IBUF 36 uses
OBUF 32 uses
BUFGP 1 use
I/O Register bits: 0
Register bits not including I/Os: 0 (0%)
RAM/ROM usage summary
Block Rams : 1 of 4 (25%)
Global Clock Buffers: 1 of 8 (12%)
Mapping Summary:
Total LUTs: 0 (0%)
Report from Implementation
Design Summary:
Number of errors: 0
Number of warnings: 0
Logic Utilization:
Logic Distribution:
Number of Slices containing only related logic: 0 out of 0 0%
Number of Slices containing unrelated logic: 0 out of 0 0%
*See NOTES below for an explanation of the effects of unrelated logic
Number of bonded IOBs: 69 out of 124 55%
Number of Block RAMs: 1 out of 4 25%
Number of GCLKs: 1 out of 8 12%
Block RAM Waveforms – WRITE_FIRST mode
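For comparison with the Read-First description, a write-first block RAM can be described so that the data being written also appears on the output in the same clock cycle. The sketch below follows the commonly used XST-style coding pattern; the entity name raminfr_wf is hypothetical, and whether a given tool maps it to a block RAM in WRITE_FIRST mode should be confirmed in the synthesis report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical entity name; same interface as the Read-First example
entity raminfr_wf is
  generic ( w : integer := 32;   -- number of bits per RAM word
            r : integer := 9);   -- 2^r = number of words in RAM
  port (clk  : in  std_logic;
        we   : in  std_logic;
        en   : in  std_logic;
        addr : in  std_logic_vector(r-1 downto 0);
        di   : in  std_logic_vector(w-1 downto 0);
        do   : out std_logic_vector(w-1 downto 0));
end raminfr_wf;

architecture behavioral of raminfr_wf is
  type ram_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  signal RAM : ram_type;
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (en = '1') then
        if (we = '1') then
          RAM(conv_integer(unsigned(addr))) <= di;
          do <= di;   -- write first: new data goes to the output
        else
          do <= RAM(conv_integer(unsigned(addr)));
        end if;
      end if;
    end if;
  end process;
end behavioral;
```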
Block RAM Waveforms – NO_CHANGE mode
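In NO_CHANGE mode the output register keeps its previous value during a write. A minimal sketch of a behavioral description with this property is given below: the output is updated only when we = '0'. The entity name raminfr_nc is hypothetical, and the mapping to a block RAM in NO_CHANGE mode depends on the synthesis tool, so the synthesis report should be checked.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

-- Hypothetical entity name; same interface as the Read-First example
entity raminfr_nc is
  generic ( w : integer := 32;   -- number of bits per RAM word
            r : integer := 9);   -- 2^r = number of words in RAM
  port (clk  : in  std_logic;
        we   : in  std_logic;
        en   : in  std_logic;
        addr : in  std_logic_vector(r-1 downto 0);
        di   : in  std_logic_vector(w-1 downto 0);
        do   : out std_logic_vector(w-1 downto 0));
end raminfr_nc;

architecture behavioral of raminfr_nc is
  type ram_type is array (2**r-1 downto 0)
       of std_logic_vector (w-1 downto 0);
  signal RAM : ram_type;
begin
  process (clk)
  begin
    if (clk'event and clk = '1') then
      if (en = '1') then
        if (we = '1') then
          -- Write cycle: memory is updated, do is left unchanged
          RAM(conv_integer(unsigned(addr))) <= di;
        else
          -- Read cycle: do is updated from memory
          do <= RAM(conv_integer(unsigned(addr)));
        end if;
      end if;
    end if;
  end process;
end behavioral;
```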
FPGA-specific memories: Instantiation
General template of BRAM instantiation (1)
-- Component Attribute Specification for RAMB16_{S1 | S2 | S4}
-- Should be placed after architecture declaration but before the begin
-- Put attributes, if necessary

-- Component Instantiation for RAMB16_{S1 | S2 | S4}
-- Should be placed in architecture after the begin keyword
RAMB16_{S1 | S2 | S4}_INSTANCE_NAME : RAMB16_{S1 | S2 | S4}
-- synthesis translate_off
  generic map (
    INIT => bit_value,
    INIT_00 => vector_value,
    INIT_01 => vector_value,
    ……………………………..
    INIT_3F => vector_value,
    SRVAL => bit_value,
    WRITE_MODE => user_WRITE_MODE)
-- synthesis translate_on
  port map (DO   => user_DO,
            ADDR => user_ADDR,
            CLK  => user_CLK,
            DI   => user_DI,
            EN   => user_EN,
            SSR  => user_SSR,
            WE   => user_WE);
Initializing Block RAMs 1024x16
INIT_00 : BIT_VECTOR := X"014A0C0F09170A04076802A800260205002A01C5020A0917006A006800060040";
INIT_01 : BIT_VECTOR := X"000000000000000008000A1907070A1706070A020026014A0C0F03AA09170026";
INIT_02 : BIT_VECTOR := X"0000000000000000000000000000000000000000000000000000000000000000";
INIT_03 : BIT_VECTOR := X"0000000000000000000000000000000000000000000000000000000000000000";
……………………………………………………………………………………………………………………………………
INIT_3F : BIT_VECTOR := X"0000000000000000000000000000000000000000000000000000000000000000")
Each 256-bit INIT_xx vector holds sixteen 16-bit words; the rightmost word of the vector is stored at the lowest address.
INIT_xx    ADDRESS (hex)    DATA (hex)
INIT_00    00               0040
           01               0006
           02               0068
           03               006A
           04               0917
           ...              ...
           0E               0C0F
           0F               014A
INIT_01    10               0026
           11               0917
           12               03AA
           13               0C0F
           14               014A
           ...              ...
           1E               0000
           1F               0000
INIT_3F    3F0              0000
           3F1              0000
           3F2              0000
           3F3              0000
           3F4              0000
           ...              ...
           3FE              0000
           3FF              0000
Component declaration for BRAM (2)
VHDL Instantiation Template for RAMB16_S9, S18 and S36
-- Component Declaration for RAMB16_{S9 | S18 | S36}
component RAMB16_{S9 | S18 | S36}
-- synthesis translate_off
  generic (
    INIT : bit_vector := X"0";
    INIT_00 : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
    -- INIT_01 through INIT_3D are declared the same way
    INIT_3E : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
    INIT_3F : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
    INITP_00 : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
    -- INITP_01 through INITP_06 are declared the same way
    INITP_07 : bit_vector := X"0000000000000000000000000000000000000000000000000000000000000000";
    SRVAL : bit_vector := X"0";
    WRITE_MODE : string := "READ_FIRST");
Component declaration for BRAM (2) - cont'd
-- synthesis translate_on
  -- Port widths shown are those of RAMB16_S36
  port (DO   : out STD_LOGIC_VECTOR (31 downto 0);
        DOP  : out STD_LOGIC_VECTOR (3 downto 0);
        ADDR : in  STD_LOGIC_VECTOR (8 downto 0);
        CLK  : in  STD_ULOGIC;
        DI   : in  STD_LOGIC_VECTOR (31 downto 0);
        DIP  : in  STD_LOGIC_VECTOR (3 downto 0);
        EN   : in  STD_ULOGIC;
        SSR  : in  STD_ULOGIC;
        WE   : in  STD_ULOGIC);
end component;
General template of BRAM instantiation (2)
-- Component Attribute Specification for RAMB16_{S9 | S18 | S36}

-- Component Instantiation for RAMB16_{S9 | S18 | S36}
-- Should be placed in architecture after the begin keyword
RAMB16_{S9 | S18 | S36}_INSTANCE_NAME : RAMB16_{S9 | S18 | S36}
-- synthesis translate_off
  generic map (
    INIT => bit_value,
    INIT_00 => vector_value,
    . . . . . . . . . .
    INIT_3F => vector_value,
    INITP_00 => vector_value,
    ……………
    INITP_07 => vector_value,
    SRVAL => bit_value,
    WRITE_MODE => user_WRITE_MODE)
-- synthesis translate_on
  port map (DO   => user_DO,
            DOP  => user_DOP,
            ADDR => user_ADDR,
            CLK  => user_CLK,
            DI   => user_DI,
            DIP  => user_DIP,
            EN   => user_EN,
            SSR  => user_SSR,
            WE   => user_WE);
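As a concrete instance of the template, the sketch below wraps a single RAMB16_S18 primitive as a 1024 x 16 single-port memory and initializes its first 32 words with the INIT_00 and INIT_01 values from the initialization slide. The wrapper name bram_1kx16 and its port names are hypothetical. The example assumes the Xilinx UNISIM library is available and that the tool flow accepts generic maps on primitives for both simulation and synthesis; older flows may require the translate_off/translate_on pragmas shown in the template.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library UNISIM;
use UNISIM.vcomponents.all;

-- Hypothetical wrapper around one RAMB16_S18 (1024 words x 16 bits)
entity bram_1kx16 is
  port (clk  : in  std_logic;
        en   : in  std_logic;
        we   : in  std_logic;
        addr : in  std_logic_vector(9 downto 0);
        di   : in  std_logic_vector(15 downto 0);
        do   : out std_logic_vector(15 downto 0));
end bram_1kx16;

architecture structural of bram_1kx16 is
begin
  bram_i : RAMB16_S18
    generic map (
      WRITE_MODE => "READ_FIRST",
      -- Words 0x00-0x0F; the rightmost word (0040) is at address 0x00
      INIT_00 => X"014A0C0F09170A04076802A800260205002A01C5020A0917006A006800060040",
      -- Words 0x10-0x1F; the rightmost word (0026) is at address 0x10
      INIT_01 => X"000000000000000008000A1907070A1706070A020026014A0C0F03AA09170026")
      -- All other INIT_xx and INITP_xx generics keep their defaults
    port map (
      DO   => do,
      DOP  => open,     -- parity outputs are not used
      ADDR => addr,
      CLK  => clk,
      DI   => di,
      DIP  => "00",     -- parity inputs tied low
      EN   => en,
      SSR  => '0',      -- synchronous set/reset disabled
      WE   => we);
end structural;
```

After the first enabled clock edge with addr = "0000000000" and we = '0', do should hold X"0040", the rightmost word of INIT_00.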
Using CORE Generator
CORE Generator