1 of 16

Oct 31, 2018

Larry Ruckman

TID-AIR Electronics:

Overview of SURF’s Ethernet Library

2 of 16

What is “SURF”?

  • SURF = “SLAC Ultimate RTL Framework”
    • https://github.com/slaclab/surf (Released to public domain)
    • TID-AIR Electronics’ Firmware IP repository
          • Ethernet Library: (1000BASE, 10G-BASE, XAUI, IPv4, ARP, DHCP, ICMP, UDP)
          • AXI4 Library: (Crossbar, DMA, FIFO, etc.)
          • AXI4-Lite Library: (Crossbar, AXI4-to-AXI4-Lite bridge, etc.)
          • AXI4-Stream Library: (DMA, MUX, FIFO, etc.)
          • Device Library: (ADI, Linear, Micron, TI, etc.)
          • Synchronization Library: (Synchronize bits, buses, vectors, resets, etc.)
          • Wrapped Xilinx Library: (clock managers, SEM, DNA, IPROG)
          • Serial Protocols Library: (I2C, SPI, UART, G-Link, JESD204B, etc.)
          • SLAC Protocols: https://confluence.slac.stanford.edu/display/ppareg/SLAC+Protocols
  • SURF is controlled and maintained by TID-AIR Electronics


3 of 16

Typical Ethernet Stack for SURF


[Figure: Firmware Ethernet Stack. Blocks: PHY, MAC, IPv4, UDP, RSSI, AxiStreamPacketizer, Application Firmware. Interface labels: High Speed Serial; GMII or XGMII or XLGMII; 128-bit AXI Stream (three links); 64-bit AXI Stream; 64-bit AXI Streams. Layer labels: L1 through L7.]

4 of 16

SURF: ETH PHY (Physical layer)

  • We use the Xilinx IP core to generate the PHY layer (not custom code)
  • We provide wrappers that combine the PHY + MAC layers together
  • We have the following wrappers implemented in SURF’s Ethernet library:
    • 1000BASE-X
      • 1x lane (1.25 Gbps/lane)
      • GMII Interface to MAC
    • 10GBASE-KX4
      • 4x lane (3.125 Gbps/lane)
      • A.K.A. “XAUI”
      • XGMII Interface to MAC
    • 10GBASE-SR
      • 1x lane (10.3125 Gbps/lane)
      • XGMII Interface to MAC
  • We have a placeholder for the following wrappers:
    • 40GBASE-SR4 (not supported yet because the required Xilinx IP core license is not free)
      • 4x lane (10.3125 Gbps/lane)
      • A.K.A. “XLAUI”
      • XLGMII Interface to MAC
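The lane rates listed above are line rates. The data rate seen at the MAC interface is lower once the line encoding is removed. A minimal sketch of that arithmetic, assuming 8b/10b encoding on the 1.25 and 3.125 Gbps lanes and 64b/66b encoding on the 10.3125 Gbps lanes (the table and function names are illustrative, not part of SURF):

```python
from fractions import Fraction

# name: (lanes, line rate per lane in Gbps, encoding efficiency)
# Encoding efficiencies are assumptions based on the standard PHY encodings.
PHY_OPTIONS = {
    "1000BASE-X": (1, Fraction("1.25"), Fraction(8, 10)),
    "10GBASE-KX4 (XAUI)": (4, Fraction("3.125"), Fraction(8, 10)),
    "10GBASE-SR": (1, Fraction("10.3125"), Fraction(64, 66)),
    "40GBASE-SR4 (XLAUI)": (4, Fraction("10.3125"), Fraction(64, 66)),
}


def data_rate_gbps(name):
    """Return the aggregate data rate after removing encoding overhead."""
    lanes, line_rate, efficiency = PHY_OPTIONS[name]
    return lanes * line_rate * efficiency


for name in PHY_OPTIONS:
    print(f"{name}: {float(data_rate_gbps(name)):g} Gbps")
```

Exact fractions are used so the 64/66 ratio does not introduce floating-point rounding.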


5 of 16

SURF: MAC (Data link layer)

  • surf/ethernet/EthMacCore/rtl/EthMacTop.vhd
  • Developed and maintained by TID-AIR
  • Designed with 128-bit AXI stream interfaces for 40 GbE support
    • 128-bit @ 312.5 MHz (128 bits × 312.5 MHz = 40 Gbps)
  • Supports the following features:
    • Checks/Calculates the Cyclic Redundancy Check (CRC)
    • MAC address filtering
      • VHDL generic enabled
    • Pause Flow control
      • VHDL generic enabled
    • Routing of non-VLAN and VLAN Ethernet frames
      • VHDL generic enabled
    • IPv4 checksum offloader
      • VHDL generic enabled
    • UDP/TCP checksum offloader (non-fragmentation only)
      • VHDL generic enabled
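Two of the features listed above, the frame CRC and the IPv4 checksum, can be illustrated in software. The sketch below is a behavioral model only and says nothing about how EthMacTop.vhd implements them. It assumes the standard Ethernet CRC-32 (the same polynomial used by zlib) and the RFC 1071 ones'-complement checksum. All function names are hypothetical.

```python
import struct
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Append the Ethernet FCS: CRC-32 of the frame, least-significant byte first."""
    return frame + struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over the frame body and compare it with the trailing FCS."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == fcs


def ipv4_header_checksum(header: bytes) -> int:
    """Ones'-complement sum of the header's 16-bit words, checksum field zeroed."""
    data = header[:10] + b"\x00\x00" + header[12:]
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example 20-byte IPv4 header with the checksum field set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(ipv4_header_checksum(header)))  # 0xb861

frame = append_fcs(b"example payload")
print(fcs_ok(frame))  # True
```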


[Figure: MAC block diagram. Blocks: MAC RX Engine, MAC TX Engine, Pause Flow Control. Ports: From PHY, To PHY, From Network layer(s), To Network layer(s).]

6 of 16

SURF: (PHY + MAC) Wrappers

  • We only support one type of physical layer at a time in a firmware (FW) build
    • Multiple PHY layers are not supported simultaneously
  • 1000BASE-X (1GbE, 1x lane)
    • Artix-7: surf/ethernet/GigEthCore/gtp7/rtl/GigEthGtp7Wrapper.vhd
    • Kintex-7: surf/ethernet/GigEthCore/gtx7/rtl/GigEthGtx7Wrapper.vhd
    • Virtex-7: surf/ethernet/GigEthCore/gth7/rtl/GigEthGth7Wrapper.vhd
    • Kintex Ultrascale: surf/ethernet/GigEthCore/gthUltraScale/rtl/GigEthGthUltraScaleWrapper.vhd
  • 10GBASE-KX4 (10GbE, 4x lane, A.K.A. “XAUI”)
    • Kintex-7: surf/ethernet/XauiCore/gtx7/rtl/XauiGtx7Wrapper.vhd
    • Virtex-7: surf/ethernet/XauiCore/gth7/rtl/XauiGth7Wrapper.vhd
    • Kintex Ultrascale: surf/ethernet/XauiCore/gthUltraScale/rtl/XauiGthUltraScaleWrapper.vhd
  • 10GBASE-SR (10GbE, 1x lane)
    • Kintex-7: surf/ethernet/TenGigEthCore/gtx7/rtl/TenGigEthGtx7Wrapper.vhd
    • Virtex-7: surf/ethernet/TenGigEthCore/gth7/rtl/TenGigEthGth7Wrapper.vhd
    • Kintex Ultrascale: surf/ethernet/TenGigEthCore/gthUltraScale/rtl/TenGigEthGthUltraScaleWrapper.vhd
  • 40GBASE-SR4 (40GbE, 4x lane, A.K.A. “XLAUI”)
    • Kintex-7: surf/ethernet/XlauiCore/gtx7/ (placeholder for future code)
    • Virtex-7: surf/ethernet/XlauiCore/gth7/ (placeholder for future code)
    • Kintex Ultrascale: surf/ethernet/XlauiCore/gthUltraScale/ (placeholder for future code)


7 of 16

SURF: IPv4 (Network layer)

  • surf/ethernet/IpV4Engine/rtl/IpV4Engine.vhd
  • Developed and maintained by TID-AIR
  • Supports the following features:
    • Supports both non-VLAN and VLAN Ethernet frames
      • Configured by VHDL generic to select which type (not simultaneously supported)
  • Integrated ICMP engine
    • Server only (only replies back)
    • ICMP request in FW not supported
      • FW uses ARP ping instead
    • Allows software (SW) to do a standard Ethernet “ping”
  • Integrated ARP engine
    • Handles ARP requests/replies for FW/SW
  • Multiple transport protocol layers supported
    • Example: UDP and TCP simultaneously
    • 1x AXI stream per IPv4 protocol
    • Configured by VHDL generic
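The "server only" ICMP behavior described above means the engine answers echo requests but never originates them. A minimal software model of that behavior, assuming standard ICMP echo messages (type 8 request, type 0 reply) and the ones'-complement Internet checksum. The function names are hypothetical and this is not a description of the IpV4Engine internals.

```python
import struct
from typing import Optional


def inet_checksum(data: bytes) -> int:
    """Ones'-complement Internet checksum over 16-bit words (zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def make_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo request (what a software 'ping' would send)."""
    unsummed = struct.pack("!BBHHH", 8, 0, 0, ident, seq) + payload
    return struct.pack("!BBHHH", 8, 0, inet_checksum(unsummed), ident, seq) + payload


def echo_reply(message: bytes) -> Optional[bytes]:
    """Return the echo reply for a valid echo request, else None (no reply sent)."""
    if len(message) < 8 or message[0] != 8 or message[1] != 0:
        return None  # not an echo request
    if inet_checksum(message) != 0:
        return None  # corrupted message
    # Type 0 (reply), code 0, checksum zeroed; identifier, sequence, payload echoed.
    unsummed = b"\x00\x00\x00\x00" + message[4:]
    return unsummed[:2] + struct.pack("!H", inet_checksum(unsummed)) + unsummed[4:]


request = make_echo_request(ident=0x1234, seq=1, payload=b"ping")
reply = echo_reply(request)
print(reply[0], reply[4:] == request[4:])  # 0 True
```

Feeding a reply back into `echo_reply` returns `None`, which mirrors the point that the engine only responds and never issues requests of its own.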


[Figure: IPv4 block diagram. Blocks: IPv4 RX Engine, IPv4 TX Engine, ICMP, ARP. Ports: From MAC, To MAC, Transport’s ARP Interface(s), From Transport layer(s), To Transport layer(s).]

8 of 16

SURF: UDP (Transport layer)

  • surf/ethernet/UdpEngine/rtl/UdpEngine.vhd
  • Developed and maintained by TID-AIR
  • Port filtering/routing
  • # of server/client ports configured via VHDL generic
  • Local Port number assignments via VHDL generic
  • ARP ping interface for UDP clients
  • Integrated DHCP engine
    • VHDL generic enabled
    • Only FW DHCP client supported
      • Assumes SW DHCP server
  • We do NOT support an ARP table in our implementation
    • Only track the remote port of the last remote access
    • Only 1x connection per port
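The "no ARP table" design above can be pictured as one small record per server port that is overwritten by every accepted datagram. The following is a minimal sketch of that idea. The class and method names are hypothetical, and storing the remote IP address alongside the remote port is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ServerPort:
    """Model of one UDP server port that remembers only its last remote peer."""

    local_port: int
    remote: Optional[Tuple[str, int]] = None  # (IP, UDP port) of the last sender

    def on_receive(self, src_ip: str, src_port: int, dst_port: int) -> bool:
        """Port filtering: accept only datagrams addressed to this local port."""
        if dst_port != self.local_port:
            return False
        self.remote = (src_ip, src_port)  # overwrite: one connection per port
        return True

    def reply_destination(self) -> Optional[Tuple[str, int]]:
        """Outbound data goes to whoever accessed the port most recently."""
        return self.remote


port = ServerPort(local_port=8192)
port.on_receive("192.168.2.1", 50000, 8192)
port.on_receive("192.168.2.7", 41000, 8192)
print(port.reply_destination())  # ('192.168.2.7', 41000)
```

The second sender replaces the first, which is the practical consequence of tracking only the last remote access.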


[Figure: UDP block diagram. Blocks: UDP RX Engine, UDP TX Engine, DHCP. Ports: From Network layer, To Network layer, IP Address, To Server(s)/Client(s), From Server(s)/Client(s), ARP pinging.]

9 of 16

SURF: (IPv4 + UDP) Wrappers

  • We provide a VHDL wrapper for implementing the IPv4 and UDP engines
  • surf/ethernet/UdpEngine/rtl/UdpEngineWrapper.vhd
  • AXI-Lite Interface:
    • Monitor the UDP server(s) connections
    • Configure the UDP client(s) connections
  • Implements all the interfaces between the UDP and IPv4 engines


[Figure: (IPv4 + UDP) wrapper block diagram. Blocks: UDP RX Engine, UDP TX Engine, DHCP, IPv4 RX Engine, IPv4 TX Engine, ICMP, ARP. Ports: To MAC/IPv4, To Server(s)/Client(s), From Server(s)/Client(s), From MAC, To MAC.]

10 of 16

SURF: RSSI (Reliability Layer)

  • surf/protocols/rssi/rtl/RssiCore.vhd
  • Developed and maintained by TID-AIR
  • Reliable SLAC Streaming Interface (RSSI) engine:
    • Reliable communications layer based upon RUDP
      • Cisco implementation: refer to RFC-908 and RFC-1151
    • Handles the handshaking for re-transmission to another RSSI FW engine (or RSSI SW driver)
    • User application interface to the RSSI engine is an SSI interface
    • “Reliable UDP” = RSSI engine + UDP engine
  • RSSI is not directly part of the Ethernet library
    • Agnostic to the network & transport layers
    • Gives the ability to add reliable communication to any bi-directional link
  • Here’s the RSSI documentation (public domain) webpage:
    • https://confluence.slac.stanford.edu/x/1IyfD
    • Refer to this webpage for full details about RSSI and where we differ from the RFCs
  • And of course, we have an RSSI SW driver to complement the firmware:


11 of 16

SURF: (RSSI + AxiStreamPacketizer) Wrapper

  • surf/axi/rtl/AxiStreamPacketizer.vhd & surf/axi/rtl/AxiStreamDepacketizer.vhd
  • AxiStreamPacketizer module:
    • Chunks up the application AXI stream to fit into smaller frames
      • Example: ETH’s 1500 MTU
    • Support up to 256 channels of access
      • Implemented via AXI stream’s TDEST field
    • Interleaving of AXI stream frames not supported

  • surf/protocols/rssi/rtl/RssiCoreWrapper.vhd
  • Developed and maintained by TID-AIR
  • Here’s the AxiStreamPacketizer documentation (public domain) webpage:
  • RSSI does not use TDEST
    • Only 1 streaming channel in/out of the RSSI module
  • We created this wrapper because 99% of our applications need more than single-channel access
    • Example:
      • TDEST[0x0] = Register Access
      • TDEST[0x1] = ASYNC Messages (SW interrupt)
      • TDEST[0xFF:0x80] = Stream Data
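The chunking and TDEST routing described above can be sketched in a few lines. This is an illustration of the concept only: the `Chunk` fields, the function names, and the 1400-byte chunk size are assumptions for the example and do not reflect the actual AxiStreamPacketizer header format. The channel map follows the example TDEST assignment listed above.

```python
from typing import Iterable, Iterator, NamedTuple


class Chunk(NamedTuple):
    tdest: int   # 8-bit channel number (up to 256 channels)
    seq: int     # position of this chunk within the frame
    eof: bool    # True on the last chunk of the frame
    data: bytes


def packetize(frame: bytes, tdest: int, max_payload: int) -> Iterator[Chunk]:
    """Split one application frame into chunks of at most max_payload bytes."""
    if not 0 <= tdest <= 0xFF:
        raise ValueError("TDEST must fit in 8 bits")
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    count = max(1, -(-len(frame) // max_payload))  # ceiling division
    for seq in range(count):
        part = frame[seq * max_payload:(seq + 1) * max_payload]
        yield Chunk(tdest, seq, seq == count - 1, part)


def depacketize(chunks: Iterable[Chunk]) -> bytes:
    """Reassemble one frame; chunks must arrive in order and are not interleaved."""
    parts, expected_seq, done = [], 0, False
    for chunk in chunks:
        if done:
            raise ValueError("chunk received after end of frame")
        if chunk.seq != expected_seq:
            raise ValueError("out-of-order or missing chunk")
        parts.append(chunk.data)
        expected_seq += 1
        done = chunk.eof
    if not done:
        raise ValueError("frame ended without an EOF chunk")
    return b"".join(parts)


def classify_tdest(tdest: int) -> str:
    """Channel map from the example TDEST assignment."""
    if tdest == 0x00:
        return "register access"
    if tdest == 0x01:
        return "async message"
    if 0x80 <= tdest <= 0xFF:
        return "stream data"
    return "unassigned"


chunks = list(packetize(bytes(5000), tdest=0x80, max_payload=1400))
print([len(c.data) for c in chunks], classify_tdest(chunks[0].tdest))
```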


12 of 16

SURF: Dev Board Ethernet Performance Results

FW-to-FW Performance:

  • Bandwidth:
    • 9.1 Gbps @ MTU 1500 (Standard frame size)
    • 9.9 Gbps @ MTU 9000 (A.K.A. “jumbo frames”)
  • Latency: 1.5 μs
    • 64B Datagram with no back pressuring
    • RSSI = 230 ns
    • UDP/IPv4 = 150 ns
    • MAC/PHY = 370 ns
    • Mostly in PHY’s 64B66B gearbox
    • Lower latency in XAUI (8B10B encoding)
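One way to read the latency breakdown above is that the per-layer figures apply to a single endpoint and that a FW-to-FW transfer crosses one transmit stack and one receive stack. Under that assumption, which the slide does not state explicitly, the numbers add up to the quoted total:

```python
# Per-layer latency figures from the slide, in nanoseconds.
PER_LAYER_NS = {"RSSI": 230, "UDP/IPv4": 150, "MAC/PHY": 370}

# Assumption: the same stack is traversed once on each side of the link.
one_endpoint_ns = sum(PER_LAYER_NS.values())
end_to_end_ns = 2 * one_endpoint_ns

print(one_endpoint_ns, "ns per endpoint")  # 750 ns per endpoint
print(end_to_end_ns / 1000, "us total")    # 1.5 us total
```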

FW-to-SW Performance:

  • Bandwidth:
    • 1.4 Gbps @ MTU 1500 (Standard frame size)
    • 9.0 Gbps @ MTU 9000 (A.K.A. “jumbo frames”)


Includes Packetizer

13 of 16

SURF: FW-to-SW Ethernet Performance

  • Firmware sends 50 kB message bursts at 360 Hz
    • Bursts at 150 MB/s for 0.3 ms, then 0 MB/s for the remaining 2.4 ms
    • Implemented on Kintex Ultrascale FPGA with XAUI backplane in an ATCA package
  • Sent this message periodically for 4 days (continuous)
    • No RSSI drops or retransmissions during the entire test
  • Histogrammed the burst rate for each 360 Hz cycle
  • The range goes from 55 to 246 MB/s
    • With a peak around 141 MB/s (1.1 Gbps)
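The burst figures above can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 kB = 1000 bytes, 1 MB = 10^6 bytes); the variable names are illustrative.

```python
MESSAGE_BYTES = 50_000      # 50 kB per message
MESSAGE_RATE_HZ = 360       # messages per second
BURST_MB_PER_S = 150        # firmware burst rate
PEAK_MB_PER_S = 141         # histogram peak seen by software

period_ms = 1000 / MESSAGE_RATE_HZ
burst_ms = MESSAGE_BYTES / (BURST_MB_PER_S * 1e6) * 1000
idle_ms = period_ms - burst_ms
average_mb_per_s = MESSAGE_BYTES * MESSAGE_RATE_HZ / 1e6
peak_gbps = PEAK_MB_PER_S * 8 / 1000

print(f"period  = {period_ms:.2f} ms")         # about 2.78 ms
print(f"burst   = {burst_ms:.2f} ms")          # about 0.33 ms
print(f"idle    = {idle_ms:.2f} ms")           # about 2.44 ms
print(f"average = {average_mb_per_s:.0f} MB/s")
print(f"peak    = {peak_gbps:.2f} Gbps")       # about 1.13 Gbps
```

The sustained average (18 MB/s) is far below the burst rate, so the test mainly exercises how well the software side absorbs short bursts.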


14 of 16

How to get started?

  • We have a GitHub repository of development board examples:
  • Instructions for building the FW/SW can be found here:
  • Common ETH stack used by all the dev. boards:
    • dev-board-examples/firmware/common/core/rtl/EthPortMapping.vhd
    • Includes everything but the (PHY + MAC) wrapper
    • (PHY + MAC) wrapper implemented in the top-level FW block
  • 1000BASE-X (1GbE, 1x lane)
    • Artix-7: dev-board-examples/firmware/targets/XilinxAC701DevBoard/Ac701GigE/hdl/Ac701GigE.vhd
    • Kintex-7: dev-board-examples/firmware/targets/XilinxKC705DevBoard/Kc705GigE/hdl/Kc705GigE.vhd
    • Virtex-7: dev-board-examples/firmware/targets/DigilentNetFpgaSumeDevBoard/NetFpgaSumeGigE/hdl/NetFpgaSumeGigE.vhd
    • Kintex Ultrascale: dev-board-examples/firmware/targets/XilinxKCU105DevBoard/Kcu105GigE/hdl/Kcu105GigE.vhd
  • 10GBASE-KX4 (10GbE, 4x lane, A.K.A. “XAUI”)
    • Kintex-7: dev-board-examples/firmware/targets/XilinxKC705DevBoard/Kc705Xaui/hdl/Kc705Xaui.vhd
    • Virtex-7: dev-board-examples/firmware/targets/DigilentNetFpgaSumeDevBoard/NetFpgaSumeXaui/hdl/NetFpgaSumeXaui.vhd
    • Kintex Ultrascale: dev-board-examples/firmware/targets/XilinxKCU105DevBoard/Kcu105Xaui/hdl/Kcu105Xaui.vhd
  • 10GBASE-SR (10GbE, 1x lane)
    • Kintex-7: dev-board-examples/firmware/targets/XilinxKC705DevBoard/Kc705TenGigE/hdl/Kc705TenGigE.vhd
    • Virtex-7: dev-board-examples/firmware/targets/DigilentNetFpgaSumeDevBoard/NetFpgaSumeTenGigE/hdl/NetFpgaSumeTenGigE.vhd
    • Kintex Ultrascale: dev-board-examples/firmware/targets/XilinxKCU105DevBoard/Kcu105TenGigE/hdl/Kcu105TenGigE.vhd


15 of 16

Future Plans for SURF’s Ethernet Library

  • Complete Doxygen documentation of the firmware library
    • In progress
    • http://www.stack.nl/~dimitri/doxygen/
  • Add 40 GbE support
    • Either wait for Xilinx to make their 40GbE PMA/PCS IP core free or write custom code
    • Add the XLGMII support in the MAC (fairly straightforward implementation)
  • Add interleaved AXI stream support to RSSI


16 of 16

Backup Slides