1 of 12

Corundum status updates

Alex Forencich

4/24/2023

2 of 12

Agenda

  • Announcements
  • Status updates

3 of 12

Announcements

  • Next Corundum developer meeting: May 8 at 9:00 AM PDT
  • Next switch development meeting: May 1 at 9:00 AM PDT

  • Upstream Corundum git repo has been updated!

4 of 12

Status update summary

  • Bugs
    • FIFO memory inference issue (in progress)
  • Device support updates
  • New queue management preview

5 of 12

Bugs: FIFO memory inference issue

  • Seeing odd behavior on Intel devices
    • TX packets with incorrect IP layer checksums
    • HW checksum failure messages on RX
  • MLE traced the issue to the FIFO between the TX engine and the TX checksum compute block incorrectly setting the “enable” bit
  • Appears to be a Quartus tool bug related to merging pipeline registers into MLABs
    • Connecting the RAM output register to a logic analyzer or adding a “preserve” attribute makes the bug disappear (see the sketch after this list)
  • Status at Intel: “scheduled to be fixed in a future release”
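
For context, the debug observation above corresponds to something like the following inferred-RAM read path. This is an illustrative sketch only (module and signal names are not from the Corundum or verilog-axis sources); it just shows where the Quartus “preserve” attribute was applied while debugging.

    // Illustrative only: a minimal inferred RAM read path. Marking the RAM
    // output register with the Quartus "preserve" attribute (or tapping it
    // with a logic analyzer) keeps it out of the MLAB register merge step,
    // which was observed to make the incorrect-enable behavior disappear.
    module mlab_preserve_example #(
        parameter DATA_W = 64,
        parameter ADDR_W = 5
    ) (
        input  wire              clk,
        input  wire              wr_en,
        input  wire [ADDR_W-1:0] wr_addr,
        input  wire [DATA_W-1:0] wr_data,
        input  wire              rd_en,
        input  wire [ADDR_W-1:0] rd_addr,
        output wire [DATA_W-1:0] rd_data
    );

    reg [DATA_W-1:0] mem [2**ADDR_W-1:0];

    (* preserve *) reg [DATA_W-1:0] rd_data_reg;

    always @(posedge clk) begin
        if (wr_en) begin
            mem[wr_addr] <= wr_data;
        end
        if (rd_en) begin
            rd_data_reg <= mem[rd_addr];
        end
    end

    assign rd_data = rd_data_reg;

    endmodule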

6 of 12

Bugs: FIFO memory inference issue

  • Implemented workaround:
    • Bug seems to only affect FIFOs for some reason
    • Forcing FIFO RAM into M20K blocks seems to resolve issue
    • Instead of implementing this workaround upstream, all Intel targets affected by the bug will get a local copy of axis_fifo with ramstyle set (see the sketch below)
    • Files can be removed when the issue is fixed in Quartus
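
A minimal sketch of the workaround, assuming standard Quartus “ramstyle” attribute syntax. The FIFO below is illustrative, not the local axis_fifo copy itself; the relevant part is the attribute on the memory declaration.

    // Illustrative only: forcing the inferred FIFO memory into M20K blocks
    // instead of MLABs avoids the MLAB pipeline-register merge bug.
    module fifo_m20k_example #(
        parameter DATA_W = 64,
        parameter ADDR_W = 9
    ) (
        input  wire              clk,
        input  wire              rst,
        input  wire              wr_en,
        input  wire [DATA_W-1:0] wr_data,
        input  wire              rd_en,
        output reg  [DATA_W-1:0] rd_data,
        output wire              empty,
        output wire              full
    );

    // the actual workaround: pin the RAM style to M20K
    (* ramstyle = "M20K" *) reg [DATA_W-1:0] mem [2**ADDR_W-1:0];

    reg [ADDR_W:0] wr_ptr = 0;
    reg [ADDR_W:0] rd_ptr = 0;

    assign empty = wr_ptr == rd_ptr;
    assign full  = wr_ptr == (rd_ptr ^ {1'b1, {ADDR_W{1'b0}}});

    always @(posedge clk) begin
        if (wr_en && !full) begin
            mem[wr_ptr[ADDR_W-1:0]] <= wr_data;
            wr_ptr <= wr_ptr + 1;
        end
        if (rd_en && !empty) begin
            rd_data <= mem[rd_ptr[ADDR_W-1:0]];
            rd_ptr <= rd_ptr + 1;
        end
        if (rst) begin
            wr_ptr <= 0;
            rd_ptr <= 0;
        end
    end

    endmodule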

7 of 12

Device support updates

  • Agilex F dev kit (DK-DEV-AGF014EA)
    • Added 25G and 100G designs
  • Stratix 10 DX dev kit (DK-DEV-1SDX-P-A)
    • Implemented I2C interface to transceivers and EEPROM (FPC202)
    • Added 100G design

8 of 12

Other minor changes

  • Event FIFO size increased
    • Was running into lost events on Intel devices
  • Renamed a few things
    • Renamed Intel dev kits based on dev kit part number
    • Dropped part suffix for non-ES parts

9 of 12

New queue management preview

  • Need to revisit a few things to facilitate improved descriptor/completion management logic and other features
    • No more sharing of descriptor fetch / completion write logic across multiple queue types
  • New queue types: QP (SQ+RQ), CQ, EQ
  • Support dynamic allocation of queues
    • TX side is straightforward; RX requires an indirection table for RSS to work (see the sketch after this list)
  • Separate completion and event write logic
    • Completion write logic will be more complex, supporting batched writes and deferred event generation (coalescing)
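
As a rough illustration of the RX side, the sketch below shows an RSS indirection table that maps the low bits of the flow hash to a receive queue index, which is what allows dynamically allocated (non-contiguous) RX queues to work. Names and widths here are assumptions for illustration, not the Corundum implementation.

    // Illustrative only: driver-programmable RSS indirection table. The low
    // bits of the per-packet flow hash select a table entry, and the entry
    // holds the index of the RQ the packet is steered to.
    module rss_indir_table_example #(
        parameter HASH_W = 32,
        parameter TBL_SIZE = 256,        // indirection table entries
        parameter QUEUE_INDEX_W = 12     // enough for 4096 queues
    ) (
        input  wire                        clk,

        // driver-facing write port for programming the table
        input  wire                        tbl_wr_en,
        input  wire [$clog2(TBL_SIZE)-1:0] tbl_wr_addr,
        input  wire [QUEUE_INDEX_W-1:0]    tbl_wr_data,

        // per-packet lookup
        input  wire [HASH_W-1:0]           rx_hash,
        output reg  [QUEUE_INDEX_W-1:0]    rx_queue_index
    );

    reg [QUEUE_INDEX_W-1:0] indir_table [TBL_SIZE-1:0];

    always @(posedge clk) begin
        if (tbl_wr_en) begin
            indir_table[tbl_wr_addr] <= tbl_wr_data;
        end
        // lookup: hash LSBs -> table entry -> RQ index
        rx_queue_index <= indir_table[rx_hash[$clog2(TBL_SIZE)-1:0]];
    end

    endmodule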

10 of 12

Queue state storage (current)

  Field             TX/RX size (bits)   CQ/EQ size (bits)
  Base addr         64                  64
  Head ptr          16                  16
  Tail ptr          16                  16
  CQ/EQ/IRQ index   16                  16
  Log queue size    4                   4
  Log block size    2                   -
  Arm               -                   1
  Arm cont          -                   1
  Enable            1                   1
  Op index          8                   8
  Total             127                 127

URAM is 4096 x 64, so 2 URAM = 4096 queues
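
As a sanity check on the table above, the sizing arithmetic can be written out as follows; the parameter names are illustrative, not from the Corundum source.

    // Illustrative only: per-queue state width for the current layout and
    // the resulting URAM count. One URAM here is 4096 words x 64 bits, so
    // two URAMs side by side give 4096 words x 128 bits, which is wide
    // enough for one 127-bit queue state per word: 2 URAM = 4096 queues.
    module queue_state_sizing_example;

    localparam TXRX_STATE_W = 64+16+16+16+4+2+1+8;   // = 127 bits (TX/RX queue)
    localparam CQEQ_STATE_W = 64+16+16+16+4+1+1+1+8; // = 127 bits (CQ/EQ)

    localparam URAM_W = 64;

    // 64-bit URAM lanes needed per queue state word (round up)
    localparam URAMS_PER_QUEUE = (TXRX_STATE_W + URAM_W - 1) / URAM_W; // = 2

    endmodule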

11 of 12

Queue state storage (New)

  Field                  QP size (bits)   CQ/EQ size (bits)
  Base addr (4K align)   52               52
  Producer ptr           16+16            16
  Consumer ptr           16+16            16
  CQ/EQ/IRQ index        16+16            16
  Log queue size         4+4              4
  Arm                    -                1
  Enable                 1                1
  Active                 1+1              1
  VF index               12               12
  LSO offset             16               -
  Total                  187              119

URAM is 4096 x 64, so:
  4096 QP = 3 URAM
  4096 CQ/EQ = 2 URAM

SQ+RQ rings will share the same memory block, amortizing the large base address field
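
To make the new layout concrete, here is a sketch of the per-QP record as a SystemVerilog packed struct. The field names, ordering, and SQ/RQ split are assumptions for illustration; only the widths follow the table above.

    package qp_state_sketch_pkg;

    // Illustrative only: new per-QP state (187 bits). The SQ and RQ of one
    // QP share a single record, so the 52-bit 4K-aligned base address and
    // the VF index are stored once and amortized across both rings. 187
    // bits rounds up to three 64-bit URAM lanes (192 bits), so 4096 QPs
    // need 3 URAMs; the 119-bit CQ/EQ state fits in 2.
    typedef struct packed {
        logic [51:0] base_addr;    // ring base address, 4K aligned, shared by SQ+RQ
        logic [15:0] sq_prod_ptr;  // SQ producer pointer
        logic [15:0] rq_prod_ptr;  // RQ producer pointer
        logic [15:0] sq_cons_ptr;  // SQ consumer pointer
        logic [15:0] rq_cons_ptr;  // RQ consumer pointer
        logic [15:0] sq_cq_index;  // CQ/EQ/IRQ index for the SQ
        logic [15:0] rq_cq_index;  // CQ/EQ/IRQ index for the RQ
        logic [3:0]  sq_log_size;  // log2 of SQ ring size
        logic [3:0]  rq_log_size;  // log2 of RQ ring size
        logic        enable;       // QP enable
        logic        sq_active;    // SQ active
        logic        rq_active;    // RQ active
        logic [11:0] vf_index;     // owning VF
        logic [15:0] lso_offset;   // LSO offset (TX only)
    } qp_state_t;                  // total: 187 bits

    endpackage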

12 of 12

New queue control registers

  • Considering several changes:
  • Place pointers in the same DW
    • Takes up less address space
    • Can be read atomically for accurate queue depths
  • Read-only registers + written commands
    • Avoid read-modify-write – atomic, faster
    • Can support multiple ops at once (e.g. write cons ptr + rearm CQ; see the sketch after this list)
  • Soft core for config?
    • Most NIC drivers I looked at seem to use a command queue, with queue allocation managed by HW
    • Could help with HW/driver versioning
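
A minimal sketch of how the first two ideas could look in RTL, assuming one 32-bit CSR per CQ; the register map and command encoding here are assumptions, not a final design.

    // Illustrative only: producer and consumer pointers live in the same
    // 32-bit DW, so one read returns a consistent snapshot for computing
    // queue depth, and writes act as commands (new consumer pointer plus an
    // optional rearm flag) so the driver never does read-modify-write.
    module cq_csr_example (
        input  wire        clk,
        input  wire        rst,

        // simple CSR interface, one CQ shown for brevity
        input  wire        csr_wr_en,
        input  wire [31:0] csr_wr_data,
        output wire [31:0] csr_rd_data,

        // pulses when the completion write logic posts a new completion
        input  wire        cpl_write_done,

        output reg  [15:0] cons_ptr_reg,
        output reg         arm_reg
    );

    reg [15:0] prod_ptr_reg;

    // read-only view: both pointers in one DW, read atomically
    assign csr_rd_data = {prod_ptr_reg, cons_ptr_reg};

    always @(posedge clk) begin
        if (cpl_write_done) begin
            prod_ptr_reg <= prod_ptr_reg + 1;
        end

        // command-style write: bits [15:0] carry the new consumer pointer,
        // bit 31 re-arms the CQ in the same operation
        if (csr_wr_en) begin
            cons_ptr_reg <= csr_wr_data[15:0];
            if (csr_wr_data[31]) begin
                arm_reg <= 1'b1;
            end
        end

        // event/interrupt generation (not shown) would clear arm_reg

        if (rst) begin
            prod_ptr_reg <= 16'd0;
            cons_ptr_reg <= 16'd0;
            arm_reg      <= 1'b0;
        end
    end

    endmodule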