1 of 16

Camcorder

monitoring CAM CCBs at runtime

3 of 16

Background

  • CAM - Common Access Method
  • Block storage layer for most types of storage
  • SIM - System (was SCSI) Interface Module (basically HBA)
  • Periph drivers - block layer to protocol translators
  • Periph drivers send CCBs to the SIM
    • SIM stores result in CCB
    • done() routine is called when a queued CCB completes
  • tcpdump(8), usbdump(8), but no camdump(8)
    • DTrace is easy for basic traces, kinda hard for more complex reporting
    • Hope to provide LeCroy-like traces to vendors

4 of 16

Why camcorder

  • To overcome limitations of DTrace
    • DTrace can’t be used early in boot
    • DTrace can’t be left always on, recording to a circular buffer
    • DTrace interface is a little hard to use beyond the basics
  • Easier to study and understand CAM
  • Had a few weird crashes in CAM in Netflix’s fleet
    • Flaky device and/or device connection sometimes causes a panic (5/month in our fleet)
    • Existing cores didn’t tell me the history of how we got into this state
  • Wanted to use Wireshark and other visualization tools
  • Can’t resist a name that’s a good pun

5 of 16

Camcorder Features

  • Code 100% optional: w/o CAM_CORDER option, no code included in kernel
  • Can optionally retain the last N (default 10,000) CCBs on an ongoing basis (useful for crash dumps)
  • Can be dumped in real time via new camdump(8) program (similar to usbdump(8))
  • Can create PCAP files of CCBs, either in real time or from the current circular buffer

6 of 16

Credit: Simplified from Scott Long

7 of 16

Design of camcorder

Add updated diagram:

Add tap in xpt_action

Add tap in xpt_done and xpt_done_direct

Add ifnet interfaces to allow dumping via normal BPF ifnet

Create program to program BPF and print packets
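
The tap placement described above can be sketched in userland C. This is only a toy model: `struct ccb`, `model_xpt_action`, `model_xpt_done`, `camcorder_tap`, and `count_tap` are hypothetical stand-ins, not the real CAM structures or the actual camcorder code, and the model completes the request synchronously where a real SIM would normally complete it later.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real CAM definitions. */
enum tap_point { TAP_ACTION, TAP_DONE };

struct ccb {
	int	func_code;		/* what the periph driver asks for */
	int	status;			/* result stored by the SIM */
	void	(*done)(struct ccb *);	/* completion routine, may be NULL */
};

typedef void (*tap_fn)(enum tap_point, const struct ccb *);
static tap_fn camcorder_tap;		/* NULL means nobody is listening */

/* Example listener: counts how often each tap point fires. */
static int seen[2];

static void
count_tap(enum tap_point where, const struct ccb *ccb)
{
	(void)ccb;
	seen[where]++;
}

static void
tap(enum tap_point where, const struct ccb *ccb)
{
	if (camcorder_tap != NULL)
		camcorder_tap(where, ccb);
}

/* Toy SIM: stores a result in the CCB. */
static void
sim_action(struct ccb *ccb)
{
	ccb->status = 1;		/* pretend the request succeeded */
}

/* Completion path: tap first, then run the CCB's done() routine. */
static void
model_xpt_done(struct ccb *ccb)
{
	tap(TAP_DONE, ccb);
	if (ccb->done != NULL)
		ccb->done(ccb);
}

/* Submission path: tap the request before handing it to the SIM. */
static void
model_xpt_action(struct ccb *ccb)
{
	assert(ccb != NULL);
	tap(TAP_ACTION, ccb);
	sim_action(ccb);
	model_xpt_done(ccb);		/* synchronous only in this model */
}
```

Setting `camcorder_tap = count_tap` and calling `model_xpt_action()` on one CCB fires each tap point once, so a listener sees both the request and its result.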

11 of 16

Camcorder: camdump

  • camdump can be used to get data in real time, or display old dumps
  • Filter by endpoint (in progress)
  • Capture packets to PCAP file
  • Display packets on screen
  • In the future: Wireshark understands the PCAP files produced
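
For reference, capturing to a PCAP file means writing one 24-byte file header followed by a 16-byte header per record. The sketch below uses the classic pcap file layout. It is not camdump's code: the helper names are hypothetical, the payload is assumed to be raw CCB bytes, and `LINKTYPE_USER0` (147, reserved for private use) is only a placeholder because no link type has been assigned for CCBs yet.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PCAP_MAGIC	0xa1b2c3d4u	/* classic pcap, microsecond timestamps */
#define LINKTYPE_USER0	147u		/* placeholder: private-use link type */

struct pcap_file_hdr {
	uint32_t magic;
	uint16_t version_major;		/* 2 */
	uint16_t version_minor;		/* 4 */
	int32_t	 thiszone;		/* normally 0 */
	uint32_t sigfigs;		/* normally 0 */
	uint32_t snaplen;		/* max bytes stored per record */
	uint32_t linktype;
};

struct pcap_rec_hdr {
	uint32_t ts_sec;
	uint32_t ts_usec;
	uint32_t incl_len;		/* bytes stored in the file */
	uint32_t orig_len;		/* bytes in the original record */
};

_Static_assert(sizeof(struct pcap_file_hdr) == 24, "pcap file header size");
_Static_assert(sizeof(struct pcap_rec_hdr) == 16, "pcap record header size");

/* Write the file header in host byte order. Returns 0 on success. */
static int
pcap_write_header(FILE *fp, uint32_t snaplen)
{
	struct pcap_file_hdr h = {
		PCAP_MAGIC, 2, 4, 0, 0, snaplen, LINKTYPE_USER0
	};

	assert(fp != NULL);
	return (fwrite(&h, sizeof(h), 1, fp) == 1 ? 0 : -1);
}

/* Append one record (for example, the bytes of one CCB). */
static int
pcap_write_record(FILE *fp, uint32_t sec, uint32_t usec,
    const void *buf, uint32_t len)
{
	struct pcap_rec_hdr r = { sec, usec, len, len };

	assert(fp != NULL);
	if (fwrite(&r, sizeof(r), 1, fp) != 1)
		return (-1);
	if (len > 0 && fwrite(buf, len, 1, fp) != 1)
		return (-1);
	return (0);
}
```

Readers detect byte order from the magic number, so writing the headers in host order is fine. A viewer still needs a dissector for the payload before it can show anything beyond raw bytes.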

12 of 16

Circular buffer

  • Can set the number of elements in this buffer
  • Each CAM CCB and response added to a linked list
  • Circular buffer doesn’t grow after it’s full
  • Copes with low memory by dropping front of queue, or in extreme cases, the whole message (if there’s no way to get memory and we have no entries)
  • Circular buffer can be dumped from kernel core dump to piece together recent history
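
The policy in the bullets above can be modeled in a few lines of userland C. This is a sketch under assumptions, not the kernel implementation: `struct rec`, `struct ring`, `ring_add`, and the fixed 64-byte record size are all hypothetical, and `malloc` stands in for whatever allocator the kernel code uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define REC_BYTES 64			/* hypothetical fixed record size */

struct rec {
	struct rec	*next;
	size_t		 len;
	unsigned char	 data[REC_BYTES];
};

struct ring {
	struct rec	*head, *tail;	/* head is the oldest entry */
	size_t		 count, limit;	/* limit is the configured N */
	unsigned long	 lost;		/* new records dropped outright */
};

static void
ring_init(struct ring *r, size_t limit)
{
	memset(r, 0, sizeof(*r));
	r->limit = limit;
}

/* Unlink and return the oldest entry, or NULL if the list is empty. */
static struct rec *
ring_pop_front(struct ring *r)
{
	struct rec *e = r->head;

	if (e == NULL)
		return (NULL);
	r->head = e->next;
	if (r->head == NULL)
		r->tail = NULL;
	r->count--;
	return (e);
}

/* Append a record. Returns 0 if stored, -1 if it had to be dropped. */
static int
ring_add(struct ring *r, const void *buf, size_t len)
{
	struct rec *e = NULL;

	assert(len <= REC_BYTES);
	if (r->count < r->limit)
		e = malloc(sizeof(*e));	/* still growing toward the limit */
	if (e == NULL) {
		/* Full, or no memory: recycle the oldest entry instead. */
		e = ring_pop_front(r);
		if (e == NULL) {
			r->lost++;	/* nothing to recycle: drop it */
			return (-1);
		}
	}
	e->next = NULL;
	e->len = len;
	memcpy(e->data, buf, len);
	if (r->tail != NULL)
		r->tail->next = e;
	else
		r->head = e;
	r->tail = e;
	r->count++;
	return (0);
}
```

Adding five records to a ring with a limit of three leaves the three newest in the list, oldest at `head`. Walking the list from `head` gives the recent history in order.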

13 of 16

Minor CAM Improvements

  • Full refcounting and SMR for cam_sim
    • Allows async lookup of SIM by name
    • Need to convert periph and xpt structures too
    • Lifetime issues for some SIMs currently
  • Removed sim callout
    • Nothing used it in ages
  • Removed xpt_polled_action
    • periph_runccb handles polling automatically
  • Coccinelle found issues
    • Refining scripts to commit somewhere
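
The refcounting half of the cam_sim change can be illustrated with a generic hold/release pattern. This sketch uses C11 atomics in userland and hypothetical names (`struct sim`, `sim_alloc`, `sim_hold`, `sim_rele`, `sims_freed`). It is not the cam_sim code, and it does not model the SMR side at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted SIM object. */
struct sim {
	atomic_uint refs;
};

static int sims_freed;			/* for illustration only */

/* Allocate an object holding one reference for the creator. */
static struct sim *
sim_alloc(void)
{
	struct sim *s = malloc(sizeof(*s));

	if (s != NULL)
		atomic_init(&s->refs, 1);
	return (s);
}

/* Take an extra reference, for example after a lookup by name. */
static void
sim_hold(struct sim *s)
{
	assert(s != NULL);
	atomic_fetch_add(&s->refs, 1);
}

/* Drop a reference; the last one out frees the object. */
static void
sim_rele(struct sim *s)
{
	assert(s != NULL);
	if (atomic_fetch_sub(&s->refs, 1) == 1) {
		free(s);
		sims_freed++;
	}
}
```

With this pattern an asynchronous lookup keeps the object alive by holding a reference for as long as it uses the pointer, which is the kind of lifetime problem the bullets above mention.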

14 of 16

Camcorder status

  • Can collect in-kernel circular buffer of last N requests / replies
  • Can dump in-kernel circular buffer to userland
  • Can dump requests in real time
  • Needs better filtering -- can steal much from usbdump
  • Only CCB metadata transferred, not data associated with the transfers
  • Needs better formatting code (the current format is relatively raw)
  • Leaks a lot of kernel addresses (anything inside a CCB)
  • Need to register and define a proper type for PCAP file

15 of 16

Future Plans

  • Get PCAP media assigned to us
  • Add data display code to wireshark
  • Improve filtering to allow grabbing data from a single disk and/or different requests
  • Automate extraction of circular buffer from core dumps to a pcap file
  • Also allow filtering so that only the protocol sent to the drive is displayed (SCSI CDBs vs CAM CCBs)
  • Push SMR down into CAM’s periph and xpt drivers

16 of 16

Questions

Warner Losh

Netflix

wlosh@netflix.com