1 of 21

Functional Random Stimulus Generators and Their Applications

Vighnesh Iyer

2 of 21

How is it usually done?

  • Let’s build a random transaction generator for an execution unit (ALU) in Python

import random

# assume ALUOP is an Enum of the ALU's opcodes (ADD, SUB, DIVIDE, XOR, ...)
random.seed(1)
txns = []
for i in range(100):
    alu_op = random.choice(list(ALUOP))
    # randint is inclusive on both ends, so 2**32 - 1 is the largest 32-bit value
    op1, op2 = (random.randint(0, 2**32 - 1), random.randint(0, 2**32 - 1))
    txns.append((alu_op, op1, op2))

  • Set a seed, call default random API, use the values given
    • Looks OK so far
    • But we want the best DUT coverage from these 100 transactions

3 of 21

Adding Ad-Hoc Constraints

  • One of the ALUOPs is “divide”
    • We only want to generate legal transactions: op == DIVIDE implies op2 != 0
  • Interested in specific cases for each ALUOP
    • May want to test overflow for ADD/SUB

alu_op = random.choice(list(ALUOP))
if alu_op == ALUOP.DIVIDE:
    # op2 starts at 1 so we never generate a divide-by-zero
    op1, op2 = (random.randint(0, 2**32 - 1), random.randint(1, 2**32 - 1))
elif alu_op in (ALUOP.ADD, ALUOP.SUB):
    if random.randint(0, 1) == 0:
        # force a 32-bit overflow: pick op2 so that op1 + op2 >= 2**32
        op1 = random.randint(1, 2**32 - 1)
        op2 = random.randint(2**32 - op1, 2**32 - 1)
    else:
        # default (uniform) distribution
        op1, op2 = (random.randint(0, 2**32 - 1), random.randint(0, 2**32 - 1))

4 of 21

What About Biasing?

  • We want to prioritize interesting cases
    • One-hot encoded operands, operands at limits of range
    • Different interesting cases for each op (e.g. overflow, carry)
    • Cases that depend on logic/arith operations on the ops themselves (a + b == 0)

if alu_op in (ALUOP.ADD, ALUOP.SUB):
    bias = random.randint(0, 99)
    if bias in range(0, 50):     # 50%: overflow
        op1 = random.randint(1, 2**32 - 1)
        op2 = random.randint(2**32 - op1, 2**32 - 1)
    elif bias in range(50, 90):  # 40%: operands at the limits of their range
        op1 = random.choice([0, 2**32 - 1])
        op2 = random.choice([0, 2**32 - 1])
    else:                        # 10%: default (uniform) distribution
        op1, op2 = (random.randint(0, 2**32 - 1), random.randint(0, 2**32 - 1))

5 of 21

Transaction to Transaction Relationships

  • Certain sequences of ALUOPs may expose interesting bugs
    • ADD, XOR: use same execution units (carry chain)
    • ADD, SUB: use same adder circuit
    • AND, NAND: use same logic reduction chain
  • Bias the next ALUOP based on the last transaction

prev_alu_op = txns[-1][0]
alu_op_bias = random.randint(0, 99)
if prev_alu_op == ALUOP.ADD:
    if alu_op_bias in range(0, 10):     # 10%: ADD -> XOR (shared carry chain)
        alu_op = ALUOP.XOR
    elif alu_op_bias in range(10, 99):  # 89%: ADD -> SUB (shared adder)
        alu_op = ALUOP.SUB
    else:                               # 1%: fall back to a uniform choice
        alu_op = random.choice(list(ALUOP))

6 of 21

Adding Random Decision Instrumentation

  • We want to know how the generator arrived at a random decision
    • What sequence of random queries led to alu_op == XOR?
  • Are there subtrees of the generator that weren’t explored?

from inspect import getframeinfo, stack

def randint(start, stop) -> int:
    value = random.randint(start, stop)
    # log the value and the call site in the generator that requested it
    print("got value {} from here {}".format(
        value,
        getframeinfo(stack()[1][0])  # stack()[1] is the caller's frame
    ))
    return value

[Diagram: tree of random decisions with hit counts, e.g. after prev_alu_op == ADD,
op1 == 2**32-1 / op1 == 0 / op1 in (0, 2**32-1) were hit 10 / 20 / 70 times, and
alu_op == ADD / SUB / XOR were chosen 4 / 8 / 2 times]

7 of 21

Adding Coverage

  • Did we test all extreme values of op1/op2 for every ALUOP?
  • Did we test every combination of one ALUOP followed by every other one?
  • Coverage of random stimulus vs generator itself

import itertools
from typing import List, Tuple

def cover(txns: List[Tuple[ALUOP, int, int]]):
    op1 = [x[1] for x in txns]
    extreme_values = (0 in op1, 2**32 - 1 in op1)
    combos = itertools.permutations(list(ALUOP), 2)
    combos = {x: False for x in combos}
    # iterate over adjacent transactions with a sliding window, mutate dict
    for prev, cur in zip(txns, txns[1:]):
        if (prev[0], cur[0]) in combos:
            combos[(prev[0], cur[0])] = True
    print(extreme_values, combos)

8 of 21

What’s the Problem?

  • This works, so why change it?
    • Ad-hoc randomness
      • uncontrolled seed propagation
    • Mixing of concerns (coverage, instrumentation, biasing, sequences)
    • Not composable
      • Can’t test a sub-generator in isolation
      • Can’t inject fixed ‘random’ values for testing
    • Not introspectable
      • Can’t inspect or evaluate the ‘tree’ of random decisions
      • Coverage is hacky
    • Runtime/interpreter not configurable (eager evaluation)
  • Is there a better abstraction?
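The composability and injectability the bullets above ask for can be sketched in plain Python (the `Rand` and `FixedSource` classes below are invented for illustration, not an existing library): a generator becomes an inert value that only draws numbers when run, so any sub-generator can be tested in isolation and a fixed 'random' source can be swapped in.

```python
import random

class Rand:
    """A lazy generator: wraps a function from an RNG to a value."""
    def __init__(self, run):
        self.run = run  # run: rng -> value

    @staticmethod
    def range(start, stop):
        # draw an integer in [start, stop] when eventually run
        return Rand(lambda rng: rng.randint(start, stop))

    def flat_map(self, f):
        # feed this generator's value into f, which returns the next generator
        return Rand(lambda rng: f(self.run(rng)).run(rng))

class FixedSource:
    """Fake RNG that replays fixed values (e.g. read from a fuzzer's corpus file)."""
    def __init__(self, values):
        self.values = iter(values)
    def randint(self, start, stop):
        return next(self.values)

# op2's range depends on op1, yet nothing is drawn until .run() is called
pair = Rand.range(0, 2**32 - 1).flat_map(
    lambda op1: Rand.range(0, op1).flat_map(
        lambda op2: Rand(lambda rng: (op1, op2))))
```

`pair.run(random.Random(1))` is reproducible from the seed, while `pair.run(FixedSource([7, 3]))` deterministically yields (7, 3), which is exactly the "inject fixed values for testing" use case.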

9 of 21

Separation of Description from Interpretation

  • Fundamental functional principle:
    • Separate the description of what you want to do (data) from how it is done (interpreter)
    • Enables swapping and composition of interpreters
  • e.g. Keras

from tensorflow import keras
from tensorflow.keras import layers
# in recent TF releases the preprocessing layers live directly in keras.layers
from tensorflow.keras.layers import CenterCrop, Rescaling

inputs = keras.Input(shape=(None, None, 3))
x = CenterCrop(height=150, width=150)(inputs)
x = Rescaling(scale=1.0 / 255)(x)
x = layers.Conv2D(filters=32, kernel_size=(3, 3), activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(3, 3))(x)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(num_classes, activation="softmax")(x)  # num_classes defined elsewhere
model = keras.Model(inputs=inputs, outputs=outputs)

The in-memory Model is the description; it can then be handed to different interpreters: Summary, Compile/Train, or ONNX export.

10 of 21

Constrained Random for Chisel

  • Constrained random as a Scala API - use Scala as metaprogramming layer
  • Avoid pitfalls of SystemVerilog (weak typing, poor metaprogramming support)
    • a < b < c won’t compile due to type mismatch
    • .rand() returns Either[Error, Bundle], forced to consider randomization failure

case class A(width: Int) extends Bundle {
  val x = UInt(width.W)
  val y = UInt(width.W)
}

val aProto = A(8)
val randBundles = Seq.tabulate(10) { i => aProto.rand {
  (b: A) => (b.x &+ b.y === (i + 1).U) && (b.x =/= 0.U) && (b.y =/= 0.U)
}}

// result:
List(
  Left(Unsat()), // i = 0: x + y == 1 is unsatisfiable with x, y both nonzero
  Right(A(x=UInt<8>(1), y=UInt<8>(1))), // more bundle literals...
  Right(A(x=UInt<8>(2), y=UInt<8>(8)))
)

11 of 21

Issues with Chisel Constrained Random

  • Core API:
    • Bundle().rand{ (b: Bundle) => Bool }
  • The API is eager
    • can’t chain constraints, turn constraints on/off dynamically
    • can’t add biases for unconstrained values
    • can’t specify solving order
    • staged randomness is clunky
  • Look to SV/UVM for inspiration

12 of 21

Randomization in SystemVerilog/UVM

  • Description of constraints is separated from random generation
    • Baked into language primitives
    • Supports many datatypes (logic, enum, string, arrays) and constraint types
    • Shoots for a uniform distribution if unspecified (“solve before” to resolve bidirectional issues)

class packet;
  rand bit[3:0] addr;
  rand bit[3:0] addr2;

  constraint addr_range {
    addr dist {2 :/ 5, [10:12] :/ 8};
  }
  constraint unq {unique {addr, addr2};}
endclass

initial begin
  packet p;
  p = new();
  p.randomize();
end

13 of 21

What’s Missing?

  • Built-in metaprogramming
    • e.g. constraints for an instgen may come from an XML file
  • Ability to control the interpreter
    • The simulator understands the semantics of your random generator, but you can’t use it outside the simulator itself
  • What does interpreter control enable?
    • Coverage collection of generated stimulus
    • Meta-analysis of the random generator (randomization graph)
      • Stimulus embedding
    • Generator decision logging (i.e. instrumentation)
    • Injection of custom random sources
      • e.g. fixed values from a file for fuzzing
    • Automated constraint tuning
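One of the payoffs listed above, generator decision logging, can be sketched by interposing on the random source instead of editing the generator itself (the `LoggingSource` class and its `label` argument are invented for this sketch; they assume the generator tags each query):

```python
import random

class LoggingSource:
    """Wraps a real RNG and records every decision the generator makes."""
    def __init__(self, rng):
        self.rng = rng
        self.log = []  # list of (label, value) pairs, in decision order

    def randint(self, label, start, stop):
        value = self.rng.randint(start, stop)
        self.log.append((label, value))
        return value

# a generator written against this source yields a full decision trace for free
src = LoggingSource(random.Random(1))
alu_op = src.randint("alu_op", 0, 7)    # which of 8 opcodes
op1 = src.randint("op1", 0, 2**32 - 1)  # first operand
```

After a run, `src.log` is the ordered trace of how the stimulus was derived, which is the clean version of the ad-hoc `print`/`stack()` instrumentation shown earlier.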

14 of 21

A Functional Random API

  • Define Random[T]
    • a datatype that describes how to generate a T given a source of randomness
    • a lazy container that is only “evaluated” when forced to
  • Random[T] has several primitives
    • Random[UInt].range(start, stop)
    • Random[UInt].dist(Map[Int -> Int])
    • Random[UInt].distRange(Map[Range -> Int])
    • Random[UInt].any
  • Random[T] is a monadic datatype (it can be composed using flatMap)

case class Example(x: UInt, y: UInt) extends Bundle

val rand: Random[Example] = for {
  x <- Random[UInt].range(1, 10)
  y <- Random[UInt].range(x, x * 2) // y's range depends on the value drawn for x
} yield Example(x, y)
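A `dist` primitive like the one listed above can be sketched in plain Python (this `dist` function is illustrative only, mimicking a `Random[UInt].dist(Map[Int -> Int])` where the map sends values to relative weights):

```python
import random

def dist(weights):
    """Draw one key with probability proportional to its weight."""
    values = list(weights.keys())
    return random.choices(values, weights=[weights[v] for v in values], k=1)[0]

random.seed(0)
# weights modeled on the earlier SV example: value 2 gets weight 5, 10..12 get 8 each
samples = [dist({2: 5, 10: 8, 11: 8, 12: 8}) for _ in range(1000)]
```

With these weights, roughly 5/29 of the samples should be 2 and the rest spread evenly over 10..12; in the lazy API this draw would only happen when the enclosing Random[T] is finally run.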

15 of 21

A Functional Random API (Cont)

  • Random[T <: Data].constraint
    • Works like the existing constrained random Chisel API, except lazy
  • Compose Random until you reach the top-level stimulus type
    • Random[Seq[ALUTransaction]]
    • Then call .run() to unwrap the computation
  • Still a work in progress
    • Unified API to randomize hardware (UInt, Vector, Bundle) and software (Int, Seq, Map) datatypes
    • Missing a way to dynamically turn on/off constraints
      • Maybe split data hierarchy into Constraint[T] and Random[T]
    • Missing a coverage API
    • No built-in instrumentation in the interpreter yet

16 of 21

Applications

  • WIP
    • Using this API to build a TileLink traffic generator
    • Explore ways to inline a functional model to make generator decisions
  • The goal
    • Use this API to build a rv64ui instgen with functional ISA model integration
    • Derive random decision graphs from the generator
    • Use a graph embedding to train a coverage prediction model
  • References
    • Property Testing Libraries: QuickCheck (Haskell), ScalaCheck (Scala)
      • inspiration for Random[T] datatype
    • Random datatype functional implementation
      • Scala: https://github.com/NICTA/rng

17 of 21

What Do We Need?

  • Staging of random generators
    • Tree-like structure of generators that depend on upstream randomness
    • Initial seed setting and deterministic seed propagation
  • Distribution biasing
  • Integration of arbitrary logic constraints
  • Staging of random state that affects the control flow of subsequent decisions

18 of 21

Prior OSS Work

CRAVE and pyVSC

19 of 21

Randomness in Functional Programming

How is randomness done in purely functional languages?

Purely functional = no side effects, total functions, functions are referentially transparent

  • Make randomness an effect that is threaded up to the top-level function
  • The IO interpreter injects the initial seed, and the Random datatype chains the seed (via monadic flatMap) as new random values are computed
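This seed chaining can be sketched in Python with a pure pseudo-random step (the LCG constants and function shapes are illustrative, not from any particular library): every function returns the next seed instead of mutating hidden state, so the same seed always yields the same value.

```python
def next_int(seed):
    """Pure PRNG step: returns (value, next_seed), no side effects.
    (Linear congruential generator; constants are illustrative.)"""
    new_seed = (1664525 * seed + 1013904223) % 2**32
    return new_seed, new_seed

def pair(seed):
    # the seed is threaded explicitly: op2's draw consumes the seed op1's draw produced
    op1, seed1 = next_int(seed)
    op2, seed2 = next_int(seed1)
    return (op1, op2), seed2

print(pair(1) == pair(1))  # True: referentially transparent
```

A monadic Random datatype hides exactly this explicit seed plumbing behind flatMap.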

20 of 21

Monads and Function Composition

What is a monad?

  • A type-specific generalization of function application and composition: flatMap applies a function to the value(s) inside a context and flattens the result
  • e.g. Optional (a value that may be missing), Either (a value or an error)
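As an illustration (a toy `Maybe` written for this slide, not a standard-library type): flatMap chains functions that each might fail, without checking for the missing case at every step.

```python
class Maybe:
    """Toy Optional: holds a value, or nothing."""
    def __init__(self, value, present):
        self.value, self.present = value, present

    @staticmethod
    def just(v):
        return Maybe(v, True)

    @staticmethod
    def nothing():
        return Maybe(None, False)

    def flat_map(self, f):
        # apply f only if a value is present; otherwise short-circuit
        return f(self.value) if self.present else self

def safe_div(a, b):
    return Maybe.just(a // b) if b != 0 else Maybe.nothing()

result = Maybe.just(100).flat_map(lambda x: safe_div(x, 5)) \
                        .flat_map(lambda x: safe_div(x, 0))
print(result.present)  # False: the divide-by-zero short-circuited the chain
```

The Random monad works the same way, except the "context" it threads is the seed rather than presence/absence.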

21 of 21

The Random Monad (in software)

scalaz example

Haskell example