1 of 19

Unit Test Automation for ASP�with Ansunit

Adam Smith (amsmith@ucsc.edu)

Assistant Professor of Computational Media

UC Santa Cruz

UN Omaha ASP Seminar, November 19, 2021

2 of 19

About Me

Adam Smith (amsmith@ucsc.edu)Assistant Professor of Computational Media�UC Santa Cruz�https://adamsmith.as/

I started using ASP (via lparse/smodels) in 2008 after seeing Oliver Ray talk about using ASP in the context of a discovery system. He said, “It’s like Prolog, but actually declarative.”

I focus on using ASP in the design of interactive media (e.g. videogames):

  • generating level designs, rulesets, character descriptions, or fictional narratives
  • verifying gameplay properties
  • ...

I also help random people on the potassco-users mailing list:

  • playlist generation for DJs
  • software package management
  • multi-robot pathfinding
  • ...

Key ASP skills: performance optimization, automated testing, public deployment, saturation encodings

3 of 19

“Quantifying over play”

I often need to make sure my formal model of a game’s rules is 100% faithful to the pre-existing games rules.

In the educational puzzle games Refraction / Refraction 2, practicing mathematical reasoning in generated puzzles should be inescapable.

https://adamsmith.as/papers/fdg2013_shortcuts.pdf

4 of 19

LightForge Games�(https://www.lightforge.gg/)

They want to dynamically generate game worlds that include rooms containing items.

At the lowest level, the generator needs to place 2D bounding boxes in space managing adjacency, overlap, and containment.

At the highest level, the generator needs to create a fictionally-plausible world where the players are capable of experiencing a compelling interactive story.

5 of 19

geometry.lp (n-dim qualitative spatial reasoning engine)

Front-end:

geo_axis(x). geo_axis(y).�geo_size(128).�geo_point(p). geo_point(q).�geo_box(b). geo_box(c).

geo_property(point_in_box(p,b)).�:- not geo_holds(point_in_box(p,b)).

Back-end (one-hot encoding):

geo_point_position(p,x,3).

Back-end (binary encoding):

geo_point_position_bit(p,x,0).�geo_point_position_bit(p,x,1).

b

c

q

p

128

y

x

0,0

Front-end predicates are intended to used to designers to describe fictional worlds.

Back-end predicates only matter to engineers integrating the generator with larger systems.

6 of 19

Why maintain multiple encodings?

One-hot encoding:

  • Good: Easy write/explain/maintain rules
  • Good: Easy to interpret text output of solver
  • Good: Easy to express small test instances
  • Bad: Poor scalability wrt world size

Binary encoding:

  • Good: Great scalability for large worlds
  • Bad: opposite of everything good for the one-hot encoding!

With complimentary benefits and drawbacks, we’d like to maintain a geometry engine with support for both encodings (and write future code that is agnostic to the low-level representation).

What will give us the discipline to keep these synchronized?

7 of 19

Unit Test Automation

Concepts:

  • Test cases: Assemble a collection of tiny problem instances that are known to be SAT/UNSAT/OPTIMAL.
  • Test suites: Hierarchically organize your instances by the specific modules they are trying to test. Allow engineers to easily run just the tests they care about quickly.
  • “Unit”: One or more .lp files, almost all of them lacking choice rules (or even integrity constraints).

Workflows:

  • Test-during-development: Every time you make a change to one of your .lp files, quickly re-run your tests to see if your change had the desired impact and did not indirectly break anything downstream.
  • Test-driven-development: Write your tests first to document how the module should work.

8 of 19

unittest https://docs.python.org/3/library/unittest.html

This Python library provides base classes and helper methods that ease defining test cases. It also provides a runtime library for executing test and reporting results.

$ python -m unittest module_name

9 of 19

Ansunit https://github.com/rndmcnlly/ansunit

Ansunit provides a domain-specific language (DSL) for organizing unit tests and an an interpreter for running just the tests you need. (It’s built on Python’s unittest library.)

$ ansunit test.yaml -m point_axis_relation

Notes:

  • The overall file syntax is YAML, and ASP-fragments are encoded in string literals or files.
  • Keywords: Definitions, Modules, Program, Expect, Arguments, and Test ...
  • Expect: SAT by default

10 of 19

Symbolic execution for ARMv7-A machine instructions

Key predicates:

sig(T,Wire,BitIndex,Value)

rtl_device(Instance,Config,WirePorts)

Example usage:

sig(T,Out,B,1) :-� rtl_device(Instance,and(Width),ports(Ain,Bin,Out));� B = 0 .. Width - 1;� sig(T,Ain,B,1);� sig(T,Bin,B,1).

11 of 19

1200* lines of ASP, mostly RTL instantiating and wiring modules

rtl_device(mdr_byte(Instance), % instance name

mux(8,2), % instance configuration

ports(inputs(mdr_byte(Instance,0), % named wires

mdr_byte(Instance,1),

mdr_byte(Instance,2),

mdr_byte(Instance,3)),

mar_byte_index(Instance),

mdr_byte(Instance))) :-

rtl_device(Instance,

arm_processor(RegsTable),

ports(Tick,PC,IR,MemAddr,MemReadData,MemWriteData,MemWen)).

rtl_device( {sub-component} ) :- rtl_device( {component} ).

*The problem is not particularly large in SLOC, but large in terms of application scale: a 10k-gate circuit being simulated for 1000 timesteps.

12 of 19

A set of related tests can share modules, even from a bit of inline helper code.

Each individual test should be as simple and obvious as possible.

A 4-bit wide and gate should produce the expected outputs.

13 of 19

A 1-bit wide instantiation of the ripple_carry device should set the carry out flag appropriately.

A 4-bit instantiation of a larger alu should also set N/Z/C/V condition flags appropriately.

14 of 19

These are machine instructions generated by a C compiler.

The fixed machine code is tested with a few different sets of values (e.g. checking proper handling of signs/overflow).

The tests.yaml for the ARM simulator project contained ~1300 lines of declarative test code (more than CPU design).

15 of 19

Beyond “Generate, Define, Test”

The entire ASP codebase for ARM simulator project has just two choice rules, here’s one: (“an unknown device generates 0/1 values nondeterministically on every bit at every time”)

1 { sig(T,Out,B,(0;1)) } 1 :-� rtl_device(Instance,unknown(Width),ports(Out));� time(T);� B = 0..Width-1.

And two integrity constraints, here’s one: (“if a time_expect device expects some value on some wire at some time, then that condition needs to be satisfied”)

:- rtl_device(Inst,time_expect(Width,Value,T),ports(In)),� not condition_satisfied(Width,Value,In,T).

16 of 19

Writing tests that run quickly

  • Factor out choice rules into a separate module.
  • Try to avoid using that module in any of your tests.
  • Instead, hand-code facts representing choices that should/shouldn’t be valid.
  • Try to make it so almost all of your tests can be full resolved by the grounder.
  • Put slow / solver-oriented test in a separate suite or put a descriptive “slow” tag in the test name so that engineers can easily run the quick tests repeatedly.

17 of 19

Unexpected benefits of testing?

It’s not all about correctness.

Writing tests can save you time as well.

  1. [Quality] Modules are better setup for code reuse because code each module is used several different ways in the test suite.
  2. [Documentation] Tests function as documentation of expected semantics for others.
  3. [Performance] Tests help preserve functionality across performance-enhancing reformulations.

18 of 19

How to get started with Ansunit

Get it from PyPI:

$ pip install ansunit

Or directly from GitHub:

https://github.com/rndmcnlly/ansunit

Then… edit the library code because of assumptions about Python or Clingo that are no longer valid in 2021.

19 of 19

If time permits...

Adam, open /Users/adam/Dropbox/enlearn-inside/arm-processor/sketch_3

$ ansunit tests.yaml -l # lists available tests

$ ansunit tests.yaml -m alu -vv # run all "alu" test verbosely