1 of 18

Stimela primer

INAF-IRA Bologna July 2025

Presenter: B.V. Hugo

bhugo@sarao.ac.za

2 of 18

Stimela

Etymology

From Zulu isitimela, from English steamer.

Noun

train

3 of 18

A containerized modular workflow management system

  • Originally developed by Sphe Makhathini (now at Wits University)
  • Deploying a mixed bag of radio astronomy software is difficult, often because of clashing dependencies. There is no standard set: not even CASA or AIPS has all the tools needed for an end-to-end 3GC MeerKAT workflow. Some effort has gone into shipping standard C++ / Fortran tooling libraries for Ubuntu LTS (KERN suite), but the LTS release cycle is typically too slow for much of the common Python-based tooling, and installations are difficult to reproduce between LTS versions.
  • The goal was to make radio astronomy workflows reproducible by shipping a standard library of versioned images for commonly used software, along with a scripting language to combine them into a single workflow (similar to CASA, but flexible enough to accommodate non-NRAO tooling).
  • The original version is in use in, e.g., CARACal. It was a good first step, but there were issues with the original incarnation:
    • No branching logic inside a recipe. If a step (referred to as a “cab”) was to be run only depending on the output of another step (for example, selecting calibration steps based on tags in a dataset), one would have to run multiple sub-recipes with some Python logic in between. HARD TO READ!
    • No looping logic
    • Input and output names were hard-coded using Pythonic logic: easy to make mistakes, hard to develop a many-step script!

4 of 18

Core elements of Stimela

  • A YAML-based workflow management tool (similar to CWL, but tailored to the nitty-gritty of radio astronomy packages, including our large databases that are modified in place).
  • Each workflow step runs a defined task for a particular piece of software (a “cab” in Stimela-speak). The step is executed in Apptainer or, in development mode, in virtualenvs.
  • A standard library of curated, versioned images (created with Docker and published on the quay.io container registry), with argument definition files to go along with the images. These are shipped in a sister package called cult-cargo.
  • With v2 it is possible to have multiple standard libraries. For instance, Tim Molteno created a standard library for the TART telescope: TART cargo.
  • Supports native runtimes for SLURM and Kubernetes (K8s); autoscaling workflows have been tested with Amazon AWS S3*.

* Software IO layers need native S3 support, which the MSv2 CTDS does not have. A new IO interoperability layer is under active development: https://github.com/ratt-ru/xarray-ms

5 of 18

Today’s prac

On Linux or Windows Subsystem for Linux:

pip install cult-cargo==0.2.1rc2 stimela==2.1.4rc2

pip install tart_cargo

6 of 18

Hello World!

cabs:
  greeter:
    flavour:
      kind: python-code
      interpreter_binary: python3
    command: |
      print(f'Hello {who}!')
    info: Simple cab to greet someone in Python
    inputs:
      who:
        dtype: str
        info: Who to greet

myrecipe:
  info: A simple hello-world recipe
  steps:
    greet:
      cab: greeter
      params:
        who: INAF

Run:

stimela doc example1.yml

stimela doc example1.yml greeter

stimela run example1.yml myrecipe

Try out: stimela -b native run example1.yml myrecipe
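
As a rough mental model of the python-code flavour, the cab's inputs become variables that the command snippet can see. The sketch below mimics that in plain Python. It is a conceptual illustration only, not Stimela's implementation, and the helper name run_python_code_cab is made up for this example.

```python
import contextlib
import io

# The cab's command block and the step parameters, copied from the example.
command = "print(f'Hello {who}!')"
params = {"who": "INAF"}


def run_python_code_cab(command: str, params: dict) -> str:
    """Hypothetical helper: run the snippet with params as variables.

    Returns whatever the snippet printed to stdout.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Each parameter name becomes a global variable for the snippet.
        exec(command, dict(params))
    return buffer.getvalue()


output = run_python_code_cab(command, params)
print(output, end="")
```

Changing params to {"who": "Italy"} changes the greeting without touching the command, which is the point of separating cab definitions from step parameters.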

7 of 18

Stimela attempts to prevalidate inputs according to cab (software) parameter definitions

Stimela handles the backend automatically (here Apptainer/Singularity). Depending on the configured backend, this could also be SLURM or K8s on AWS.

8 of 18

Adding arguments

cabs:
  ...

myrecipe:
  info: A simple hello-world recipe
  inputs: # CMD arguments
    who:
      dtype: str
      info: Who to greet
      default: INAF
  steps:
    greet:
      cab: greeter
      params:
        who: =recipe.who

Run:

stimela doc example2.yml

stimela run example2.yml myrecipe who=Italy

9 of 18

_include

  • Can include cab definitions from a separate file (recommended)
  • Final environment is a hierarchy of yaml dictionaries
  • Let’s move our cab definitions to example_cabs.yml
  • Can include multiple sub-recipes / cabs this way!

_include:
  - example_cabs.yml

myrecipe:
  info: A simple hello-world recipe
  inputs: # CMD arguments
    who:
      dtype: str
      info: Who to greet
      default: INAF
  steps:
    greet:
      cab: greeter
      params:
        who: =recipe.who

Run:

stimela doc example_cabs.yml greeter

stimela run example3.yml myrecipe
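
The idea that the final environment is a hierarchy of YAML dictionaries can be pictured as a recursive merge of nested dicts, with the including file layered on top of the included one. The sketch below is a toy version under that assumption. The function deep_merge is hypothetical, and Stimela's actual merge rules may differ in detail.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Toy recursive merge: nested dicts are combined, other values overridden."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Stand-ins for the parsed contents of example_cabs.yml and example3.yml.
included = {"cabs": {"greeter": {"info": "Simple cab to greet someone in Python"}}}
recipe_file = {"myrecipe": {"info": "A simple hello-world recipe"}}

environment = deep_merge(included, recipe_file)
print(sorted(environment))
```

Both the cab definitions and the recipe end up as sections of one dictionary, which is why the recipe can refer to greeter even though it is defined in another file.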

10 of 18

Chaining multiple steps

  • Create a new greeter to write names into a file in our cabs definition file example_cabs.yml

filegreeter:
  flavour:
    kind: python-code
    interpreter_binary: python3
  command: |
    with open(outfile, filemode) as fo:
        fo.write(f'Hello {who}!')
  info: Simple cab to greet someone in a file
  inputs:
    who:
      dtype: str
      info: Who to greet
    filemode:
      dtype: str
      info: Write mode
      choices:
        - 'w+'
        - a
      default: 'w+'
  outputs:
    outfile:
      dtype: File
      info: Create a file with greeter output
      default: output.txt
      required: true
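
The dtype, choices, default and required fields of a cab definition can be read as validation rules that are checked before anything runs. The sketch below is a toy validator over a hand-written schema that mirrors the filegreeter parameters. SCHEMA and prevalidate are hypothetical names, and this is not how Stimela itself implements validation.

```python
# Hand-written stand-in for the filegreeter parameter definitions.
SCHEMA = {
    "who": {"dtype": str, "required": True},
    "filemode": {"dtype": str, "choices": ["w+", "a"], "default": "w+"},
    "outfile": {"dtype": str, "default": "output.txt"},
}


def prevalidate(schema: dict, params: dict) -> dict:
    """Fill in defaults and reject bad parameters before running anything."""
    unknown = set(params) - set(schema)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    resolved = {}
    for name, spec in schema.items():
        if name in params:
            value = params[name]
        elif "default" in spec:
            value = spec["default"]
        elif spec.get("required"):
            raise ValueError(f"missing required parameter: {name}")
        else:
            continue
        if not isinstance(value, spec["dtype"]):
            raise TypeError(f"{name} must be of type {spec['dtype'].__name__}")
        if "choices" in spec and value not in spec["choices"]:
            raise ValueError(f"{name} must be one of {spec['choices']}")
        resolved[name] = value
    return resolved


resolved = prevalidate(SCHEMA, {"who": "INAF"})
print(resolved)
```

Passing filemode="x" or leaving out who raises an error immediately, rather than after an expensive earlier step has already run.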

11 of 18

Chaining multiple steps

  • Add a printing cab definition

printer:
  command: /usr/bin/cat
  info: Wraps standard BASH cat utility
  inputs:
    filenames:
      dtype: Union[File, List[File]]
      must_exist: true
      policies:
        positional: true
        repeat: list
      info: Files to concatenate with standard BASH utility cat
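
One way to read the policies above: positional means the values are passed without an option name, and repeat: list means each file becomes its own argument. The sketch below builds the resulting argument vector under that reading. build_command is a hypothetical helper and the file names are made up for illustration.

```python
import shlex
from typing import List, Union


def build_command(command: str, filenames: Union[str, List[str]]) -> List[str]:
    """Append one positional argument per file after the command."""
    if isinstance(filenames, str):
        # Union[File, List[File]]: a single file is treated as a one-item list.
        filenames = [filenames]
    return [command, *filenames]


argv = build_command("/usr/bin/cat", ["a.txt", "b.txt"])
print(shlex.join(argv))
```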

12 of 18

Chaining multiple steps

  • Change the recipe to use the filegreeter and the printer
  • Note how we chain output to input here!
  • Other options exist, such as skipping a step if its outputs already exist

_include:
  - example_cabs.yml

myrecipe:
  info: A simple hello-world recipe
  inputs: # CMD arguments
    who:
      dtype: str
      info: Who to greet
      default: INAF
  steps:
    greet:
      cab: filegreeter
      params:
        who: =recipe.who
    shownames:
      cab: printer
      params:
        filenames:
          - =recipe.steps.greet.outfile

Run:

stimela doc example_cabs.yml filegreeter printer

stimela run example4.yml myrecipe
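
Chaining works because values starting with = are looked up in the recipe's namespace instead of being taken literally. The sketch below shows only the simplest case, a dotted lookup in a nested dict. resolve and the namespace layout are assumptions for illustration, and real Stimela formulas support far more than this.

```python
def resolve(value, namespace: dict):
    """Toy resolver: '=a.b.c' is looked up in namespace, anything else is literal."""
    if not (isinstance(value, str) and value.startswith("=")):
        return value
    node = namespace
    for part in value[1:].split("."):
        node = node[part]
    return node


# Assumed shape of the namespace once the greet step knows its output name.
namespace = {
    "recipe": {
        "who": "INAF",
        "steps": {"greet": {"outfile": "output.txt"}},
    }
}

greet_who = resolve("=recipe.who", namespace)
printer_input = resolve("=recipe.steps.greet.outfile", namespace)
print(greet_who, printer_input)
```

The printer step therefore receives whatever file name the greet step was given, so renaming the output in one place keeps the chain intact.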

13 of 18

Loops

  • Loops are all or nothing: the entire current recipe is looped over a set of values (think of how this can be applied to process spectral windows in parallel)
  • Loops can be scattered to run in parallel on one node (or across multiple nodes under SLURM / K8s)
  • A recipe can be split into sub-recipes to scatter only part of it

Run:

stimela run example5.yml myrecipe who=\[INAF,SARAO\]

_include:
  - example_cabs.yml

myrecipe:
  info: A simple hello-world recipe
  inputs: # CMD arguments
    who:
      dtype: Union[str,List[str]]
      info: Who to greet
      default: INAF
  for_loop:
    var: whoi
    over: who
  steps:
    greet:
      cab: filegreeter
      params:
        who: =recipe.whoi
    shownames:
      cab: printer
      params:
        filenames:
          - =recipe.steps.greet.outfile
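
Conceptually, the for_loop section runs the whole recipe body once per value of who, binding each value to whoi. The sketch below imitates that, including an optional parallel mode using a thread pool. run_recipe_once, run_for_loop and the scatter argument are names chosen for this sketch, not Stimela's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Union


def run_recipe_once(whoi: str) -> str:
    """Stand-in for one pass through all the recipe's steps."""
    return f"Hello {whoi}!"


def run_for_loop(over: Union[str, List[str]], scatter: int = 1) -> List[str]:
    """Run the recipe body once per value, optionally in parallel."""
    values = over if isinstance(over, list) else [over]
    if scatter > 1:
        with ThreadPoolExecutor(max_workers=scatter) as pool:
            # map() returns results in input order, even if runs overlap.
            return list(pool.map(run_recipe_once, values))
    return [run_recipe_once(value) for value in values]


results = run_for_loop(["INAF", "SARAO"], scatter=2)
print(results)
```

Because every iteration is a complete, independent pass, iterations that write to the same output file would overwrite each other. In practice the output name should depend on the loop variable.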

14 of 18

Branching

  • Steps support conditional skipping (‘skip’ and ‘skip_if_outputs’ conditional checking)
  • Many built-in functions are available; see “Substitutions and formulas” in the documentation

_include:
  - example_cabs.yml

myrecipe:
  info: A simple hello-world recipe
  inputs: # CMD arguments
    who:
      dtype: str
      info: Who to greet
      default: INAF
  steps:
    greet:
      cab: greeter
      params:
        who: =recipe.who
      skip: =recipe.who=='INAF'
    greetVIP:
      cab: greeter
      params:
        who: ='VIP '+recipe.who
      skip: =recipe.who!='INAF'

Run:

stimela run example6.yml myrecipe who=INAF

stimela run example6.yml myrecipe who='Some Oak'
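
The two skip conditions are complementary, so exactly one of the two steps runs for any value of who. The sketch below spells out that logic in plain Python. The plan function is hypothetical and only mirrors the conditions written in the recipe.

```python
from typing import List


def plan(who: str) -> List[str]:
    """Return the greetings produced by the steps that are not skipped."""
    steps = [
        # greet: skipped when who == 'INAF'
        {"name": "greet", "who": who, "skip": who == "INAF"},
        # greetVIP: skipped when who != 'INAF'
        {"name": "greetVIP", "who": "VIP " + who, "skip": who != "INAF"},
    ]
    return [f"Hello {step['who']}!" for step in steps if not step["skip"]]


print(plan("INAF"))
print(plan("Some Oak"))
```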

15 of 18

Bookkeeping configuration

  • Stimela can be configured to keep detailed logs of runs (probably one of the more useful options!)
  • Here we configure a hierarchy of logs - one set for each invocation with the last set symlinked as “logs/log”
  • Further backend options (e.g. K8s) set / imported similarly

opts:
  log:
    dir: logs/log-{config.run.datetime}
    name: log-{info.fqname}
    nest: 2
    symlink: log
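
The resulting layout is one timestamped directory per invocation, with a symlink that always points at the most recent one. The sketch below reproduces just that part in a temporary directory. start_run_logs and the timestamp format are assumptions for illustration, the nest option is not modelled, and symlink creation assumes a Linux-like filesystem.

```python
import os
import tempfile
from pathlib import Path


def start_run_logs(root: Path, stamp: str) -> Path:
    """Create root/log-<stamp> and repoint the root/log symlink at it."""
    run_dir = root / f"log-{stamp}"
    run_dir.mkdir(parents=True)
    link = root / "log"
    if link.is_symlink():
        link.unlink()
    # Relative link, so the logs directory can be moved as a whole.
    link.symlink_to(run_dir.name)
    return run_dir


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "logs"
    start_run_logs(root, "20250701-120000")
    start_run_logs(root, "20250702-090000")
    latest = os.readlink(root / "log")

print(latest)
```

Older runs stay on disk under their own timestamp, while logs/log always gives quick access to the latest invocation.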

16 of 18

Cult-cargo

_include:
  - (cultcargo):
      - wsclean.yml
      - casa/calibration.yml

17 of 18

Demos

Full example recipe for CASA-based reduction of Lunar data here:

https://gist.github.com/bennahugo/b6b8c4c830301a97c1ffd41fa5b7ee7e

TART telescope example here (requires installing tart_cargo using pip):

https://drive.google.com/file/d/14kheaS5O-KGAFKRHclOMoGu03MzX3TzM/view?usp=drive_link

18 of 18

Further information

Reasonably detailed documentation is available here:

https://stimela.readthedocs.io/en/latest/fundamentals/fundamentals.html