Recent Advances in ReFrame
8th EasyBuild User Meeting
Vasileios Karakasis (NVIDIA) and Theofilos Manitaras (CSCS)
April 24, 2023
Summary
ReFrame
ReFrame is a powerful framework that enables system and performance testing as code, with features unique to HPC.
ReFrame community
New development workflow since 4.0
See also the updated contribution guide: https://github.com/reframe-hpc/reframe/wiki/contributing-to-reframe
Changes in ReFrame 4 – Dropped features
All features deprecated in the 3.x versions have been dropped.
More here: https://reframe-hpc.readthedocs.io/en/stable/whats_new_40.html#dropped-features-and-deprecations
Changes in ReFrame 4 – New features
The configuration can now be split across multiple files
site_configuration = {
    'systems': [
        {
            'name': 'tresa',
            'descr': 'My Mac',
            'hostnames': ['tresa'],
            'modules_system': 'nomod',
            'partitions': [
                {
                    'name': 'default',
                    'scheduler': 'local',
                    'launcher': 'local',
                    'environs': ['gnu'],
                }
            ]
        }
    ],
    'environments': [
        {
            'name': 'clang',
            'cc': 'clang',
            'cxx': 'clang++',
            'ftn': '',
            'target_systems': ['tresa']
        },
    ]
}
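For instance, the 'environments' section could live in a file of its own, with ReFrame assembling the final configuration from all the files it is given. The sketch below assumes hypothetical file names, that -C may be repeated, and that RFM_CONFIG_FILES accepts a colon-separated list, as introduced in 4.0:

```shell
# Load the base system configuration plus a separate environments file;
# later files extend or override the earlier ones.
reframe -C config/tresa.py -C config/environs.py -l

# Equivalently, via the environment (colon-separated list):
export RFM_CONFIG_FILES=config/tresa.py:config/environs.py
reframe -l
```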
Changes in ReFrame 4 – New features (cont'd)
Changes in ReFrame 4 – New features (cont'd)
New test naming scheme
- osu_allreduce_test %mpi_tasks=16 /03d6f48f @generic:default+builtin
^build_osu_benchmarks ~generic:default+builtin 'osu_binaries /5cf701b0 @generic:default+builtin
^fetch_osu_benchmarks ~generic 'osu_benchmarks /9fc7952e @generic:default+builtin
Anatomy of a test name:
- Test or fixture name: osu_allreduce_test, ^build_osu_benchmarks (^ marks a fixture)
- Test parameters: %mpi_tasks=16
- Test hash: /03d6f48f
- Test case info: @generic:default+builtin
- Fixture scope: ~generic
- Fixture variable name: 'osu_binaries
Changes in ReFrame 4 – New features (cont'd)
Old but gold – System/Environment features
Extended syntax for valid_systems and valid_prog_environs (since 3.11)
'partitions': [
    {
        'name': 'mypart',
        'environs': ['myenv', ...],
        'features': ['gpu', 'ib'],
    },
    ...
],
'environments': [
    {
        'name': 'myenv',
        'features': ['cuda', 'mpi'],
        'extras': {'mpi_kind': 'mpich'}
    },
    ...
]
Example config
# AND features
valid_systems = ['+gpu +ib']
# OR features
valid_systems = ['+gpu', '+ib']
# NOT features
valid_prog_environs = ['-cuda']
# Select extras
valid_prog_environs = ['%mpi_kind=mpich']
Test syntax
More in Theo's part
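The selection semantics can be sketched in plain Python. This is a simplified model for illustration only, not ReFrame's actual implementation:

```python
# Simplified model of the +/-/% selection semantics
# (illustration only; not the framework's actual code).

def matches(spec, features, extras=None):
    '''True if an entity with `features`/`extras` satisfies one spec.'''
    extras = extras or {}
    for term in spec.split():
        if term.startswith('+'):      # feature must be present
            if term[1:] not in features:
                return False
        elif term.startswith('-'):    # feature must be absent
            if term[1:] in features:
                return False
        elif term.startswith('%'):    # extras key=value must match
            key, _, val = term[1:].partition('=')
            if str(extras.get(key)) != val:
                return False

    return True

def is_valid(specs, features, extras=None):
    # Terms within one spec are AND-ed; the list of specs is OR-ed
    return any(matches(s, features, extras) for s in specs)

print(is_valid(['+gpu +ib'], {'gpu'}))   # False: 'ib' is missing
print(is_valid(['+gpu', '+ib'], {'gpu'}))  # True: OR across specs
print(is_valid(['-cuda'], {'mpi'}))        # True: 'cuda' is absent
print(is_valid(['%mpi_kind=mpich'], set(), {'mpi_kind': 'mpich'}))  # True
```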
Old but gold – Command-line options
E.g.: ci_extras = {'gitlab': {'only': {'refs': ['merge_requests']}}}
Old but gold – Execution modes
'modes': [
    {
        'name': 'ping_perf',
        'options': [
            '-c tests/ping.py',
            '-S clients=1',
            '-S interval=100',
            '-n ping_test',
            '--exec-order=uid',
            '--performance-report',
            '--exec-policy=serial',
            '--keep-stage-files'
        ]
    }
]
Example config
reframe --mode=ping_perf -r
reframe --mode=ping_perf -S clients=10 -r
reframe --mode=ping_perf -S foo=bar -r
reframe --mode=ping_perf --exec-policy=async -r
Old but gold – Programmable configuration
ReFrame's configuration file is essentially a Python module
Old but gold – Programmable configuration (cont'd)
Custom launcher

from reframe.core.backends import register_launcher
from reframe.core.launchers import JobLauncher

@register_launcher('slrun')
class MySmartLauncher(JobLauncher):
    def command(self, job):
        return ['slrun', '-n', str(job.num_tasks), ...]

site_configuration = {
    'systems': [
        {
            'name': 'my_system',
            'partitions': [
                {
                    'name': 'my_partition',
                    'launcher': 'slrun',
                    ...
                }
            ]
        }
    ]
}

Custom record formatter

import json

def prepend_prefix(record, extras, ignore_keys):
    json_record = {}
    for k, v in record.__dict__.items():
        if not k.startswith('_') and k not in ignore_keys:
            json_record[f'my_{k}'] = v

    return json.dumps([json_record])

site_configuration = {
    'logging': [{
        'handlers_perflog': [{
            'type': 'httpjson',
            'url': 'http://elastic_server/',
            'level': 'info',
            'json_formatter': prepend_prefix
        }]
    }]
}
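The formatter's effect can be checked in isolation with a dummy record; DummyRecord below is a made-up stand-in for a ReFrame log record, for illustration only:

```python
import json

def prepend_prefix(record, extras, ignore_keys):
    # Same formatter as above: keep public attributes not in
    # ignore_keys and prefix each key with 'my_'
    json_record = {}
    for k, v in record.__dict__.items():
        if not k.startswith('_') and k not in ignore_keys:
            json_record[f'my_{k}'] = v

    return json.dumps([json_record])

class DummyRecord:
    # Stand-in for a ReFrame log record (illustration only)
    def __init__(self):
        self.check_name = 'ping_test'
        self.perf_value = 42.0
        self._private = 'skipped'

out = prepend_prefix(DummyRecord(), extras={}, ignore_keys={'perf_value'})
print(out)  # [{"my_check_name": "ping_test"}]
```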
Old but gold – Dynamic test generation
The make_test() API call allows you to create tests programmatically (since 3.10)
import reframe as rfm
import reframe.core.builtins as builtins
import reframe.utility.sanity as sn
from reframe.core.meta import make_test

def set_message(obj):
    obj.executable_opts = [obj.message]

def validate(obj):
    return sn.assert_found(obj.message, obj.stdout)

hello_cls = make_test(
    'HelloTest', (rfm.RunOnlyRegressionTest,),
    {
        'valid_systems': ['*'],
        'valid_prog_environs': ['*'],
        'executable': 'echo',
        'message': builtins.variable(str)
    },
    methods=[
        builtins.run_before('run')(set_message),
        builtins.sanity_function(validate)
    ]
)
The same test written as a regular class:

class HelloTest(rfm.RunOnlyRegressionTest):
    valid_systems = ['*']
    valid_prog_environs = ['*']
    executable = 'echo'
    message = variable(str)

    @run_before('run')
    def set_message(self):
        self.executable_opts = [self.message]

    @sanity_function
    def validate(self):
        return sn.assert_found(self.message, self.stdout)

hello_cls = HelloTest
This is what the --distribute and --repeat options leverage internally
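Under the hood, dynamic test creation boils down to building a class at runtime, which plain Python can mimic with type(). The sketch below is a simplified analogue for illustration, not ReFrame's actual code:

```python
# Simplified analogue of make_test() using plain type();
# no ReFrame dependency, names are made up for illustration.

class RunOnlyTest:
    executable = ''
    executable_opts = []

def set_message(obj):
    obj.executable_opts = [obj.message]

# type(name, bases, namespace) creates the class dynamically,
# just like a regular `class` statement would
hello_cls = type('HelloTest', (RunOnlyTest,), {
    'executable': 'echo',
    'message': 'Hello, World!',
    'set_message': set_message,
})

t = hello_cls()
t.set_message()
print(t.executable, t.executable_opts)  # echo ['Hello, World!']
```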
Domain-specific test generation using make_test
Example: Generate a series of STREAM benchmark workflows using a domain-specific spec file
stream_workflows:
  - elem_type: 'float'
    array_size: 16777216
    num_iters: 10
    num_cpus_per_task: 4
  - elem_type: 'double'
    array_size: 1048576
    num_iters: 100
    num_cpus_per_task: 1
  - elem_type: 'double'
    array_size: 16777216
    num_iters: 10
    thread_scaling: [1, 2, 4, 8]
- stream_test_2 %num_threads=8 %stream_binaries.elem_type=double %stream_binaries.array_size=16777216 %stream_binaries.num_iters=10 /7b20a90a
^stream_build %elem_type=double %array_size=16777216 %num_iters=10 ~tresa:default+gnu 'stream_binaries /1dd920e5
- stream_test_2 %num_threads=4 %stream_binaries.elem_type=double %stream_binaries.array_size=16777216 %stream_binaries.num_iters=10 /7cbd26d7
^stream_build %elem_type=double %array_size=16777216 %num_iters=10 ~tresa:default+gnu 'stream_binaries /1dd920e5
- stream_test_2 %num_threads=2 %stream_binaries.elem_type=double %stream_binaries.array_size=16777216 %stream_binaries.num_iters=10 /797fb1ed
^stream_build %elem_type=double %array_size=16777216 %num_iters=10 ~tresa:default+gnu 'stream_binaries /1dd920e5
<...>
Found 6 check(s)
Domain spec file
Generated tests
Domain-specific test generation using make_test
The idea
Full code at https://github.com/reframe-hpc/reframe/pull/2866.
import os

import yaml

import reframe as rfm
import reframe.core.builtins as builtins
from reframe.core.meta import make_test

import stream  # base STREAM tests used by the generated workflows

def load_specs():
    spec_file = os.getenv('STREAM_SPEC_FILE')
    with open(spec_file) as fp:
        return yaml.safe_load(fp)

def generate_tests(specs):
    tests = []
    for i, spec in enumerate(specs['stream_workflows']):
        thread_scaling = spec.pop('thread_scaling', None)
        test_body = {
            'stream_binaries': builtins.fixture(
                stream.stream_build, scope='environment', variables=spec
            )
        }
        methods = []
        if thread_scaling:
            def _set_num_threads(test):
                test.num_cpus_per_task = test.num_threads

            test_body['num_threads'] = builtins.parameter(thread_scaling)
            methods.append(builtins.run_after('init')(_set_num_threads))

        tests.append(make_test(f'stream_test_{i}',
                               (stream.stream_test,), test_body, methods))

    return tests

# Register the tests with the framework
for t in generate_tests(load_specs()):
    rfm.simple_test(t)
A standard test file
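Stripped of the ReFrame machinery, the generation logic is simple: one test per spec entry, multiplied by any thread_scaling parameter. A standalone model of just that counting, under the spec shown above, reproduces the six reported test cases:

```python
# Standalone model of the spec-driven generation (no ReFrame);
# the spec mirrors the YAML file above.
specs = {
    'stream_workflows': [
        {'elem_type': 'float', 'array_size': 16777216,
         'num_iters': 10, 'num_cpus_per_task': 4},
        {'elem_type': 'double', 'array_size': 1048576,
         'num_iters': 100, 'num_cpus_per_task': 1},
        {'elem_type': 'double', 'array_size': 16777216,
         'num_iters': 10, 'thread_scaling': [1, 2, 4, 8]},
    ]
}

def generate_cases(specs):
    cases = []
    for i, spec in enumerate(specs['stream_workflows']):
        spec = dict(spec)  # don't mutate the caller's spec
        scaling = spec.pop('thread_scaling', [None])
        for nt in scaling:  # a parameter multiplies the test cases
            cases.append((f'stream_test_{i}', nt, spec))

    return cases

cases = generate_cases(specs)
print(len(cases))  # 6: one each for specs 0 and 1, four for spec 2
```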
Future outlook
We are limited in bandwidth, but we are more than happy to accept your contributions!