
3rd CINI Summer School on

High Performance Computing and Emerging Technologies

June 18th 2025

Writing, Benchmarking, and Reproducibility

of HPC Research Papers

Daniele De Sensi, Sapienza University of Rome


a.k.a.: Research Horror Stories



Research


  1. Review the literature
  2. Identify a research problem
  3. Design a solution to that problem
  4. Implement and validate the solution
  5. Write the paper


(Bad) Research


  1. Design a solution
  2. Implement and validate the solution
  3. Write the paper


The killer reviewer

  1. Unclear problem/motivation
  2. Lack of comparison with state-of-the-art approaches

Spoiler: In this talk, the reviewer is not the monster – we are.


(Bad) Writing


Typical effort split: 90% on the research, 10% (rushed) on the writing.

The killer (and busy) reviewer

  • Reviewing is just one of the many tasks a researcher does in a day
  • They might review 5-20 papers for a given conference
  • Your paper must shine (informative, but also “nice”)
  • The reviewer will be happy if they learn something new
  • Try to anticipate reviewers’/readers’ questions
  • How to become a good writer? Read and review many papers, and discuss them with your colleagues
  • You might solve a very important problem, but if you fail to communicate it, it will have no impact (even if the paper gets accepted)

Unclear motivation/solution/results/etc…


Academia Love Story




When and Where to Publish

[Figure: Venue vs. …]



Some Common Benchmarking Mistakes (a.k.a. Monsters)


References:

  • “Scientific Benchmarking of Parallel Computing Systems” (https://www.youtube.com/watch?v=HwEpXIWAWTU) – R. Belli, T. Hoefler

The cherry-picker: if existing benchmark suites are used, all benchmarks should be run. If not, a reason should be presented.

The straggler: computation might finish at different times on different nodes. Report the runtime of each node (ideally), or the maximum across the nodes.

The nondeterministic monster: timings are non-deterministic. Report timings across multiple runs (ideally show the distribution) and/or the average plus confidence intervals.
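The advice against the nondeterministic monster can be sketched in a few lines of Python; the helper name and the timings below are invented for illustration, and the confidence interval uses a simple normal approximation:

```python
import statistics

def summarize_runs(timings):
    """Summarize repeated timing measurements instead of reporting one run.

    Returns the median, mean, min/max, and a normal-approximation 95%
    confidence interval (reasonable only with enough runs).
    """
    n = len(timings)
    mean = statistics.mean(timings)
    stderr = statistics.stdev(timings) / n ** 0.5
    return {
        "median": statistics.median(timings),
        "mean": mean,
        "ci95": (mean - 1.96 * stderr, mean + 1.96 * stderr),
        "min": min(timings),
        "max": max(timings),
    }

# Ten runs of the same benchmark (seconds, invented numbers)
runs = [1.02, 0.98, 1.10, 1.05, 0.97, 1.01, 1.20, 0.99, 1.03, 1.00]
summary = summarize_runs(runs)
print(summary["median"], summary["ci95"])
```

Even better than a single interval is showing the full distribution, e.g. as a box or violin plot.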


Scary Plotter and the Dark Arts of Data Visualization

a.k.a. how to make nice/informative/useful plots


Choose the right tool/plot


Don’t use spreadsheets and their plotting tools (fine for quick analysis/debugging, but the resulting plots look bad).

Some suggestions:

  • Seaborn
  • Matplotlib
  • R
  • Bokeh
  • Plotly
  • Gnuplot (if you live in the 80s)
  • TikZ

Choose the most appropriate plot type:

https://github.com/Financial-Times/chart-doctor/tree/main/visual-vocabulary
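As a minimal sketch of a scripted (and therefore reproducible) plot with matplotlib; the data, file name, and styling choices below are all hypothetical:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: the script runs anywhere, no GUI needed
import matplotlib.pyplot as plt

# Hypothetical data: message size (bytes) vs. measured bandwidth (GB/s)
sizes = [2 ** k for k in range(10, 21)]
bandwidth = [0.4, 0.8, 1.5, 2.9, 5.2, 8.7, 12.1, 14.8, 16.2, 16.9, 17.1]

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(sizes, bandwidth, marker="o")
ax.set_xscale("log", base=2)          # message sizes span orders of magnitude
ax.set_xlabel("Message size (B)")
ax.set_ylabel("Bandwidth (GB/s)")
ax.grid(True, linewidth=0.3)          # light grid: readable without dominating
fig.tight_layout()
fig.savefig("bandwidth.pdf")          # vector output scales cleanly in a paper
```

Because the figure comes out of a script, re-running it on updated data regenerates it exactly, which spreadsheets make much harder.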


How to report data distribution


“Scientific Benchmarking of Parallel Computing Systems” – R. Belli, T. Hoefler


How to report data distribution: beware of bar plots


“Friends Don't Let Friends Make Bad Graphs” (https://github.com/cxli233/FriendsDontLetFriends)
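A quick numerical illustration (invented numbers) of why a bar plot of means can mislead: the two sets of run times below would produce identical bars, yet describe completely different behavior.

```python
import statistics

# Two sets of run times with the same mean but very different distributions.
stable = [1.00, 1.01, 0.99, 1.00, 1.00, 1.01, 0.99, 1.00]
bimodal = [0.50, 1.50, 0.52, 1.48, 0.51, 1.49, 0.50, 1.50]

# A bar chart of the means would make them look identical...
print(statistics.mean(stable), statistics.mean(bimodal))

# ...while the spread tells a completely different story.
print(statistics.stdev(stable), statistics.stdev(bimodal))
```

A box, violin, or strip plot would make the difference visible at a glance.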


Tip #1: Report Bounds


Whenever possible, report the theoretical upper/lower bound, to show how far your algorithm/application/system is from the ideal performance

“An In-Depth Analysis of the Slingshot Interconnect” – D. De Sensi et al.
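The idea behind reporting bounds can be sketched with invented numbers: express the measurement as a fraction of the theoretical peak (both figures below are assumptions for illustration, not taken from the cited paper).

```python
# A 200 Gb/s NIC has a theoretical peak of 25 GB/s (hypothetical example).
link_peak_gbs = 25.0
measured_gbs = 21.3  # invented measurement

# Reporting this fraction tells the reader how much headroom is left.
fraction_of_peak = measured_gbs / link_peak_gbs
print(f"{fraction_of_peak:.1%} of theoretical peak")
```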



Tip #2: Exploit whitespace to improve readability

“Swing: Short-cutting Rings for Higher Bandwidth Allreduce” – D. De Sensi et al.




Tip #3: Stress important points

“Software Resource Disaggregation for HPC with Serverless Computing” – M. Copik et al.



Tip #4: Use labels to provide additional info
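One way to attach value labels in matplotlib (hypothetical data; `Axes.bar_label` is available since matplotlib 3.4):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted plotting
import matplotlib.pyplot as plt

# Hypothetical speedups; a label on each bar spares the reader from
# eyeballing values off the axis.
configs = ["baseline", "v1", "v2"]
speedup = [1.0, 1.7, 2.4]

fig, ax = plt.subplots(figsize=(3, 2.5))
bars = ax.bar(configs, speedup)
ax.bar_label(bars, fmt="%.1fx")  # writes e.g. "2.4x" above each bar
ax.set_ylabel("Speedup")
fig.tight_layout()
fig.savefig("speedup.pdf")
```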




Reproducibility Crisis


https://www.nature.com/articles/533452a



Artifacts


Source code:

  • Release the code publicly (refer to a specific version/tag). Ideally, associate a DOI (e.g., on Zenodo)
  • Document the code properly (at least provide a README on how to use it to obtain similar results to those shown in the paper)
  • Simplify its usage as much as possible (e.g., provide a container)

Document the experimental setup:

  • Hardware setup (CPU/GPU model, RAM size/type, NIC and network types, node allocation)
  • Software setup (compiler version and flags, kernel and library versions, filesystem)
  • Datasets/benchmarks used, and how measurements were performed
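Part of the experimental setup can be captured automatically at run time. A minimal sketch using only the Python standard library (the file name and fields are arbitrary choices; extend with compiler flags, library versions, etc. as needed):

```python
import json
import platform
import sys

# Record the software environment alongside the results, so the experimental
# setup section of the paper can be reconstructed later.
setup = {
    "python": sys.version,
    "machine": platform.machine(),
    "system": f"{platform.system()} {platform.release()}",
    "processor": platform.processor(),
}
with open("experiment_setup.json", "w") as f:
    json.dump(setup, f, indent=2)
```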


Functional reproducibility vs. Performance reproducibility


  • Functional reproducibility is reasonably easy
                  • E.g., release a notebook or a container with all the proper dependencies
                  • Some issues may arise with parallel code (e.g., floating-point operations not being associative)
                  • Also, running parallel code within containers usually requires more effort
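The floating-point caveat is easy to demonstrate: addition on IEEE 754 doubles is not associative, so a parallel reduction that sums in a different order can legitimately differ from the sequential result.

```python
# Two reduction orders over the same three values.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c   # e.g., the order a sequential loop would use
right = a + (b + c)  # e.g., the order a tree reduction might use

print(left == right)  # False: the two orders round differently
print(left, right)
```

The difference is tiny, but a bitwise comparison against a reference output will flag it, so artifact checks for parallel code should compare with a tolerance.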


Functional reproducibility for parallel code


Possible solution: provide two different artifacts:

    • A Docker-based sequential program for small datasets
    • Installation from source code for large datasets, so that it can be run on a large computing cluster

Tip: if a large computing cluster is needed, also provide an estimate of the required compute hours

https://zenodo.org/records/6633941

https://dl.acm.org/doi/10.5555/3571885.3571899
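The compute-hours estimate is simple arithmetic; a sketch with invented numbers:

```python
# Back-of-the-envelope estimate of the node hours needed to reproduce the
# large-scale experiments (all numbers are hypothetical).
nodes = 64
hours_per_run = 2.0
runs = 10  # repetitions for statistical significance

node_hours = nodes * hours_per_run * runs
print(node_hours)  # 1280.0
```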


Functional reproducibility vs. Performance reproducibility


  • In some cases, the main focus of the research is on performance (e.g., solving a well-known problem in a more efficient way)
  • Performance reproducibility is much more difficult (in some cases even impossible)
  • E.g., how to check whether the reported training time of the latest Google/OpenAI/etc. LLM is correct? Even if the model were open, who could train it on 100,000 GPUs?
  • It is an issue even at smaller scales. E.g., what if the experiments in the paper were run on an H100 but I only have an A100?


Conclusions


  • Writing should not be rushed: it greatly affects the paper’s success (before and after acceptance)
  • Be a perfectionist: on motivation, writing, benchmarking, plotting, etc. (even on presentations, but that’s another horror story…)
  • Read, review, discuss: it helps you be more critical towards your own work
  • Pay attention to benchmarking and data-reporting mistakes
  • Make sure your work is reproducible


Conclusions


  • Do not take reviews personally. You will be judged throughout your whole career
  • There is a bit of randomness in the review process (reviewers are human and not perfect, like anyone else)
  • Different people might judge your work differently
  • If you have never had a paper rejected, you are not trying hard enough (“if you are the smartest person in the room, then you are in the wrong room”)

Don’t be a monster


Questions?

