
3rd CINI Summer School on

High Performance Computing and Emerging Technologies

June 18th 2025

Writing, Benchmarking, and Reproducibility

of HPC Research Papers

Daniele De Sensi, Sapienza University of Rome


a.k.a.: Research Horror Stories



Research


  1. Review the literature
  2. Identify a research problem
  3. Design a solution to that problem
  4. Implement and validate the solution
  5. Write the paper


(Bad) Research


  1. Design a solution
  2. Implement and validate the solution
  3. Write the paper


The killer reviewer

  1. Unclear problem/motivation
  2. Lack of comparison with state-of-the-art approaches

Spoiler: In this talk, the reviewer is not the monster – we are.


(Bad) Writing


Typical effort split: 90% on the research, 10% (rushed) on the writing.

The killer (and busy) reviewer

  • Reviewing is just one of the many tasks a researcher does in a day
  • They might review 5-20 papers for a given conference
  • Your paper must shine (informative, but also “nice”)
  • The reviewer will be happy if they learn something new
  • Try to anticipate reviewers’/readers’ questions
  • How to become a good writer? Read and review many papers, and discuss them with your colleagues
  • You might solve a very important problem, but if you fail to communicate it, it will have no impact (even if the paper gets accepted)

Unclear motivation/solution/results/etc…


Academia Love Story




When and Where to Publish

[Figure: Venue vs. …]



Some Common Benchmarking Mistakes (a.k.a. Monsters)


References:

  • “Scientific Benchmarking of Parallel Computing Systems” (https://www.youtube.com/watch?v=HwEpXIWAWTU) – R. Belli, T. Hoefler

The cherry-picker: if existing benchmark suites are used, all benchmarks should be run. If not, a reason should be presented.

The straggler: computation might finish at different times on different nodes. Report the runtime of each node (ideally), or the maximum across the nodes.

The nondeterministic monster: timings are non-deterministic. Report timings across multiple runs (ideally show the distribution) and/or the average plus confidence intervals.
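The advice against the nondeterministic monster can be sketched in a few lines of Python; the helper name and the timings below are invented for illustration, and the confidence interval uses a simple normal approximation:

```python
import statistics

def summarize_runs(timings):
    """Summarize repeated timing measurements instead of reporting one run.

    Returns the median, mean, min/max, and a normal-approximation 95%
    confidence interval (reasonable only with enough runs).
    """
    n = len(timings)
    mean = statistics.mean(timings)
    stderr = statistics.stdev(timings) / n ** 0.5
    return {
        "median": statistics.median(timings),
        "mean": mean,
        "ci95": (mean - 1.96 * stderr, mean + 1.96 * stderr),
        "min": min(timings),
        "max": max(timings),
    }

# Ten runs of the same benchmark (seconds, invented numbers)
runs = [1.02, 0.98, 1.10, 1.05, 0.97, 1.01, 1.20, 0.99, 1.03, 1.00]
summary = summarize_runs(runs)
print(summary["median"], summary["ci95"])
```

Even better than a single interval is showing the full distribution, e.g. as a box or violin plot.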


Scary Plotter and the Dark Arts of Data Visualization

a.k.a. how to make nice/informative/useful plots


Choose the right tool/plot


Don’t use spreadsheets and their plotting tools (fine for quick analysis/debugging, but the resulting plots look bad).

Some suggestions:

  • Seaborn
  • Matplotlib
  • R
  • Bokeh
  • Plotly
  • Gnuplot (if you live in the 80s)
  • TikZ

Choose the most appropriate plot type:

https://github.com/Financial-Times/chart-doctor/tree/main/visual-vocabulary
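As a minimal sketch of a scripted (and therefore reproducible) plot with matplotlib; the data, file name, and styling choices below are all hypothetical:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: the script runs anywhere, no GUI needed
import matplotlib.pyplot as plt

# Hypothetical data: message size (bytes) vs. measured bandwidth (GB/s)
sizes = [2 ** k for k in range(10, 21)]
bandwidth = [0.4, 0.8, 1.5, 2.9, 5.2, 8.7, 12.1, 14.8, 16.2, 16.9, 17.1]

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(sizes, bandwidth, marker="o")
ax.set_xscale("log", base=2)          # message sizes span orders of magnitude
ax.set_xlabel("Message size (B)")
ax.set_ylabel("Bandwidth (GB/s)")
ax.grid(True, linewidth=0.3)          # light grid: readable without dominating
fig.tight_layout()
fig.savefig("bandwidth.pdf")          # vector output scales cleanly in a paper
```

Because the figure comes out of a script, re-running it on updated data regenerates it exactly, which spreadsheets make much harder.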


How to report data distribution


“Scientific Benchmarking of Parallel Computing Systems” – R. Belli, T. Hoefler


How to report data distribution: beware of bar plots


“Friends Don't Let Friends Make Bad Graphs” (https://github.com/cxli233/FriendsDontLetFriends)
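A quick numerical illustration (invented numbers) of why a bar plot of means can mislead: the two sets of run times below would produce identical bars, yet describe completely different behavior.

```python
import statistics

# Two sets of run times with the same mean but very different distributions.
stable = [1.00, 1.01, 0.99, 1.00, 1.00, 1.01, 0.99, 1.00]
bimodal = [0.50, 1.50, 0.52, 1.48, 0.51, 1.49, 0.50, 1.50]

# A bar chart of the means would make them look identical...
print(statistics.mean(stable), statistics.mean(bimodal))

# ...while the spread tells a completely different story.
print(statistics.stdev(stable), statistics.stdev(bimodal))
```

A box, violin, or strip plot would make the difference visible at a glance.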


Tip #1: Report Bounds


Whenever possible, report the theoretical upper/lower bound, to show how far your algorithm/application/system is from the ideal performance

“An In-Depth Analysis of the Slingshot Interconnect” – D. De Sensi et al.
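The idea behind reporting bounds can be sketched with invented numbers: express the measurement as a fraction of the theoretical peak (both figures below are assumptions for illustration, not taken from the cited paper).

```python
# A 200 Gb/s NIC has a theoretical peak of 25 GB/s (hypothetical example).
link_peak_gbs = 25.0
measured_gbs = 21.3  # invented measurement

# Reporting this fraction tells the reader how much headroom is left.
fraction_of_peak = measured_gbs / link_peak_gbs
print(f"{fraction_of_peak:.1%} of theoretical peak")
```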



Tip #2: Exploit whitespace to improve readability

“Swing: Short-cutting Rings for Higher Bandwidth Allreduce” – D. De Sensi et al.




Tip #3: Stress important points

“Software Resource Disaggregation for HPC with Serverless Computing” – M. Copik et al.



Tip #4: Use labels to provide additional info
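One way to attach value labels in matplotlib (hypothetical data; `Axes.bar_label` is available since matplotlib 3.4):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted plotting
import matplotlib.pyplot as plt

# Hypothetical speedups; a label on each bar spares the reader from
# eyeballing values off the axis.
configs = ["baseline", "v1", "v2"]
speedup = [1.0, 1.7, 2.4]

fig, ax = plt.subplots(figsize=(3, 2.5))
bars = ax.bar(configs, speedup)
ax.bar_label(bars, fmt="%.1fx")  # writes e.g. "2.4x" above each bar
ax.set_ylabel("Speedup")
fig.tight_layout()
fig.savefig("speedup.pdf")
```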




Reproducibility Crisis


https://www.nature.com/articles/533452a



Artifacts


Source code:

  • Release the code publicly (refer to a specific version/tag). Ideally, associate a DOI (e.g., on Zenodo)
  • Document the code properly (at least provide a README on how to use it to obtain similar results to those shown in the paper)
  • Simplify its usage as much as possible (e.g., provide a container)

Document the experimental setup:

  • Hardware setup (CPU/GPU model, RAM size/type, NIC and network types, node allocation)
  • Software setup (compiler version and flags, kernel and library versions, filesystem)
  • Datasets/benchmarks used, and how measurements were performed
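Part of the experimental setup can be captured automatically at run time. A minimal sketch using only the Python standard library (the file name and fields are arbitrary choices; extend with compiler flags, library versions, etc. as needed):

```python
import json
import platform
import sys

# Record the software environment alongside the results, so the experimental
# setup section of the paper can be reconstructed later.
setup = {
    "python": sys.version,
    "machine": platform.machine(),
    "system": f"{platform.system()} {platform.release()}",
    "processor": platform.processor(),
}
with open("experiment_setup.json", "w") as f:
    json.dump(setup, f, indent=2)
```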


Functional reproducibility vs. Performance reproducibility


  • Functional reproducibility is reasonably easy
                  • E.g., release a notebook or a container with all the proper dependencies
                  • Some issues may arise with parallel code (e.g., floating-point operations not being associative)
                  • Also, running parallel code within containers usually requires more effort
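The floating-point caveat is easy to demonstrate: addition on IEEE 754 doubles is not associative, so a parallel reduction that sums in a different order can legitimately differ from the sequential result.

```python
# Two reduction orders over the same three values.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c   # e.g., the order a sequential loop would use
right = a + (b + c)  # e.g., the order a tree reduction might use

print(left == right)  # False: the two orders round differently
print(left, right)
```

The difference is tiny, but a bitwise comparison against a reference output will flag it, so artifact checks for parallel code should compare with a tolerance.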


Functional reproducibility for parallel code


Possible solution: provide two different artifacts:

    • A Docker-based sequential program for small datasets
    • Installation from source code for large datasets, so that it can be run on a large computing cluster

Tip: if a large computing cluster is needed, also provide an estimate of the required compute hours

https://zenodo.org/records/6633941

https://dl.acm.org/doi/10.5555/3571885.3571899
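The compute-hours estimate is simple arithmetic; a sketch with invented numbers:

```python
# Back-of-the-envelope estimate of the node hours needed to reproduce the
# large-scale experiments (all numbers are hypothetical).
nodes = 64
hours_per_run = 2.0
runs = 10  # repetitions for statistical significance

node_hours = nodes * hours_per_run * runs
print(node_hours)  # 1280.0
```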


Functional reproducibility vs. Performance reproducibility


  • In some cases, the main focus of the research is on performance (e.g., solving a well-known problem in a more efficient way)
  • Performance reproducibility is much more difficult (in some cases even impossible)
  • E.g., how to check whether the reported training time of the latest Google/OpenAI/etc. LLM is correct? Even if the model were open, who could train it on 100,000 GPUs?
  • It is an issue even at smaller scales. E.g., what if the experiments in the paper were run on an H100 but I only have an A100?


Conclusions


  • Writing should not be rushed: it greatly affects the paper’s success (before and after acceptance)
  • Be a perfectionist: on motivation, writing, benchmarking, plotting, etc. (even on presentations, but that’s another horror story…)
  • Read, review, discuss: it helps you be more critical towards your own work
  • Pay attention to benchmarking and data-reporting mistakes
  • Make sure your work is reproducible


Conclusions


  • Do not take reviews personally. You will be judged throughout your whole career
  • There is a bit of randomness in the review process (reviewers are human and not perfect, like anyone else)
  • Different people might judge your work differently
  • If you have never had a paper rejected, you are not trying hard enough (“if you are the smartest person in the room, then you are in the wrong room”)

Don’t be a monster


Questions?

