1 of 37

Computational biomolecular simulation workflows with BioExcel building blocks

BioExcel Virtual Training

22, 23, 24 November 2021

Adam Hospital, IRB Barcelona, Spain

Genís Bayarri, IRB Barcelona, Spain

Lucía Fabio, IRB Barcelona, Spain

Pau Andrio, BSC Barcelona, Spain

bioexcel.eu

Partners

Funding

2 of 37

Agenda

  • Session 1 (Nov 22nd, 15h-17h CET)
    • Introduction to the BioExcel Building Blocks (BioBB) library
    • Introduction to the BioExcel Cloud Portal

  • Session 2 (Nov 23rd, 15h-17h CET)
    • BioBB workflows with Jupyter Notebooks
    • GROMACS Protein MD setup workflow as example

  • Session 3 (Nov 24th, 15h-17h CET)
    • BioBB workflows with Command Line Interface
    • GROMACS Protein MD setup workflow as example

3 of 37

Session 3:

  • BioBB workflows with Command Line Interface
  • GROMACS Protein MD setup workflow as example

4 of 37

Questions (7)

  • Are you familiar with the UNIX command line interface? (Y/N)

Think about your recent projects:

  • Did they involve loops/conditionals? (Y/N)

  • Did they involve any re-execution (changing parameters, system, etc.)? (Y/N)

  • Extend & Discuss

Example: Y Y Y

Please answer in the Zoom chat panel

5 of 37

BioExcel Building Blocks: CLI – from Jupyter Notebook

biobb_MDsetup_tutorial.py

6 of 37

BioExcel Building Blocks: CLI – from Jupyter Notebook

Notes:

  • Graphical cells are not shown
  • Losing interactivity

  • Gaining high throughput (automation, repetition)

  • Modifying parameters for a certain step → modify the Python script
  • ALL files for ALL the steps are generated in the same folder
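
The step from notebook to script can be pictured with a minimal sketch. It assumes the standard .ipynb JSON layout (a top-level "cells" list whose entries carry "cell_type" and "source"). The function name notebook_to_script and the rule of dropping lines that start with "%" or "!" are illustrative choices, not necessarily how biobb_MDsetup_tutorial.py was produced.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Concatenate the code cells of a notebook into one Python script.

    Assumption: `notebook_json` follows the .ipynb JSON layout, i.e. a
    top-level "cells" list whose items have "cell_type" and "source".
    Markdown cells are dropped, so the script keeps only executable code.
    """
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be stored as a list of lines or as a single string.
        text = "".join(source) if isinstance(source, list) else source
        # Skip IPython magics and shell escapes, which plain Python cannot run.
        lines = [ln for ln in text.splitlines()
                 if not ln.lstrip().startswith(("%", "!"))]
        if lines:
            chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# Tiny in-memory notebook used only for illustration.
demo_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Fetch structure"]},
        {"cell_type": "code",
         "source": ["%matplotlib inline\n", "pdb_code = '1aki'\n"]},
        {"cell_type": "code", "source": "print(pdb_code)"},
    ]
})
script = notebook_to_script(demo_notebook)
print(script)
```

Only the code survives the conversion, which matches the notes: graphical and interactive cells disappear, while the resulting script can be re-run unattended.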

7 of 37

BioExcel Building Blocks: CLI - from Python+Yaml

Workflow script

  • Building blocks
  • Python code
  • Loops / conditionals
  • Global log
  • Output folders hierarchy

Workflow parameters

  • Steps Inputs / Outputs
  • Steps Dependencies
  • Steps Properties

  • Workflow inputs
  • Workflow parameters

8 of 37

BioExcel Building Blocks: CLI - from Python+Yaml

9 of 37

BioExcel Building Blocks: CLI - from Python+Yaml

  • Global log: global log of the workflow execution.

  • Configuration object: Python object including all information parsed from the yaml file.

  • Properties: Collection of properties (tool parameters) for every step of the workflow.

  • Paths: Collection of paths (tool inputs & outputs) for every step of the workflow.

  • Workflow properties: global properties of the workflow.

  • Common step properties: properties that can be applied to all building blocks (steps).

  • Individual Steps: properties applied to a particular step of the workflow.

10 of 37

BioExcel Building Blocks: CLI – Python concepts

  • Global log: global log of the workflow execution. Usually created only once, at the beginning of the workflow.

log_out, log_err = file_utils.get_logs(params)

where params could be: name of the log file, logging level ['CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'].

  • Configuration object: Python object (dict) including all the information parsed from the yaml file. Also read only once, at the beginning of the workflow.

conf = settings.ConfReader(yaml_file)

  • Properties: Collection of properties (tool parameters) for every step of the workflow, extracted from the yaml configuration file.

prop = conf.get_prop_dic(params)

  • Paths: Collection of paths (tool inputs & outputs) for every step of the workflow, extracted from the yaml configuration file.

paths = conf.get_paths_dic()
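
To make the roles of the configuration object, the properties and the paths more concrete, here is a minimal stand-in written with plain dictionaries. It is not the biobb_common implementation: the function split_config, the assumed layout (top-level scalars as global properties, one sub-dictionary per step with "paths" and "properties") and the working_dir_path value are illustrative assumptions.

```python
from typing import Any, Dict, Tuple

Config = Dict[str, Any]


def split_config(conf: Config) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Return (properties, paths), each keyed by step name.

    Assumed layout: top-level non-dict values are global workflow
    properties; every top-level dict is a step that may contain "paths"
    and "properties" sub-dictionaries.
    """
    global_props = {k: v for k, v in conf.items() if not isinstance(v, dict)}
    prop: Dict[str, dict] = {}
    paths: Dict[str, dict] = {}
    for step, section in conf.items():
        if not isinstance(section, dict):
            continue
        # Global properties go first, so step-specific values override them.
        prop[step] = {**global_props, **section.get("properties", {})}
        paths[step] = dict(section.get("paths", {}))
    return prop, paths


# Dictionary standing in for the parsed yaml file (values are illustrative).
conf = {
    "working_dir_path": "md_setup",  # hypothetical global property
    "step1_pdb": {
        "paths": {"output_pdb_path": "1aki.pdb"},
        "properties": {"pdb_code": "1aki"},
    },
}
prop, paths = split_config(conf)
print(prop["step1_pdb"])
print(paths["step1_pdb"])
```

Both results are indexed by step name, so a workflow script can look up everything a step needs with a single key.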

11 of 37

BioExcel Building Blocks: CLI – Yaml concepts

  • Workflow global / common properties: global properties of the workflow.

  • Specific step properties: specific properties applied to a particular step of the workflow.

12 of 37

BioExcel Building Blocks: CLI - from Python+Yaml

13 of 37

BioExcel Building Blocks: CLI – Python concepts

Global log

Configuration object

Paths & Properties

14 of 37

BioExcel Building Blocks: CLI – Yaml concepts

Workflow properties

Common Steps properties

Specific Step Properties

15 of 37

BioExcel Building Blocks: CLI – Joining Python & Yaml

Pdb(**paths["step1_pdb"], properties=prop["step1_pdb"])

prop = conf.get_prop_dic(params)

paths = conf.get_paths_dic()

  • Paths: **paths → unpack dictionary (Python), from a dictionary of paths to keyword arguments: input_pdb_path=1aki.pdb, output_pdb_path=fixsidechain.pdb

  • Properties: is already a dictionary (no need to unpack)
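
The unpacking can be checked with a small self-contained sketch. The function building_block_stub is a hypothetical stand-in for a real building block, and the step name step2_fixsidechain and the property shown are assumptions made only for this illustration.

```python
def building_block_stub(input_pdb_path, output_pdb_path, properties=None):
    """Hypothetical stand-in for a building block: it only echoes its arguments."""
    return {
        "input_pdb_path": input_pdb_path,
        "output_pdb_path": output_pdb_path,
        "properties": properties or {},
    }


# Dictionaries shaped like the ones built from the yaml file (illustrative values).
paths = {
    "step2_fixsidechain": {
        "input_pdb_path": "1aki.pdb",
        "output_pdb_path": "fixsidechain.pdb",
    }
}
prop = {"step2_fixsidechain": {"can_write_console_log": False}}  # hypothetical property

# **paths[...] expands the dictionary into keyword arguments ...
unpacked = building_block_stub(**paths["step2_fixsidechain"],
                               properties=prop["step2_fixsidechain"])

# ... which is equivalent to writing every argument by hand.
explicit = building_block_stub(input_pdb_path="1aki.pdb",
                               output_pdb_path="fixsidechain.pdb",
                               properties=prop["step2_fixsidechain"])

print(unpacked == explicit)
```

The properties dictionary is passed as a single argument, which is why it needs no unpacking.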

16 of 37

BioExcel Building Blocks: CLI - from Python+Yaml

17 of 37

BioExcel Building Blocks: CLI - Executing

18 of 37

Questions (8)

  • Have you followed so far? (Y/N)

  • Is the division into Python and Yaml clear? (Y/N)

  • Would you be able to start moving a workflow from Jupyter Notebook to Python + Yaml? (Y/N)

  • Extend & Discuss

Example: Y Y N

Please answer in the Zoom chat panel

19 of 37

20 of 37

21 of 37

BioExcel Building Blocks: CLI – Protein MD Setup Python

22 of 37

BioExcel Building Blocks: CLI – Protein MD Setup Python

  • Imports
  • Configuration object
  • Global log
  • Paths & Properties

23 of 37

BioExcel Building Blocks: CLI – Protein MD Setup Python

  • Steps (1)

24 of 37

BioExcel Building Blocks: CLI – Protein MD Setup Python

  • Steps (2)

25 of 37

BioExcel Building Blocks: CLI – Protein MD Setup Yaml

  • Workflow properties
  • Common step properties
  • Specific step properties

26 of 37

BioExcel Building Blocks: CLI – Protein MD Setup Yaml

  • 23 steps
  • Step dependencies
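
Step dependencies can be sketched as references from one step's input to another step's output. The example below assumes references written as "dependency/<step>/<key>" inside the paths section. The function resolve_dependencies and the step names are hypothetical, and the sketch ignores details such as working directories.

```python
from typing import Dict

PREFIX = "dependency/"


def resolve_dependencies(paths: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Replace every 'dependency/<step>/<key>' reference by the file it points to."""
    resolved: Dict[str, Dict[str, str]] = {}
    for step, step_paths in paths.items():
        resolved[step] = {}
        for key, value in step_paths.items():
            seen = set()
            # Follow chained references until a concrete file name is reached.
            while isinstance(value, str) and value.startswith(PREFIX):
                if value in seen:
                    raise ValueError(f"circular dependency at {step}.{key}")
                seen.add(value)
                dep_step, dep_key = value[len(PREFIX):].split("/", 1)
                value = paths[dep_step][dep_key]
            resolved[step][key] = value
    return resolved


# Two illustrative steps: the second one consumes the output of the first.
paths = {
    "step1_pdb": {"output_pdb_path": "1aki.pdb"},
    "step2_fixsidechain": {
        "input_pdb_path": "dependency/step1_pdb/output_pdb_path",
        "output_pdb_path": "fixsidechain.pdb",
    },
}
resolved = resolve_dependencies(paths)
print(resolved["step2_fixsidechain"]["input_pdb_path"])
```

Declaring inputs this way means a file name is written once, in the step that produces it, and every later step follows the reference.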

27 of 37

BioExcel Building Blocks: CLI – Protein MD Setup execution

28 of 37

BioExcel Building Blocks: CLI – Protein MD Setup execution

29 of 37

BioExcel Building Blocks: CLI – Mutations

30 of 37

BioExcel Building Blocks: CLI – Mutations

31 of 37

BioExcel Building Blocks: CLI – Alanine Scanning

ARG1ALA,ASP2ALA,GLY3ALA,etc.
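
A list like the one above can be generated in a loop, which is the kind of automation the Python script makes possible. The following is a minimal sketch: the function alanine_scan, the choice to skip residues that are already alanine, and the five-residue sequence are assumptions for illustration only.

```python
from typing import List


def alanine_scan(residues: List[str], first_index: int = 1) -> List[str]:
    """Build one <RES><POS>ALA label per residue, skipping existing alanines."""
    mutations = []
    for offset, res in enumerate(residues):
        res = res.upper()
        if res == "ALA":
            continue  # mutating ALA to ALA would change nothing
        mutations.append(f"{res}{first_index + offset}ALA")
    return mutations


# Illustrative residue list, not taken from a real structure.
sequence = ["ARG", "ASP", "GLY", "ALA", "LYS"]
mutation_list = ",".join(alanine_scan(sequence))
print(mutation_list)  # ARG1ALA,ASP2ALA,GLY3ALA,LYS5ALA
```

Each label could then drive one execution of the setup workflow, for example by looping over the list and writing the outputs of every mutation to its own folder.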

32 of 37

BioExcel Building Blocks: CLI – Alanine Scanning

33 of 37

BioExcel Building Blocks: CLI – Advanced

Fast growth Thermodynamic Integration

34 of 37

BioExcel Building Blocks: CLI – Advanced

  • For each mutation:
    • MD Simulations (RBD + ACE2 + Complex)
    • Free energy calculations

Molecular Dynamics simulation data

  • Fast-growth Thermodynamic Integration
  • 1000 independent short MD simulations (500 forward + 500 reverse)
  • GROMACS + pmx
  • Extremely parallelizable
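
Because the 1000 short simulations do not depend on each other, they can be dispatched as independent jobs. The sketch below only illustrates that independence: run_transition is a hypothetical placeholder that labels a job instead of launching GROMACS, and a thread pool stands in for whatever scheduler or HPC queue a real campaign would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Job = Tuple[str, int]


def run_transition(job: Job) -> str:
    """Placeholder for one short MD run (hypothetical; no simulation is executed)."""
    direction, index = job
    return f"{direction}_{index:03d}"


def build_jobs(n_per_direction: int = 500) -> List[Job]:
    """One job per transition: n forward plus n reverse."""
    return [(direction, i)
            for direction in ("forward", "reverse")
            for i in range(n_per_direction)]


jobs = build_jobs()
# No job needs the result of another, so they can all run concurrently.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_transition, jobs))

print(len(results), results[0], results[-1])
```

The same job list could be split across as many workers as are available, since the order of completion does not matter until the free energy analysis.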

Impact of mutations in binding affinity

35 of 37

Questions (9)

  • Have you followed so far? (Y/N)

  • Would you consider using the library to build biomolecular workflows? (Y/N)

  • Would you like to know more?

  • Extend & Discuss

Example: Y Y Y

Please answer in the Zoom chat panel

36 of 37

BioExcel Building Blocks: Much more…

37 of 37

Final Comments & Suggestions

Thank you all for participating in the third BioBB Virtual Training!

http://mmb.irbbarcelona.org/biobb/
