1 of 33

Tutorial: Protein-Ligand Complex MD Setup with Jupyter Notebooks and BioBB

Peruvian Society of Bioinformatics Virtual Symposium

4, 6 and 8 November 2024

2 of 33

Agenda

  • Session 1 (Nov 4th, 15h-17h CET, 9h-11h UTC-5)
    • Introduction to the BioExcel Building Blocks (BioBB) library
    • Overview of the BioBB landing website (https://mmb.irbbarcelona.org/biobb/)

  • Session 2 (Nov 6th, 15h-17h CET, 9h-11h UTC-5)
    • BioBB workflows with Jupyter Notebooks
    • GROMACS Protein-ligand complex MD setup workflow as example

  • Session 3 (Nov 8th, 15h-17h CET, 9h-11h UTC-5)
    • BioBB workflows with Command Line Interface
    • GROMACS Protein-ligand complex MD setup workflow as example

3 of 33

BioBB Demonstration Workflows

  • MD setup (Protein / DNA) (AMBER / GROMACS)
  • Ligand parameterization
  • Protein-Ligand Docking
  • Free energy calculations
  • DNA helical parameters
  • Conformational Ensemble generation

4 of 33

GROMACS Protein-Ligand MD Setup

  • Complex: T4 Lysozyme (3HTB) with 2-propylphenol (JZ4)
  • Ligand parameterization (ACPype)
  • GROMACS MD Setup (Min + NVT eq + NPT eq + unrestrained short MD)

5 of 33

Questions (1)

  • Have you ever set up or run an MD simulation using GROMACS before? (Y/N)

  • Have you ever set up or run an MD simulation using other MD packages (AMBER, NAMD, DESMOND…)? (Y/N)

  • Have you ever set up or run an MD simulation of a protein-ligand complex? (Y/N)

Example: Y Y N

Please answer in the Zoom chat panel 🡪

6 of 33

MD Setup

Hospital A. et al., Adv Appl Bioinform Chem 2015, 8:37-47 / Hospital A., Gelpí J.L., Wiley Interdisciplinary Reviews: Computational Molecular Science 2013, 3:364-377

7 of 33

Index

  • Part 1: Install and launch the workflow
    • Conda Environment
    • Jupyter Notebook
  • Part 2: Download the PDB and fix the structure
  • Part 3: Generating topologies
    • Protein
    • Ligand
  • Part 4: Generating protein-ligand complex structure
    • Ligand Position Restraints
    • MD ready structure
  • Part 5: System Topology
  • Part 6: MD Setup & Run
  • Part 7: Trajectory (basic) analyses

8 of 33

Questions (2)

  • Are you familiar with Linux OS? (Y/N)

  • Have you ever used Conda packages/environments before? (Y/N)

  • Have you ever used Jupyter Notebooks before? (Y/N)

Example: Y Y N

Please answer in the Zoom chat panel 🡪

9 of 33

Part 1: Install and launch the workflow

10 of 33

Questions (3)

  • Were you able to install the Conda environment? (Y/N)

  • Are you familiar with PDB files and their content? (Y/N)

  • Do you use graphical interfaces in your everyday work? (Y/N)

Example: Y Y N

Please answer in the Zoom chat panel 🡪

11 of 33

Part 2: Download PDB, check the structure, and generate protein topology

12 of 33

Questions (4)

  • Have you been following so far? (Y/N)

  • Have you ever parameterized a small molecule? (Y/N)

Example: Y N

Please answer in the Zoom chat panel 🡪

13 of 33

Part 3: Parameterizing small molecule (topology)

14 of 33

Questions (5)

  • Have you been following so far? (Y/N)

  • Are you familiar with the concept of restraints in MD? (Y/N)

  • Are you familiar with the concept of force field parameters? (Y/N)

Example: Y Y N

Please answer in the Zoom chat panel 🡪

15 of 33

Part 4: Generating protein-ligand complex structure

Ligand Restraints

Structure – Topology atom names matching
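
The two checks named on this slide can be sketched in plain Python. This is a minimal illustration, not the BioBB implementation: the atom names, the helper names (`names_match`, `heavy_atom_posre`) and the force constant of 1000 kJ/(mol nm²) are assumptions chosen for the example. The output follows the GROMACS `[ position_restraints ]` layout (atom index, function type 1, one force constant per axis).

```python
def names_match(structure_names, topology_names):
    """Return True when both files list the same atom names in the same order."""
    return list(structure_names) == list(topology_names)


def heavy_atom_posre(atom_names, force_constant=1000):
    """Build a GROMACS-style position restraint section for non-hydrogen atoms.

    Atom indices are 1-based and local to the ligand molecule type.
    Hydrogens are detected by a leading 'H' in the atom name (a simplification).
    """
    lines = ["[ position_restraints ]", "; i funct fcx fcy fcz"]
    for index, name in enumerate(atom_names, start=1):
        if name.upper().startswith("H"):
            continue  # leave hydrogens unrestrained
        lines.append(
            f"{index:>4} 1 {force_constant} {force_constant} {force_constant}"
        )
    return "\n".join(lines)


# Hypothetical ligand fragment: three carbons, one oxygen, two hydrogens.
structure = ["C1", "C2", "C3", "O1", "H1", "H2"]
topology = ["C1", "C2", "C3", "O1", "H1", "H2"]

posre_text = heavy_atom_posre(topology)
print(names_match(structure, topology))
print(posre_text)
```

If the names or their order differ between the structure and the topology, restraint indices would point at the wrong atoms, which is why the matching check comes first.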

16 of 33

Questions (6)

  • Have you been following so far? (Y/N)

  • Have you ever set up or run an MD simulation using GROMACS? (Y/N)

  • Have you ever set up or run an MD simulation of a protein-ligand complex using GROMACS? (Y/N)

Example: Y Y N

Please answer in the Zoom chat panel 🡪

17 of 33

Part 5: Generating protein-ligand complex topology

18 of 33

Questions (7)

  • Have you been following so far? (Y/N)

  • Are you familiar with the concept of MD Setup? (Y/N)

Example: Y Y

Please answer in the Zoom chat panel 🡪

19 of 33

Part 6: MD Setup

20 of 33

MD Setup

Hospital A. et al., Adv Appl Bioinform Chem 2015, 8:37-47 / Hospital A., Gelpí J.L., Wiley Interdisciplinary Reviews: Computational Molecular Science 2013, 3:364-377

21 of 33

Topology

  • Check Structure
    • Missing parts
    • Errors
    • Warnings

  • Select configuration
    • Chains / Models
    • Ligands
    • Metal ions

  • Add missing heavy atoms
    • Model loops?

  • Add Hydrogen atoms
    • Protonation state!

Topology
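
As a rough sketch of the "check structure" and "select configuration" steps, the snippet below counts models, chains, ligands and waters in PDB-format lines using the fixed column positions of the PDB format. The five sample lines are a made-up excerpt (serial numbers and residue numbers are illustrative, coordinates omitted), and `summarize_pdb` is a hypothetical helper, not a BioBB function.

```python
WATER_NAMES = {"HOH", "WAT"}


def summarize_pdb(lines):
    """Count models, chains, ligands and waters in PDB-format text lines."""
    models = 0
    chains = set()
    ligands = set()
    waters = set()
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            models += 1
        elif record in ("ATOM", "HETATM"):
            res_name = line[17:20].strip()   # columns 18-20
            chain_id = line[21]              # column 22
            res_seq = line[22:26].strip()    # columns 23-26
            chains.add(chain_id)
            if record == "HETATM":
                key = (res_name, chain_id, res_seq)
                if res_name in WATER_NAMES:
                    waters.add(key)
                else:
                    ligands.add(key)
    return {
        "models": max(models, 1),  # files without MODEL records hold one model
        "chains": sorted(chains),
        "ligands": sorted(ligands),
        "waters": len(waters),
    }


# Illustrative excerpt only; not copied from the real 3HTB entry.
sample = [
    "ATOM      1  N   MET A   1",
    "ATOM      2  CA  MET A   1",
    "HETATM 1001  C1  JZ4 A 200",
    "HETATM 1002  C2  JZ4 A 200",
    "HETATM 2001  O   HOH A 301",
]

summary = summarize_pdb(sample)
print(summary)
```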

22 of 33

Solvation

  • Mimic biological environment
    • Aqueous solution

  • Define box type
    • Cubic
    • Octahedral

  • Define box size
    • Distance from the protein

  • Define water type
    • 3-sites, 4-sites, etc.

Solvated system
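
The box type matters because it sets how much water has to be simulated. The sketch below compares box volumes for the same periodic image distance d, using the volume factors listed in the GROMACS reference manual (truncated octahedron ≈ 0.77 d³, rhombic dodecahedron ≈ 0.71 d³). The solute diameter and padding values are assumptions for illustration, and `box_volumes` is a hypothetical helper.

```python
import math


def box_volumes(solute_diameter_nm, padding_nm):
    """Volume (nm^3) of periodic boxes that share the same image distance d.

    d is the solute diameter plus the chosen solute-to-box distance on both sides.
    """
    d = solute_diameter_nm + 2.0 * padding_nm
    return {
        "cubic": d ** 3,
        "truncated octahedron": 4.0 / 9.0 * math.sqrt(3.0) * d ** 3,
        "rhombic dodecahedron": 0.5 * math.sqrt(2.0) * d ** 3,
    }


# Assumed values: a 5 nm wide solute with 1 nm of solvent padding.
volumes = box_volumes(5.0, 1.0)
for name, volume in volumes.items():
    print(f"{name}: {volume:.1f} nm^3")
```

Under these assumptions the octahedral box needs roughly a quarter less volume, and therefore fewer water molecules, than the cubic one.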

23 of 33

Ions

  • Neutralize system
    • Add counterions (Na+, Cl-)

  • Achieve a given ionic strength
    • Mimic biological conditions

Solvated & neutralized system
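
The ion counts behind both bullets can be estimated with simple arithmetic: counterions cancel the net charge, and salt pairs follow from concentration × volume × Avogadro's number. This is a simplified sketch; the net charge and box volume are assumed values, `ions_for_system` is a hypothetical helper, and it uses the whole box volume rather than the solvent volume.

```python
AVOGADRO = 6.02214076e23  # particles per mole
LITRES_PER_NM3 = 1e-24    # 1 nm^3 = 1e-27 m^3 = 1e-24 L


def ions_for_system(net_charge, box_volume_nm3, concentration_molar=0.0):
    """Return (n_na, n_cl) needed to neutralize and reach a salt concentration."""
    litres = box_volume_nm3 * LITRES_PER_NM3
    salt_pairs = round(concentration_molar * litres * AVOGADRO)
    n_na = salt_pairs + max(-net_charge, 0)  # Na+ balances a negative solute
    n_cl = salt_pairs + max(net_charge, 0)   # Cl- balances a positive solute
    return n_na, n_cl


# Assumed example: solute with net charge +8 in a 343 nm^3 box at 0.15 M NaCl.
n_na, n_cl = ions_for_system(8, 343.0, 0.15)
print(n_na, n_cl)
```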

24 of 33

Energy minimization

  • Energy minimize the system
    • Ensure the system has no steric clashes or inappropriate geometry
    • Reach a lower-energy conformation
    • Usually a local minimum, not the global one

  • Different methods
    • Steepest Descent
    • Conjugate Gradient

  • Restricted to parts of the system
    • Hydrogen atoms
    • Solute / Solvent

  • Restrained atoms
    • Force constant (e.g. heavy atoms)

Minimized system
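
Why minimization usually ends in a local minimum can be seen on a toy problem. The sketch below runs a steepest-descent loop on a one-dimensional double-well function; the potential, the starting point and the step-size factors (grow by 1.2 after an accepted step, shrink by 0.2 after a rejected one) are assumptions for illustration, loosely modeled on the adaptive step described in the GROMACS manual. It is not how a real force field is evaluated.

```python
def energy(x):
    """Toy 1D double-well potential with two minima of different depth."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def force(x):
    """Force is the negative derivative of the energy."""
    return -(4.0 * x * (x * x - 1.0) + 0.3)


def steepest_descent(x, step=0.1, force_tol=1e-6, max_steps=1000):
    """Step along the force; accept a move only if the energy decreases."""
    e = energy(x)
    for _ in range(max_steps):
        f = force(x)
        if abs(f) < force_tol:
            break
        trial = x + step * (1.0 if f > 0 else -1.0)
        e_trial = energy(trial)
        if e_trial < e:
            x, e = trial, e_trial
            step *= 1.2  # successful move: try a larger step next time
        else:
            step *= 0.2  # energy went up: reject and shrink the step
    return x


x_min = steepest_descent(1.5)
print(round(x_min, 3), round(energy(x_min), 3))
```

Starting at x = 1.5, the walk settles in the shallower well on the positive side and never crosses the barrier to the deeper well near x = -1.04, because only downhill moves are accepted.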

25 of 33

Equilibration

  • Equilibrate the system
    • Ensure system is stable

  • Usually two (or three) phases:
    • NVT – constant number of particles, volume and temperature
    • NPT – constant number of particles, pressure and temperature
  • NVT → stabilizing temperature

  • NPT → stabilizing pressure (and density)

Equilibrated system
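
The difference between the two phases comes down to a few GROMACS `.mdp` options: NVT couples only the temperature, NPT adds pressure coupling. The sketch below renders fragments of such parameter files from Python dictionaries. The option names (`tcoupl`, `pcoupl`, `gen_vel`, `define`) are real GROMACS keywords, but the chosen values are assumptions for illustration, the fragments are incomplete, and they are not the settings used in the tutorial notebook.

```python
# Settings shared by both equilibration phases (illustrative values).
COMMON = {"integrator": "md", "dt": 0.002, "tcoupl": "V-rescale"}

# Phase-specific settings; -DPOSRES keeps position restraints switched on.
STAGES = {
    "nvt": {"define": "-DPOSRES", "pcoupl": "no", "gen_vel": "yes"},
    "npt": {"define": "-DPOSRES", "pcoupl": "Parrinello-Rahman", "gen_vel": "no"},
}


def render_mdp(stage):
    """Return an .mdp fragment ('key = value' lines) for one equilibration phase."""
    options = {**COMMON, **STAGES[stage]}
    return "\n".join(f"{key} = {value}" for key, value in options.items())


nvt_text = render_mdp("nvt")
npt_text = render_mdp("npt")
print(nvt_text)
print()
print(npt_text)
```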

26 of 33

Production

System Dynamics

  • Run Molecular Dynamics simulation:
    • Obtain dynamic information
    • Times ranging from hundreds of ns to a few μs (computationally expensive)

  • Collect data:
    • MD Trajectory (big files)
    • Save a snapshot at a fixed time interval (e.g. every 1 ps)

  • 1 μs trajectory saved every 1 ps = 1,000,000 snapshots!
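
The snapshot count, and why trajectories become big files, follows from a short calculation. In this sketch the system size of 30,000 atoms is an assumed value, and the size estimate is for uncompressed single-precision coordinates only; compressed trajectory formats such as XTC are considerably smaller.

```python
def snapshot_count(total_ps, save_every_ps):
    """Number of saved frames for a run of total_ps picoseconds."""
    return total_ps // save_every_ps


def raw_size_gb(n_atoms, n_frames, bytes_per_coordinate=4):
    """Uncompressed coordinate storage in gigabytes (x, y, z per atom)."""
    return n_atoms * 3 * bytes_per_coordinate * n_frames / 1e9


# 1 microsecond = 1,000,000 ps, saving one frame every 1 ps.
frames = snapshot_count(1_000_000, 1)
size_gb = raw_size_gb(30_000, frames)  # assumed 30,000-atom system
print(frames, size_gb)
```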

27 of 33

QC & Analyses

System Flexibility Observables

  • Extract dynamic and flexibility properties
    • Stability of the system
    • Conformational flexibility

  • Global descriptors
    • Essential dynamics
    • Clustering

  • Local descriptors
    • Protein-Ligand interactions
    • Gate opening switches
    • Cavities, pockets, channels
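
A common first check of system stability is the root-mean-square deviation (RMSD) of each frame from a reference structure. The sketch below computes it for two small coordinate sets; the coordinates are made up, and no superposition is performed, whereas analysis tools normally fit each frame to the reference before measuring.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) tuples, without fitting."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up two-atom example (nm).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]

value = rmsd(reference, frame)
print(round(value, 4))
```

An RMSD that rises and then levels off over the trajectory is usually read as a sign that the system has settled; a steadily growing value suggests it has not.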

28 of 33

MD Setup: BioBB Tutorial

  • Protein MD Setup Workflow (Basic)

  • Based on the official GROMACS tutorial: http://www.mdtutorials.com/gmx/lysozyme

  • 23 steps, including:

    • Structure checking & modelling
    • Topology generation
    • Protein neutralization (ions) & solvation
    • Energy minimization
    • System Equilibration
    • Production MD
    • Analysis
    • Visual inspection of intermediate results (3D, plots)

4 biobb modules used: io, model, md, analysis

29 of 33

Questions (8)

  • Have you been following so far? (Y/N)

  • Are you familiar with GROMACS analysis tools? (Y/N)

  • Are you familiar with trajectory imaging (fixing periodic boundary artifacts)? (Y/N)

Example: Y Y N

Please answer in the Zoom chat panel 🡪

30 of 33

Part 7: Trajectory post-processing

31 of 33

Final Questions

  • Have you been able to follow the tutorial? (Y/N)

  • Did you encounter problems executing this part of the tutorial in the Jupyter Notebook? (Y/N)

  • Do you think you could start modifying this workflow? (Y/N)

  • Do you think you could start creating a new workflow? (Y/N)

Example: Y Y Y Y

Please answer in the Zoom chat panel 🡪

32 of 33

To Play

  • Try with the other Demonstration Workflows
    • MD Protein-Ligand Complex setup, Ligand parameterization, Free energy, Docking, AMBER

  • With MD setup workflow:
    • Change the protein (PDB code). Try with DNA.
    • Introduce a mutation on the protein
    • Use a particular model from an NMR structure (needs biobb_structure_utils)
    • Extend equilibration process
    • Run longer MD, extract ensemble of representative structures from the resulting trajectory (needs biobb_analysis)

33 of 33

Acknowledgments

BioExcel Center of Excellence, funded from the European Union’s Horizon 2020 Framework Programme for Research and Innovation under Specific Grant Agreements No. 675728, 823830, 101093290 (BioExcel-1, BioExcel-2 and BioExcel-3).

Genís Bayarri

Pau Andrio

Modesto Orozco

Josep Ll. Gelpí

Federica Battistini