1 of 16

Machine Learning Chemistry

Final term paper presentation

Anshuman S. and Bhavay A.


2 of 16

MLC 8803

Geometry-enhanced molecular representations using X-attention

Scan the QR code for the paper

3 of 16

  • Predicting molecular properties is one of the central challenges in molecular science, and with the advent of machine learning, much of this work is now done by ML models.
  • These ML models require a representation of molecules that delivers all the necessary information about a molecule in a machine-readable form.
  • The aim of this work is to learn a better molecular representation, one that captures both topological and geometric information in a unified multi-modal representation.
  • We further improve the molecular representation with a self-supervised model, trained to learn effective molecular features during SSL training.

Why do we need to model molecules?

Problem Statement

Figure: Pipeline from a molecule to feature descriptors or molecular fingerprints, which feed a machine-learning model.
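For context, a minimal sketch of the classical route, computing a Morgan fingerprint with RDKit (the molecule, radius, and bit count here are illustrative choices, not from this project):

```python
# Minimal sketch: a classical molecular fingerprint as ML input (assumes RDKit).
from rdkit import Chem
from rdkit.Chem import AllChem
import numpy as np

mol = Chem.MolFromSmiles("CCO")  # ethanol, an illustrative molecule
# 2048-bit Morgan (ECFP-like) fingerprint with radius 2
fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
x = np.array(fp)  # fixed-length feature vector usable by any ML model
print(x.shape, int(x.sum()))
```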

4 of 16

  • Previously, we saw the results of GEM, where implementing the bond-angle (BA) graph brought no significant improvement.
  • In GEM, the graph embedding was generated by pooling the atom-bond (AB) and bond-angle (BA) graph embeddings.
  • To better capture the relationship between topology and geometry, we implement a cross-attention framework.

Topology and Geometry relation

Motivation

Figure: Atom-Bond embedding and Bond Angle-Bond embedding.

5 of 16

  • Geometry-based GNN (with atom-bond and bond-angle graphs).
  • SSL training framework for this geometry-based GNN.
  • Implementation of multi-modal embedding generation via cross-attention (X-attention).
  • Evaluation of the technique's performance through experiments.

Project deliverables

Major highlights of the Project

Figure: The atom-bond graph and the bond-angle graph (nodes indexed by atoms and bonds) are encoded into a molecular representation, which a DL model uses for property prediction.

6 of 16

  • Step 1: Generate the atom-bond (AB) and bond-angle (BAB) graph data structures (a construction sketch follows this list).
  • Step 2: Pre-train the model on the ZINC dataset (AB and BAB graphs).
  • Step 3: Train the X-attention model on the SSL embeddings.
  • Step 4: For the downstream application, generate AB and BAB embeddings.
  • Step 5: Generate the multi-modal embedding using X-attention.
  • Step 6: Predict the molecular properties from the multi-modal embedding.
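A minimal construction sketch for Step 1 using RDKit; the featurization here is deliberately bare (the real pipeline uses richer atom, bond, and angle features):

```python
# Minimal sketch of Step 1 (assumes RDKit): build the atom-bond (AB) graph and
# the bond-angle (BAB) graph, whose nodes are the AB graph's bonds.
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=0)  # generate a 3D conformer
conf = mol.GetConformer()

# AB graph: nodes = atoms, edges = bonds
ab_edges = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()]

# BAB graph: nodes = bonds; two bonds are connected if they share an atom,
# and the edge feature is the bond angle at that shared atom.
bonds = list(mol.GetBonds())
bab_edges, angles = [], []
for i, bi in enumerate(bonds):
    for j, bj in enumerate(bonds):
        if i >= j:
            continue
        shared = {bi.GetBeginAtomIdx(), bi.GetEndAtomIdx()} & \
                 {bj.GetBeginAtomIdx(), bj.GetEndAtomIdx()}
        if len(shared) == 1:
            k = shared.pop()             # apex atom of the angle
            a = bi.GetOtherAtomIdx(k)
            c = bj.GetOtherAtomIdx(k)
            bab_edges.append((i, j))
            angles.append(rdMolTransforms.GetAngleRad(conf, a, k, c))
```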

Work done

Methodology

Pipeline: data (graph) preparation → SSL → X-attention → downstream task.

7 of 16

SSL: Contrastive learning

Step 2: Introduction to SSL

Figure: Augmentations of the input graph: node dropping, edge perturbation, and edge attribute masking (illustration credit: DeepMind).
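A minimal sketch of two of these augmentations on a graph stored as a 2 × E edge-index tensor (plain PyTorch; the function names and ratios are our own):

```python
# Minimal sketch (PyTorch): node dropping and edge perturbation on a graph
# given as a 2 x E edge-index tensor, as used to form positive views.
import torch

def drop_nodes(edge_index, num_nodes, drop_ratio=0.1):
    keep = torch.rand(num_nodes) >= drop_ratio        # mask of surviving nodes
    mask = keep[edge_index[0]] & keep[edge_index[1]]  # keep edges between survivors
    return edge_index[:, mask]

def perturb_edges(edge_index, num_nodes, ratio=0.1):
    E = edge_index.size(1)
    n_swap = int(ratio * E)
    keep = torch.randperm(E)[: E - n_swap]            # randomly remove some edges...
    new = torch.randint(0, num_nodes, (2, n_swap))    # ...and add random new ones
    return torch.cat([edge_index[:, keep], new], dim=1)
```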

8 of 16

SSL: Contrastive learning

  • To obtain better embeddings, we implement SSL pre-training.
  • The model is trained with a loss that compares the graph embedding of the original molecule to that of its positive augmentation.
  • Learning objective: score the agreement between positive pairs higher than that between negative pairs, using an InfoNCE loss (standard form below).
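In the standard form (Oord et al., 2018), with z_i the embedding of molecule i, z_i^+ its positive augmentation, sim(·,·) cosine similarity, τ a temperature, and N the batch size:

```latex
% InfoNCE loss for molecule i
\mathcal{L}_i = -\log
  \frac{\exp\!\left(\mathrm{sim}(z_i, z_i^{+}) / \tau\right)}
       {\sum_{j=1}^{N} \mathbb{1}_{[j \neq i]} \exp\!\left(\mathrm{sim}(z_i, z_j) / \tau\right)}
```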

Step 3: Application

9 of 16

  • The message-passing layers of the GNN learn molecular features such that the generated embedding matches that of the positive augmentation (a PyTorch sketch of the loss follows this list).
  • We train the SSL framework on the AB graph and the BAB graph separately, to obtain the pre-trained GNN layers associated with each.
  • Training dataset: ZINC; 174,619 samples in the train subset, 49,891 in the validation subset.
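A minimal PyTorch sketch of the InfoNCE loss over a batch of (original, augmented) embedding pairs; the temperature value is illustrative:

```python
# Minimal sketch (PyTorch): InfoNCE over a batch of (original, augmented) pairs.
import torch
import torch.nn.functional as F

def info_nce(z, z_pos, tau=0.1):
    z = F.normalize(z, dim=1)         # original-graph embeddings, N x d
    z_pos = F.normalize(z_pos, dim=1) # positive-augmentation embeddings, N x d
    logits = z @ z_pos.t() / tau      # scaled cosine similarities of all pairs
    targets = torch.arange(z.size(0), device=z.device)  # pair i <-> i is positive
    return F.cross_entropy(logits, targets)
```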

Training procedure

SSL Pre-training

Figure: The ZINC dataset is used to pre-train the inner blocks for the atom-bond (AB) and bond-angle (BAB) graphs (SSL setup following work by the Pande group at Stanford).

10 of 16

  • We trained the SSL model with 3 GCN layers and hidden dimension 64, for 20 epochs at a learning rate of 0.001 (a configuration sketch follows this list).
  • The training and validation losses saturate within the 20 epochs.
  • We visualised the generated latent space of the molecules for the AB graph and the BAB graph.
  • The latent space does provide some separation, but not a clean one. (A classification task would have shown this better, as in previous works.)
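A sketch of an encoder matching the reported configuration (3 GCN layers, hidden dimension 64) in PyTorch Geometric; the pooling and activation choices are our assumptions:

```python
# Sketch of the reported configuration (assumes PyTorch Geometric):
# 3 GCN layers, hidden dimension 64, mean-pooled into one embedding per graph.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class Encoder(torch.nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.convs = torch.nn.ModuleList(
            [GCNConv(in_dim, hidden), GCNConv(hidden, hidden), GCNConv(hidden, hidden)]
        )

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))
        return global_mean_pool(x, batch)  # graph-level embedding

# Trained for 20 epochs with lr = 1e-3, as reported:
# opt = torch.optim.Adam(model.parameters(), lr=0.001)
```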

Results

SSL Pre-training

Figure: Visualisation of the principal components of the embedding space (panels: Atom | Bond graph and Bond-Angle | Bond graph).

Figure: Training and validation loss curves (InfoNCE loss vs. epochs).

11 of 16

X-attention

Structure-Property Multi-Modal foundation model

  • This approach was originally introduced to embed images and their captions into a joint embedding space.
  • The SPMM model applies it to learn molecular representations.
  • The training setup is similar to masked language modelling.
  • A single fusion encoder is used across different multi-modal tasks, which allows it to learn from embeddings of multiple modalities.

 

12 of 16

X-attention

Introduction

  • We use a fusion encoder that applies cross-attention to learn from our multi-modal embeddings (a minimal sketch of one fusion block follows the figure below).
  • We then use feed-forward layers to embed the representations into a joint space.
  • Finally, we train the model on 2D and 3D descriptors, representing the atom-bond and bond-angle graphs respectively.

Figure: The atom-bond and bond-angle graph embeddings pass through a linear layer into a fusion encoder (multi-modal, stacked ×4); the combined multi-modal embedding feeds the 2D and 3D property-prediction heads.
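A minimal sketch of one such fusion block in PyTorch, assuming token-shaped embeddings of dimension 64 and 4 attention heads (both our assumptions; the diagram only specifies that four blocks are stacked):

```python
# Minimal sketch (PyTorch) of one cross-attention fusion block: the AB embedding
# queries the BAB embedding, then a feed-forward layer maps into the joint space.
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.xattn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, dim * 2), nn.ReLU(), nn.Linear(dim * 2, dim))
        self.n1, self.n2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, ab, bab):
        # queries from one modality, keys/values from the other
        h, _ = self.xattn(query=ab, key=bab, value=bab)
        ab = self.n1(ab + h)              # residual + norm
        return self.n2(ab + self.ff(ab))  # feed-forward into the joint space

# fused = FusionBlock()(ab_tokens, bab_tokens)  # shapes: (batch, seq, dim)
```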

13 of 16

  • Additionally, a contrastive loss is used to push a molecule's atom-bond and bond-angle embeddings closer together and all other embedding pairs farther apart.

Application

X-attention

Figure: For a downstream molecule, the atom-bond and bond-angle graphs are encoded by the pre-trained inner blocks and fused by X-attention to produce the multi-modal embedding.

14 of 16

  • We selected a classification task and a regression task from the GEM paper.
  • For classification we used the ClinTox dataset (1,478 molecules): a binary classification problem predicting a molecule's toxicity.
  • For regression we used FreeSolv (642 molecules), which contains experimental hydration free energies of the molecules. (A sketch of the task heads follows.)
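A minimal sketch of the two task heads on top of the multi-modal embedding; the embedding size and layer shapes here are illustrative assumptions:

```python
# Minimal sketch (PyTorch): task heads on the multi-modal embedding.
import torch
import torch.nn as nn

embed_dim = 64                      # assumed size of the multi-modal embedding
clf_head = nn.Linear(embed_dim, 1)  # ClinTox: binary toxicity
reg_head = nn.Linear(embed_dim, 1)  # FreeSolv: hydration free energy

emb = torch.randn(8, embed_dim)     # stand-in batch of embeddings
tox_logits = clf_head(emb)          # train with nn.BCEWithLogitsLoss
energy = reg_head(emb)              # train with nn.MSELoss
```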

Step 6: Downstream predictions

Experiments

15 of 16

  • The SSL result does separate the embedding space, but further improvement could have led to better downstream results.
  • X-attention was able to generate better embeddings, but the model's generalisability can be improved further.
  • Our X-attention model was trained on a limited set of molecules, and hence it does not generalise well across downstream datasets containing a diverse set of molecules.
  • Instead of the current two-step approach, the SSL and X-attention methods could be combined when generating the molecular embeddings.

And further improvements

Conclusion

16 of 16

  • Fang, Xiaomin, et al. "Geometry-enhanced molecular representation learning for property prediction." Nature Machine Intelligence 4.2 (2022): 127-134.
  • Oord, Aaron van den, Yazhe Li, and Oriol Vinyals. "Representation learning with contrastive predictive coding." arXiv preprint arXiv:1807.03748 (2018).
  • Chang, J., and J. C. Ye. "Bidirectional Generation of Structure and Properties Through a Single Molecular Foundation Model." (2023).
  • Huh, J., S. Park, J. E. Lee, and J. C. Ye. "Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model." arXiv preprint arXiv:2303.00091 (2023).
  • Self-Supervised Learning: Self-Prediction and Contrastive Learning. Tutorial, NeurIPS 2021 (YouTube).
  • Understanding Graph Neural Networks. DeepFindr (YouTube).
  • Stanford CS224W Graph ML Tutorials.

Publications | Tutorials

References