4D Nucleome Hackathon
Project 3: In silico variant prioritization using sequence-based predictive models
March 21st, 2024
Seattle, WA
Shu, Katie, Julia, Yang, Leo
SuPreMo generates mutated sequences for ISM
Gjoni and Pollard, bioRxiv (2023)
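A minimal sketch of the in silico mutagenesis (ISM) idea, assuming plain single-base substitutions over a window. This is not SuPreMo's code; the function name ism_variants and the example sequence are hypothetical and only illustrate how mutated sequences can be enumerated before scoring them with a model.

```python
from typing import Iterator, Tuple

NUCLEOTIDES = "ACGT"


def ism_variants(seq: str, start: int, end: int) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, alt_base, mutated_sequence) for every
    single-base substitution in the half-open window seq[start:end]."""
    seq = seq.upper()
    for pos in range(start, end):
        ref = seq[pos]
        for alt in NUCLEOTIDES:
            if alt == ref:
                continue  # skip the reference base itself
            yield pos, alt, seq[:pos] + alt + seq[pos + 1:]


# Hypothetical toy sequence: mutate positions 1 and 2 (0-based).
variants = list(ism_variants("ACGTAC", 1, 3))
```

Each mutated sequence would then be passed to a predictive model and compared against the prediction for the reference sequence.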
Akita predicts 3D genome folding from DNA sequence alone
Fudenberg et al., Nature Methods (2020)
Our goals for the Hackathon
Implementing SuPreMo-Enformer
Testing SuPreMo-Enformer
Data: tumor SVs called by Manta
Method: Enformer applied to investigate the influence of SVs on the predicted CTCF ChIP signal in HFF cells
*BNDs are assigned a fixed length of 30,000 for visualization purposes
A closer look at a representative case
González-Rico, Francisco J., et al. "Alu retrotransposons modulate Nanog expression through dynamic changes in regional chromatin conformation via aryl hydrocarbon receptor." Epigenetics & Chromatin 13 (2020): 1-13.
Implementing SuPreMo-ExPecto
{
"beluga_model_file": "/home/yang/Project/SuPreMo/ExPecto_model/resources/deepsea.beluga.pth",
"is_cuda": "False",
"model_list_file": "/home/yang/Project/SuPreMo/ExPecto_model/resources/modellist",
"gene_tss_file": "/home/yang/Project/SuPreMo/ExPecto_model/resources/geneanno.hg38.sorted.bed.gz",
"model_dir": "/home/yang/Project/SuPreMo/ExPecto_model",
"model_name": ["Fibroblast of Lung", "Small Intestine", "Small Intestine Terminal Ileum"],
"threads": 16,
"n_features": 2002,
"fixed_dist": 100,
"maxshift": 800,
"split_flag": 1,
"split_index": 0,
"split_fold": 10,
"verbose_level": 0
}
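The configuration above stores is_cuda as the string "False". In Python, any non-empty string is truthy, so bool("False") evaluates to True. The sketch below shows one way a loader could normalize that field; load_config and parse_bool are hypothetical helpers, not part of SuPreMo or ExPecto, and the example JSON is a shortened copy of the config.

```python
import json


def parse_bool(value) -> bool:
    """Convert JSON booleans or strings such as "False" into a real bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise ValueError(f"Cannot interpret {value!r} as a boolean")


def load_config(text: str) -> dict:
    """Parse the JSON text and normalize the fields checked here."""
    config = json.loads(text)
    config["is_cuda"] = parse_bool(config.get("is_cuda", False))
    if not isinstance(config.get("model_name"), list):
        raise ValueError("model_name is expected to be a list of tissue names")
    return config


# Shortened, hypothetical copy of the config shown above.
example = '{"is_cuda": "False", "model_name": ["Fibroblast of Lung"], "threads": 16}'
config = load_config(example)
```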
Implementing SuPreMo-ExPecto - Cont’d
IRGM1 gene
Incorporating personalized genome into SuPreMo
Adding P-values for disruption scores
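The slides do not state how the P-values are computed. One common option is an empirical P-value against a null distribution of disruption scores, sketched below. The function name empirical_p_value, the source of the null scores (for example, scores from shuffled or random control variants) and the numbers are all assumptions made for illustration.

```python
from typing import Sequence


def empirical_p_value(observed: float, null_scores: Sequence[float]) -> float:
    """One-sided empirical P-value: fraction of null scores at least as
    large as the observed score, with a +1 pseudocount so the result is
    never exactly zero."""
    if len(null_scores) == 0:
        raise ValueError("null_scores must not be empty")
    at_least_as_extreme = sum(1 for s in null_scores if s >= observed)
    return (1 + at_least_as_extreme) / (1 + len(null_scores))


# Hypothetical numbers: one of four null scores exceeds the observed score.
p = empirical_p_value(0.9, [0.1, 0.2, 0.95, 0.3])
```

With this estimator, the smallest attainable P-value is 1 / (1 + number of null scores), so the size of the null set limits the resolution.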
Calculating SuPreMo scores with functional weights
Weights examples
SuPreMo implementation
Regions of interest (--roi):
Weights (--roi_weights):
Weighted scores highlight variants that disrupt relevant regions
[Figure: disruption track and ATAC-seq weights, with annotated values 1.283e-6, 1.282e-6, 0.052, 0.087, 0.065, 0.049]
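A minimal sketch of how a weighted score could be derived from a per-bin disruption track and per-bin functional weights such as ATAC-seq signal. The weighted-average formula, the function name weighted_score and the toy values are assumptions; the slides name the --roi and --roi_weights options but do not give the exact formula SuPreMo uses.

```python
from typing import Sequence


def weighted_score(disruption: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of a per-bin disruption track.

    Bins with larger weights (e.g. accessible chromatin) contribute more;
    bins with zero weight are ignored."""
    if len(disruption) != len(weights):
        raise ValueError("disruption and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(d * w for d, w in zip(disruption, weights)) / total


# Hypothetical toy track: the strongest disruption falls in the
# most heavily weighted bin.
disruption = [0.0, 0.2, 0.8, 0.1]
weights = [0.0, 1.0, 3.0, 0.0]
score = weighted_score(disruption, weights)
unweighted = sum(disruption) / len(disruption)
```

In this toy case the weighted score is higher than the unweighted mean because the disruption coincides with the highly weighted bin, which is the behavior the slide title describes.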
Sequence mutagenesis options with SuPreMo
Parameters: (1) % GC deletion (2) mutate flanking regions of the variant
Future parameters: (1) Apply to ALT/ALT-mutated sequences
SuPreMo-Akita scores
Variant (CTCF site) GC and nucleotide shuffling mutagenesis
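A minimal sketch of nucleotide shuffling around a variant, assuming the site is given as 0-based half-open coordinates and that flanking regions are included through a flank parameter. The function name shuffle_region, its parameters and the toy sequence are hypothetical and do not reflect SuPreMo's actual options.

```python
import random


def shuffle_region(seq: str, start: int, end: int, flank: int = 0, seed: int = 0) -> str:
    """Shuffle the nucleotides of seq[start:end], optionally extended by
    `flank` bases on each side. Shuffling preserves the base composition
    (and therefore the GC content) of the shuffled window."""
    lo = max(0, start - flank)
    hi = min(len(seq), end + flank)
    region = list(seq[lo:hi])
    random.Random(seed).shuffle(region)  # seeded for reproducibility
    return seq[:lo] + "".join(region) + seq[hi:]


# Hypothetical toy sequence: shuffle positions 6-10 plus 2 flanking bases.
seq = "AAAATTCCGGACGTTTTT"
mutated = shuffle_region(seq, 6, 10, flank=2, seed=1)
```

Because the composition is unchanged, any difference in the model's prediction can be attributed to the disrupted base order rather than to a shift in GC content.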
TFBS mutagenesis options with SuPreMo
Parameters: (1) INV/shuffling of TFBS (2) mutate flanking regions of the variant
Future parameters: (1) DEL/DUP of TFBS (2) Apply to ALT/ALT-mutated sequences
SuPreMo-Akita scores
TFBS INV and shuffling mutagenesis
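A minimal sketch of TFBS inversion, assuming that INV means replacing the site with its reverse complement in place, as is usual for inversion variants. The function name invert_region and the toy sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def invert_region(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] (0-based, half-open) with its reverse
    complement, leaving the flanking sequence untouched."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("start and end must satisfy 0 <= start <= end <= len(seq)")
    site = seq[start:end]
    inverted = site.translate(COMPLEMENT)[::-1]
    return seq[:start] + inverted + seq[end:]


# Hypothetical toy sequence with a 3-bp "site" at positions 3-5.
inverted_seq = invert_region("AAACCGTTT", 3, 6)
restored_seq = invert_region(inverted_seq, 3, 6)
```

Inverting the same interval twice restores the original sequence, which is a quick sanity check for the coordinates.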
Conclusions
THANK YOU!
Acknowledgements
Katie Pollard (UCSF)
Peter J. Park (HMS)
Alexey I. Nesvizhskii (UMich)
Jian Ma (CMU)