Benchmarking of Mutational Signature Assignment Tools
Xi (Sam) Wang, Dr. Marcos Díaz-Gay, Dr. Raviteja Vangara,
Dr. Ludmil B. Alexandrov
Somatic Mutations: changes in DNA
Figure from Wikipedia: somatic mutation
Somatic mutations are known to be generated by exposures
Alexandrov, L. B. et al. Nature 500, 415–21 (2013).
Patterns of mutations are linked to sources of DNA damage
Environmental exposures
Tobacco smoking or chewing
Failure in DNA replication or repair
Aberrant mismatch repair pathway
Normal cellular activities
Spontaneous deamination of methylated cytosines
Helleday, T., Eshtad, S. & Nik-Zainal, S. Nat. Rev. Genet. 15, 585–598 (2014).
Mutational Signature: defined by base substitutions and context
Six classes of single-base mutations
Reported by the pyrimidine of the mutated base pair (C>A, C>G, C>T, T>A, T>C, T>G)
Adding 5’ and 3’ adjacent bases
96 possibilities considering context
Tate, J. G. et al. Nucleic Acids Res. 47, D941–D947 (2018). | https://cancer.sanger.ac.uk/cosmic/signatures_v2
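The counting above (six substitution classes, each flanked by one of four 5' bases and one of four 3' bases) can be sketched in a few lines. This is a minimal illustration only; the function names `sbs96_channels` and `to_pyrimidine` are hypothetical, and the ordering of the channels is an arbitrary choice that need not match the ordering used by COSMIC or by any tool.

```python
from itertools import product

BASES = "ACGT"
# The six substitution classes, written from the pyrimidine (C or T) of the mutated base pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sbs96_channels():
    """Return the 96 mutation types as strings such as 'A[C>A]A'."""
    return [
        f"{five}[{sub}]{three}"
        for sub, five, three in product(SUBSTITUTIONS, BASES, BASES)
    ]


def to_pyrimidine(five, ref, alt, three):
    """Write a mutation from the pyrimidine strand.

    If the reference base is a purine (A or G), the mutation is
    reverse-complemented: every base is complemented and the 5' and 3'
    neighbours swap places.
    """
    if ref in "AG":
        five, three = three.translate(COMPLEMENT), five.translate(COMPLEMENT)
        ref, alt = ref.translate(COMPLEMENT), alt.translate(COMPLEMENT)
    return f"{five}[{ref}>{alt}]{three}"


channels = sbs96_channels()
print(len(channels))                       # 6 classes x 4 x 4 contexts
print(to_pyrimidine("T", "G", "T", "C"))   # a G>T call rewritten on the pyrimidine strand
```

A G>T change in the context T_C is the same event as a C>A change in the context G_A on the opposite strand, which is why only six classes are needed.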
Signature Assignment
X = W × H
X: Mutation Matrix (given)
W: Signatures Matrix (standard cosmic_v3)
H: Activities/Exposures Matrix (WHAT WE WANT!)
Mutation Type | Sample 1 | Sample 2 | … |
A[C>A]A | | | |
A[C>A]C | | | |
… | | | |
T[C>G]T | | | |
Mutation Type | SBS1 | SBS2 | … |
A[C>A]A | | | |
A[C>A]C | | | |
… | | | |
T[C>G]T | | | |
Signatures | Sample 1 | Sample 2 | … |
SBS1 | | | |
SBS2 | | | |
… | | | |
SBS60 | | | |
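To make the factorization concrete, here is a toy sketch of assignment for one sample: given the mutation counts (one column of the mutation matrix) and a fixed signatures matrix, estimate non-negative activities. The solver shown is a simple multiplicative update, chosen because it needs no external libraries. It is an illustrative stand-in and an assumption of this sketch, not the optimizer used by any of the benchmarked tools. The matrix values are made up and the function names are hypothetical.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def assign_activities(W, x, n_iter=500, eps=1e-12):
    """Estimate non-negative activities h so that W times h approximates x.

    Multiplicative updates keep every entry of h non-negative and reduce the
    squared reconstruction error at each step.
    """
    Wt = transpose(W)
    n_sigs = len(Wt)
    h = [sum(x) / n_sigs] * n_sigs      # start from an even split of the mutations
    Wt_x = matvec(Wt, x)
    for _ in range(n_iter):
        Wt_Wh = matvec(Wt, matvec(W, h))
        h = [hk * num / (den + eps) for hk, num, den in zip(h, Wt_x, Wt_Wh)]
    return h


# Toy signatures matrix: 4 mutation types (rows) x 2 signatures (columns).
# Each column sums to 1, like a real signature. Values are invented.
W = [[0.7, 0.1],
     [0.1, 0.2],
     [0.1, 0.3],
     [0.1, 0.4]]
true_h = [300.0, 100.0]      # ground-truth activities for one sample
x = matvec(W, true_h)        # the sample's mutation counts
h = assign_activities(W, x)
print([round(v, 2) for v in h])   # close to the ground truth [300.0, 100.0]
```

In a real setting W would have 96 rows and one column per reference signature, and the counts would include sampling noise, so the recovered activities would only approximate the truth.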
In other words:
Mutational signature assignment is the process of finding which known signatures contribute most to an individual’s mutations, and therefore which factors are the likely causes of that individual’s potential or existing cancer.
An accurate assignment means accurate discovery of causes of cancer!
Problem
There are no systematic benchmarks for all published mutational signature assignment tools.
🤷♀️🤷♂️
So, what do we do in this project? Benchmark them all!
Current published signature assignment tools:
How do we benchmark?
Current Progress?
Qualitative & Quantitative
Qualitative Analysis Method
* Example shown: scenario 2 without noise, ground-truth activities vs. activities assigned by tool 03_deconstructSigs
Figure: Ground Truth vs. Assignment from 03_deconstructSigs
Qualitative Analysis Result: all scenarios
Greater than 0
Cluster 1: QPSig, SigsPack, SignatureEstimation_QP
Cluster 2: MutationalPatterns, YAPSA (normal), MutationalCone
Qualitative Analysis Result: all scenarios
Greater than 0
Greater than 1% TMB
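The two detection thresholds above can be sketched as a presence/absence comparison. Several things here are assumptions, since the slides do not spell them out: TMB is taken as the sample's total number of assigned mutations, each profile is thresholded against its own total, and precision and recall are used as example summary metrics. The function names and the activity values are hypothetical.

```python
def detected(activities, threshold_fraction=0.0):
    """Signatures whose activity exceeds the given fraction of the sample's TMB."""
    tmb = sum(activities.values())          # assumption: TMB = total activity
    cutoff = threshold_fraction * tmb
    return {sig for sig, value in activities.items() if value > cutoff}


def presence_metrics(truth, assigned, threshold_fraction=0.0):
    """Compare which signatures are called present in truth vs. assignment."""
    true_set = detected(truth, threshold_fraction)
    called_set = detected(assigned, threshold_fraction)
    tp = len(true_set & called_set)
    fp = len(called_set - true_set)
    fn = len(true_set - called_set)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Invented activities for one sample.
truth = {"SBS1": 500, "SBS4": 1500, "SBS5": 0}
assigned = {"SBS1": 480, "SBS4": 1505, "SBS5": 15}

print(presence_metrics(truth, assigned, 0.0))    # "greater than 0"
print(presence_metrics(truth, assigned, 0.01))   # "greater than 1% TMB"
```

With the stricter cutoff, the small spurious SBS5 assignment no longer counts as a detection, which is the motivation for looking at both thresholds.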
A step forward: Quantitative Analysis Method
Sum of Absolute differences by TMB
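One plausible reading of this metric, sketched below, is the sum over signatures of the absolute difference between ground-truth and assigned activities, divided by the sample's TMB. The normalization by the ground-truth total is an assumption of this sketch, and the function name `sad_by_tmb` and the activity values are hypothetical.

```python
def sad_by_tmb(truth, assigned):
    """Sum of absolute activity differences, normalized by the sample's TMB.

    Assumption: TMB is the total of the ground-truth activities.
    Signatures missing from either dictionary count as zero activity.
    """
    tmb = sum(truth.values())
    if tmb == 0:
        raise ValueError("sample has no mutations; the metric is undefined")
    signatures = set(truth) | set(assigned)
    total = sum(abs(truth.get(s, 0.0) - assigned.get(s, 0.0)) for s in signatures)
    return total / tmb


# Invented activities for one sample.
truth = {"SBS1": 500, "SBS4": 1500, "SBS5": 0}
assigned = {"SBS1": 480, "SBS4": 1505, "SBS5": 15}

# |500-480| + |1500-1505| + |0-15| = 40 mutations misassigned out of 2000.
print(sad_by_tmb(truth, assigned))
```

A value of 0 means a perfect assignment. Dividing by TMB makes samples with very different mutation counts comparable.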
Quantitative Analysis Result: sum of absolute differences by TMB
Quantitative Analysis Result: average cosine similarity (cos_sim)
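For reference, average cosine similarity can be computed as below. The slides do not state which vectors are compared, so this sketch assumes per-sample activity vectors (ground truth vs. assigned). The same function would apply equally to original vs. reconstructed 96-channel profiles. Function names and numbers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_cos_sim(truth_samples, assigned_samples):
    """Mean cosine similarity over paired samples."""
    sims = [cosine_similarity(t, a) for t, a in zip(truth_samples, assigned_samples)]
    return sum(sims) / len(sims)


# Invented activity vectors (one list per sample, one entry per signature).
truth_samples = [[500, 1500, 0], [200, 0, 800]]
assigned_samples = [[480, 1505, 15], [210, 30, 760]]
print(average_cos_sim(truth_samples, assigned_samples))
```

Cosine similarity ignores overall scale, so it complements the sum of absolute differences, which is sensitive to the number of mutations assigned.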
Conclusion
Future Impacts!
Thank you! Questions?