1 of 16

Decomprolute: a benchmarking platform for proteomic tumor deconvolution

Sara Gosline, PhD

Pacific Northwest National Laboratory

Seattle, WA, USA

2 of 16

Tumor deconvolution helps us understand the immune system’s role in cancer

  • When we measure the gene or protein expression of a tumor, we are actually capturing a lot more than just the tumor
  • Cancer cells send signals to immune cells to recruit them, trick them, and use them for their own purposes
  • Deconvolution tells us what immune cells are present in each tumor

Icons from the Noun Project

3 of 16

There are tools designed to deconvolve bulk mRNA expression into cell types

Newman et al., Robust enumeration of cell subsets from tissue expression profiles, Nature Methods, 2015

4 of 16

The ability to deconvolve tumors at the mRNA level has led to important insights about cancer immunology

Thorsson et al., The Immune Landscape of Cancer, Immunity, 2018

  • We can sort thousands of tumors, across cancer types, into immune subtypes
  • These subtypes characterize the immune response and can alter patient prognosis

5 of 16

But proteins could be more informative than mRNA

NCI Proteomic Data Commons

  • Most immune cells are identified by protein markers
  • Proteomic measurements exist for more than 1,000 patient samples with matched mRNA
  • But no gold standard exists

6 of 16

Decomprolute

  • Goal:
    • Build a benchmark that assesses the quality of deconvolution based on proteomic data
  • Requirements:
    • Must work on CPTAC data (the largest matched mRNA/protein dataset known so far)
    • Must allow for additional cell type signatures (matrix files)
    • Must allow for facile addition of algorithms
    • Must implement distinct metrics that compare algorithms to each other

Feng et al, Decomprolute: A benchmarking platform designed for multiomics-based tumor deconvolution, BioRxiv 2023

7 of 16

Docker + CWL enable these requirements

  • Containerized API to access CPTAC-generated data
  • Containerized signature matrices used to define cell types
  • Containerized numerous deconvolution algorithms
  • Constructed three distinct metrics to use in lieu of a gold standard

8 of 16

mRNA vs protein

cwltool https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/main/metrics/mrna-prot/mrna-prot-comparison.cwl https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/main/metrics/mrna-prot/alg-test.yml

9 of 16

Generates an all-vs-all comparison of mRNA and protein

  • Run each algorithm on mRNA and protein from matched CPTAC data
  • Evaluate the mean mRNA-protein correlation for each dataset across cell types
  • Compare which algorithms agree more between mRNA and protein
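
The agreement score described above can be sketched in a few lines. This is a minimal illustration, not the Decomprolute implementation: the function names `pearson` and `mean_celltype_correlation`, the use of Pearson correlation, and the toy fractions are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def mean_celltype_correlation(mrna, prot):
    """Average, over shared cell types, of the mRNA-vs-protein correlation
    of estimated fractions across matched samples."""
    shared = sorted(set(mrna) & set(prot))
    cors = [pearson(mrna[c], prot[c]) for c in shared]
    cors = [c for c in cors if c == c]  # drop NaN entries
    return sum(cors) / len(cors) if cors else float("nan")


# Toy estimated fractions for four matched samples (hypothetical values).
mrna_fracs = {"T cell": [0.10, 0.20, 0.30, 0.40], "B cell": [0.40, 0.30, 0.20, 0.10]}
prot_fracs = {"T cell": [0.12, 0.18, 0.33, 0.41], "B cell": [0.10, 0.20, 0.30, 0.40]}

score = mean_celltype_correlation(mrna_fracs, prot_fracs)
print(f"mean correlation across cell types: {score:.3f}")
```

In this toy case the T cell estimates agree closely while the B cell estimates are reversed, so the mean lands near zero. Running the same score for every algorithm and dataset pair gives the all-vs-all view.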

10 of 16

Data simulation comparison

cwltool https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/main/metrics/data-sim/simul-data-comparison.cwl https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/main/metrics/data-sim/prot-sim-test.yml

11 of 16

Data simulation comparison

  • Enables side-by-side comparison of signature matrices
  • Which one best correlates with data simulated from the same matrix?
  • How does this vary across cell types?
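
A rough sketch of the simulation idea: mix the columns of a signature matrix using known fractions, deconvolve the mixture, and correlate the estimates with the truth for each cell type. Everything here is assumed for illustration: the 4-protein, 2-cell-type signature, the Gaussian noise model, and the projected-gradient non-negative least squares solver, which stands in for the containerized algorithms and is not one of them.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def simulate_bulk(signature, fractions, noise_sd, rng):
    """Bulk profile = signature (proteins x cell types) times fractions, plus noise."""
    return [sum(s * f for s, f in zip(row, fractions)) + rng.gauss(0.0, noise_sd)
            for row in signature]


def deconvolve_nnls(signature, bulk, steps=500):
    """Non-negative least squares by projected gradient descent (stand-in solver)."""
    k = len(signature[0])
    f = [1.0 / k] * k
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its reciprocal is a safe step size.
    lr = 1.0 / sum(v * v for row in signature for v in row)
    for _ in range(steps):
        resid = [sum(s * x for s, x in zip(row, f)) - b
                 for row, b in zip(signature, bulk)]
        grad = [sum(row[j] * r for row, r in zip(signature, resid)) for j in range(k)]
        f = [max(0.0, x - lr * g) for x, g in zip(f, grad)]  # project onto f >= 0
    total = sum(f)
    return [x / total for x in f] if total > 0 else f


# Hypothetical signature matrix: rows are proteins, columns are cell types.
signature = [[10.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 10.0]]
rng = random.Random(0)

true_fracs, est_fracs = [], []
for _ in range(20):
    a = rng.uniform(0.1, 0.9)
    truth = [a, 1.0 - a]
    bulk = simulate_bulk(signature, truth, noise_sd=0.1, rng=rng)
    true_fracs.append(truth)
    est_fracs.append(deconvolve_nnls(signature, bulk))

# One correlation per cell type: how well are the simulated fractions recovered?
per_cell_type = [pearson([t[j] for t in true_fracs], [e[j] for e in est_fracs])
                 for j in range(len(signature[0]))]
print(per_cell_type)
```

Repeating this for each signature matrix, and reading the per-cell-type correlations side by side, corresponds to the two questions above.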

12 of 16

Immune subtype comparison

cwltool https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/main/metrics/imm-subtypes/pan-can-immune-preds.cwl https://raw.githubusercontent.com/PNNL-CompBio/decomprolute/main/metrics/imm-subtypes/imm-args.yml

13 of 16

Last test: comparison to known immune subtypes

  • Here we leverage the TCGA immune subtypes, based on mRNA
  • Compare each patient's subtype to that patient's estimated immune composition
  • The goal is to identify the algorithm that best stratifies subtypes by cell type (xCell)
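
One way to quantify how well an algorithm stratifies subtypes is a one-way ANOVA F-statistic on the estimated fraction of a single cell type, grouped by subtype. The slides do not say which statistic is used, so the F-statistic, the function name `anova_f`, and the toy values below are assumptions for illustration only.

```python
from collections import defaultdict


def anova_f(values, labels):
    """One-way ANOVA F-statistic: between-group over within-group variance."""
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    ss_between = sum(len(vs) * (means[g] - grand) ** 2 for g, vs in groups.items())
    ss_within = sum((v - means[g]) ** 2 for g, vs in groups.items() for v in vs)
    if k < 2 or n <= k or ss_within == 0:
        return float("nan")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data: immune subtype label per patient, and the fraction of one cell type
# estimated by two hypothetical algorithms.
subtypes = ["C1", "C1", "C1", "C2", "C2", "C2"]
algo_a = [0.10, 0.12, 0.11, 0.30, 0.32, 0.31]  # separates the subtypes cleanly
algo_b = [0.20, 0.10, 0.30, 0.25, 0.15, 0.20]  # same mean in both subtypes

f_a = anova_f(algo_a, subtypes)
f_b = anova_f(algo_b, subtypes)
print(f"algorithm A: F = {f_a:.1f}, algorithm B: F = {f_b:.1f}")
```

A larger F means the cell-type estimate differs more between subtypes than within them, so in this toy case algorithm A stratifies better. A rank-based test would be a reasonable alternative when the fractions are far from normally distributed.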

14 of 16

Tumor deconvolution is only the beginning

  • Computational algorithm development in bioinformatics requires
    • Standardization of data
    • Containerization with common APIs
    • Agreed-upon metrics
  • Containerization and workflows let us improve algorithm development using benchmark datasets

15 of 16

Acknowledgements

  • Song Feng
  • Anna Calinawan
  • Pietro Pugliese
  • Pei Wang
  • Michele Ceccarelli
  • Francesca Petralia
  • The CPTAC PanCan Immune Working Group

16 of 16

We can extend this test across all signature matrices and all cancer types

  • Leverages the extensive work on RNA-based deconvolution methods
  • Allows us to choose the ‘best’ algorithm for any proteogenomic dataset