The Physics of Transcriptomes
MCB137L/237L
Spring 2025
What Does the mRNA Distribution Tell Us About How Transcription Happens?
Zenklusen et al. (2008)
Testing the Null Hypothesis: Deviations from Poisson Reveal Molecular Mechanism
Zenklusen et al. (2008)
And, Yet, Our Ignorance is Vast
Regulatory Ignorance Throughout Our Model Organisms
The Cell as a Bag of RNA
Technologies Driving the DNA Sequencing Revolution
From Gene-Wide to Genome-Wide Studies
A Deluge of Sequencing Data
A Feeling for the Scale of Our Sequencing Data
Estimating Book Lengths
Our Collective Memory: The Library of Congress
Number of Letters and Their Meaning
The Sequence Read Archive Versus Shakespeare
Fidelity in Biological Polymerization: Key Question, Are We Surprised?
The Insufficiency of Equilibrium Molecular Recognition
A Toy Model of Translation
The Kinetic Proofreading Idea: Energy to Fuel Error Correction
It’s Not Just About Information Amount, It’s Also About Information Density
Predictive Understanding of Cellular Decision Making Through the Theory-Experiment Dialogue
Precision Measurements to Fuel the Theory-Experiment Dialogue: Measuring Protein Expression
Precision Measurements to Fuel the Theory-Experiment Dialogue: Measuring mRNA Expression
Perrin’s Take on Precision Measurements and Reproducibility
The Meaning of Precision Measurements
Demanding Quantitative Agreement Between Measurements: The Example of Mass Spectrometry
Demanding Quantitative Agreement Between Measurements: Flow Cytometry vs. Microscopy
Demanding Quantitative Agreement: smFISH vs. Enzymatic Assays vs. Microscopy
Querying the Transcriptome at the Single Cell Level
The Single-Cell Sequencing Revolution
The Tabula Muris Project
Assume That We Have a Constitutive Promoter
The mRNA Distribution in Space and Time
The Poisson Distribution Is Fully Determined by One Parameter
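A minimal sketch of what "one parameter" means in practice: once the mean mRNA count is fixed, the variance follows, so their ratio (the Fano factor) is 1. The function names, the cutoff `m_max`, and the value `lam=10.0` are illustrative assumptions, not taken from the slides.

```python
import math

def poisson_pmf(m, lam):
    """Probability of finding m mRNA molecules when the mean count is lam (> 0)."""
    # Work in log space so that large m does not overflow the factorial.
    return math.exp(m * math.log(lam) - lam - math.lgamma(m + 1))

def poisson_moments(lam, m_max=200):
    """Mean and variance obtained by summing the distribution up to m_max."""
    probs = [poisson_pmf(m, lam) for m in range(m_max + 1)]
    mean = sum(m * p for m, p in enumerate(probs))
    var = sum((m - mean) ** 2 * p for m, p in enumerate(probs))
    return mean, var

# Hypothetical mean of 10 mRNA molecules per cell.
mean, var = poisson_moments(lam=10.0)
fano = var / mean  # close to 1 for a Poisson distribution
print(mean, var, fano)
```

A measured Fano factor well above 1 is the kind of deviation from Poisson that the earlier slides use as a test of the null hypothesis.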
A Physical Model of the Single-Cell Sequencing Process
A Dishonest Coin Flip Decides Whether Each mRNA Will Be Sequenced
The Statistics of Coin Flips
The Order of Coin Flips Doesn’t Matter
The Statistics of Coin Flips
The Binomial Distribution: One of the Great Probability Distributions
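The coin-flip picture above can be sketched numerically, assuming each of N mRNA molecules is captured independently with the same probability p. The values `N_mrna = 20` and `p_capture = 0.1` are hypothetical, chosen only for illustration.

```python
import math

def binomial_pmf(n, N, p):
    """Probability that n of N mRNA molecules are captured, each with probability p."""
    # math.comb counts the orderings of heads and tails that give the same n.
    return math.comb(N, n) * p**n * (1 - p) ** (N - n)

N_mrna = 20      # hypothetical number of mRNA molecules in the cell
p_capture = 0.1  # hypothetical capture probability per molecule

probs = [binomial_pmf(n, N_mrna, p_capture) for n in range(N_mrna + 1)]
total = sum(probs)                                       # normalization check
mean_captured = sum(n * q for n, q in enumerate(probs))  # equals N * p
p_zero = probs[0]                                        # chance of capturing nothing
print(total, mean_captured, p_zero)
```

The binomial coefficient is where "the order of coin flips doesn't matter" enters: every sequence with n captures has the same probability, so the sequences are simply counted.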
Add Savage Rosenfeld and other explanations of the Binomial distribution
What Happens With the mRNA Molecules That Were Not Captured?
A Measure of Our Precision: The Debate over Zero Inflation
A Challenge to Quantitative Single-Cell RNA Sequencing: Zero Inflation and Dropout Probability
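One way to make the question concrete is to simulate the null expectation. This sketch assumes Poisson-distributed mRNA counts (a constitutive promoter) and an independent capture coin flip per molecule. Under those assumptions the captured counts are again Poisson, with mean p times lambda, so the expected fraction of zeros is exp(-p * lambda). The parameter values, the seed, and the helper names are assumptions made for this example.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson count using Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def capture(n_mrna, p, rng):
    """Flip a biased coin for each mRNA and count how many are captured."""
    return sum(1 for _ in range(n_mrna) if rng.random() < p)

rng = random.Random(0)   # fixed seed so the run is reproducible
lam, p = 5.0, 0.2        # hypothetical mean mRNA count and capture probability
n_cells = 20000

counts = [capture(sample_poisson(lam, rng), p, rng) for _ in range(n_cells)]
zero_fraction = counts.count(0) / n_cells
expected_zero = math.exp(-p * lam)  # zeros expected from sampling alone
mean_count = sum(counts) / n_cells
print(zero_fraction, expected_zero, mean_count)
```

In this picture, "zero inflation" would mean observing noticeably more zeros than `expected_zero`; zeros at or near that level need no extra dropout mechanism.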
SANITY: Assigning Error Bars to scRNA-Seq Data
Querying the Transcriptome
Querying the Transcriptome at the Single Cell Level
The Single-Cell Sequencing Revolution
The Tabula Muris Project
Cellular Decisions Are Often Driven by a Handful of Genes
Our Toy Model: 2D Synthetic Transcriptome
Identifying Cell Types in a 2D Synthetic Transcriptome
Finding the Right Coordinate System to Describe our Data
Key Idea: Finding the “Right” Coordinates
A Toy Model From Mechanics of the Key Idea: Finding the “Right” Coordinates For Two Coupled Oscillators
(Berman et al.)
Solving the Coupled Oscillators in a Bad Coordinate System
(Berman et al.)
Plotting The Two Coordinates Together Reveals Structure
Finding the “Right” Coordinates
(Berman et al.)
The “Right” Coordinates Reveal the Natural Variables of the System
Several Ways of Looking at the Problem: One from Mechanics, One as Data
(Berman et al.)
The Covariance Matrix of Our Rotated Data
Eigenvectors and Eigenvalues
The “Right” Coordinates Reveal the Natural Variables of the System
The Eigenvectors of the Covariance Matrix Define the Normal Modes (or Principal Components)
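A minimal sketch of this statement for two-dimensional data, using only the standard library. The data are synthetic: a cloud that is wide along one axis and narrow along the other, rotated by a hypothetical 30 degrees. The closed-form eigen-decomposition below applies only to symmetric 2x2 matrices; all names and parameter values are assumptions for illustration.

```python
import math
import random

def covariance_2d(xs, ys):
    """Entries (cxx, cxy, cyy) of the sample covariance matrix."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    cyy = sum((y - my) ** 2 for y in ys) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return cxx, cxy, cyy

def eigen_symmetric_2x2(cxx, cxy, cyy):
    """Eigenvalues (largest first) and unit eigenvectors of [[cxx, cxy], [cxy, cyy]]."""
    half_trace = 0.5 * (cxx + cyy)
    radius = math.hypot(0.5 * (cxx - cyy), cxy)
    lam1, lam2 = half_trace + radius, half_trace - radius
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)  # angle of the major axis
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return (lam1, lam2), (v1, v2)

# Synthetic rotated data: standard deviation 3 along one axis, 0.5 along the other.
rng = random.Random(1)
angle = math.radians(30)
c, s = math.cos(angle), math.sin(angle)
xs, ys = [], []
for _ in range(5000):
    a, b = rng.gauss(0.0, 3.0), rng.gauss(0.0, 0.5)
    xs.append(a * c - b * s)
    ys.append(a * s + b * c)

cxx, cxy, cyy = covariance_2d(xs, ys)
(lam1, lam2), (v1, v2) = eigen_symmetric_2x2(cxx, cxy, cyy)
print(lam1, lam2, v1)
```

The first eigenvector should point close to the 30 degree direction used to build the data, and the eigenvalues should sit near the two input variances (9 and 0.25).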
Your Turn: A Synthetic Transcriptome Made of Two Constitutive Promoters
Your Turn: Creating a Synthetic Transcriptome
Your Turn: Finding the Right Coordinate System to Describe our Data
Projecting Data Using the Dot Product
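As a small illustration of the projection step: the coordinate of a data point along a new unit axis is its dot product with that axis. The points and the 45 degree axes below are hypothetical, chosen so the result can be checked by hand.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def project(points, axes):
    """Coordinates of each point along each (unit-length) axis."""
    return [tuple(dot(p, axis) for axis in axes) for p in points]

# Hypothetical new coordinate system: the original axes rotated by 45 degrees.
angle = math.radians(45)
axis_1 = (math.cos(angle), math.sin(angle))
axis_2 = (-math.sin(angle), math.cos(angle))

points = [(1.0, 1.0), (2.0, 2.0), (-1.0, 1.0)]
coords = project(points, (axis_1, axis_2))
print(coords)
```

The first two points lie along `axis_1`, so their second coordinate vanishes; the third lies along `axis_2`, so its first coordinate vanishes.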
Finding the Right Coordinate System to Describe our Data
The Eigenvalues Report on the Spread of the Data Along Each Dimension
The Error in a Reduced Description of our System
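One common way to quantify this error, given here as an assumption since the slide's own definition may differ, is the fraction of total variance carried by the discarded components. With the eigenvalues of the covariance matrix sorted from largest to smallest, keeping the first k of d components gives:

```latex
\epsilon_k = \frac{\sum_{i=k+1}^{d} \lambda_i}{\sum_{i=1}^{d} \lambda_i},
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0 .
```

For a 2D data set reduced to one coordinate this is simply \lambda_2 / (\lambda_1 + \lambda_2), which is small whenever the data are much more spread out along the first eigenvector than along the second.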
The 3D Synthetic Transcriptome
Finding the Natural Coordinates of Our Synthetic Transcriptome
Dimensionality Reduction: Most of the Relevant Information Lives on a Plane
The Error in a Reduced Description of our System
Homework: Adding Downstream Genes
Homework: Adding Noisy, Uncorrelated Genes
A More Common Definition of PCA
Quantifying C. elegans Movement and Shape
The Eigenworm!
A Simple Synthetic Transcriptome
Finding Cell Types in the Transcriptome Using k-means Clustering
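A minimal sketch of k-means (Lloyd's algorithm) on a tiny 2D synthetic transcriptome. Each cell is a hypothetical pair of expression levels (gene 1, gene 2); the data and the starting centroids are made up for illustration, and a real analysis would typically use a library implementation with several random restarts.

```python
import math

def kmeans(points, centroids, n_iter=20):
    """Lloyd's algorithm: alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each cell joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical cells: three with low expression of both genes, three with high.
cells = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (10.0, 11.0), (11.0, 9.0), (12.0, 10.0)]
labels, centroids = kmeans(cells, centroids=[(0.0, 0.0), (12.0, 12.0)])
print(labels, centroids)
```

The result depends on the initial centroids and on the chosen number of clusters, both of which are inputs here rather than something the algorithm discovers.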