Total Length of Genes = 50 Million Base Pairs
Human Genome
Full Length = 3 Billion Base Pairs
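As a quick check on the two numbers above, the share of the genome occupied by genes can be computed directly. This is a minimal sketch: the constant names are illustrative, and the inputs are the rounded figures from this slide rather than exact assembly statistics.

```python
# Rounded figures from the slide, in base pairs.
TOTAL_GENE_LENGTH_BP = 50_000_000
GENOME_LENGTH_BP = 3_000_000_000

# Fraction of the genome covered by genes.
gene_fraction = TOTAL_GENE_LENGTH_BP / GENOME_LENGTH_BP

print(f"Genes cover about {gene_fraction:.1%} of the genome")
```

With these rounded inputs, genes account for under 2% of the full sequence; the rest is the intergenic ("non-gene") part.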
The flow of genetic information from DNA to mRNA to proteins
Identical genetic information is stored in the DNA of every cell in our bodies. Gene activity must therefore be precisely regulated so that only the correct set of genes is active in each cell type.
Cell cycle dynamics
[Figure labels: Interphase, Metaphase, Chromosome Territories; cell cycle phases G1, S, G2, M, G0]
[Figure labels: Symmetric case; Interphase; Proliferation; Quiescence: Differentiation or Proliferation; Reprogramming; Symmetric and Asymmetric; Type of Result; Classification]
A beautifully controlled process. The quiescent state subsequently leads to proliferation, differentiation, or senescence. Goal: improve the efficiency of reprogramming by engineering symmetric cell division to a new cell type (red arrows).
COMPONENTS
Cell cycle
Proliferation
Quiescence
Differentiation
A minimal network (switch) for the proliferation (MYOD1 strongly suppressed by CDK2), quiescence (CDK2 strongly suppressed by P21), or differentiation (P21 strongly activated by MYOD1) decision in myogenesis.
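The three interactions named in this caption can be written down as a small signed network. The sketch below is only a bookkeeping aid under stated assumptions: the matrix layout, the lookup table, and the helper names `edge_sign` and `fate` are hypothetical, and no dynamics or interaction strengths are modeled.

```python
import numpy as np

GENES = ["CDK2", "P21", "MYOD1"]
IDX = {gene: i for i, gene in enumerate(GENES)}

# SIGN[target, source]: -1 = suppression, +1 = activation, 0 = no edge listed.
SIGN = np.zeros((3, 3), dtype=int)
SIGN[IDX["MYOD1"], IDX["CDK2"]] = -1  # CDK2 suppresses MYOD1
SIGN[IDX["CDK2"], IDX["P21"]] = -1    # P21 suppresses CDK2
SIGN[IDX["P21"], IDX["MYOD1"]] = +1   # MYOD1 activates P21

# Outcome the caption associates with each strongly acting edge,
# keyed by (source, target).
FATE_BY_DOMINANT_EDGE = {
    ("CDK2", "MYOD1"): "proliferation",
    ("P21", "CDK2"): "quiescence",
    ("MYOD1", "P21"): "differentiation",
}


def edge_sign(source, target):
    """Return the sign of the regulation that `source` exerts on `target`."""
    return int(SIGN[IDX[target], IDX[source]])


def fate(source, target):
    """Return the outcome the caption assigns to a dominant edge."""
    return FATE_BY_DOMINANT_EDGE[(source, target)]


for (source, target), outcome in FATE_BY_DOMINANT_EDGE.items():
    kind = "activates" if edge_sign(source, target) > 0 else "suppresses"
    print(f"{source} strongly {kind} {target}: {outcome}")
```

Storing the edges as a signed matrix keeps the switch in the same matrix form used for the other networks in this deck.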
MINIMAL SYSTEM
Two Papers:
Cell cycle
INACTIVE Gene
ACTIVE Gene
DNA
Heterochromatin
DNA is CLOSED and inaccessible
Euchromatin
DNA is OPEN and accessible
Genome Can Be Decomposed Into Two Parts
Data Background
Genome: Consists of “gene” and intergenic (“non-gene”) parts
Chromosome conformation capture (Hi-C)
RNASeq
“Non-gene” part
Time
Expression
Chromatin accessibility
Transcriptional activity
Chromosome “painting” to capture architecture
Chromosome conformation capture (Hi-C) data showing genome-wide chromatin contacts
Imaging
Biochemically
Sequence
Time
4DN
1D
High-dimensional Dynamical Data
Single cell
Single cell or Population
Basic Features of Genome Organization
From Tom Misteli at National Cancer Institute
Dynamical Data
Dynamical System
Time
Structure
Function
Dynamical System!
Hi-C: Genome-wide chromosome conformation capture
Contact matrix size by bin resolution: 1 Mb bins = 3,000 × 3,000; 50 bp bins = 62M × 62M
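The matrix sizes quoted above follow from dividing the genome length by the bin size. The sketch below reproduces that arithmetic; the function names are illustrative. It assumes a rounded 3.0 Gb genome for the 1 Mb case and roughly 3.1 Gb for the 50 bp case, since that is the length that yields the 62M figure.

```python
def n_bins(genome_length_bp, bin_size_bp):
    """Number of fixed-size bins needed to tile the genome (ceiling division)."""
    return -(-genome_length_bp // bin_size_bp)


def matrix_entries(genome_length_bp, bin_size_bp):
    """Number of entries in the square bin-by-bin contact matrix."""
    n = n_bins(genome_length_bp, bin_size_bp)
    return n * n


# 1 Mb bins over a rounded 3.0 Gb genome.
coarse = n_bins(3_000_000_000, 1_000_000)

# 50 bp bins over an assumed ~3.1 Gb genome.
fine = n_bins(3_100_000_000, 50)

print(f"1 Mb bins: {coarse:,} x {coarse:,}")
print(f"50 bp bins: {fine:,} x {fine:,}")
print(f"Entries at 50 bp: {matrix_entries(3_100_000_000, 50):.3e}")
```

At 50 bp resolution the dense matrix has on the order of 10^15 entries, which is why sparse representations are needed at fine resolution.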
Hi-C: Graph = Matrix
Pore-C: Hypergraphs
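The contrast between the two data types can be illustrated with toy data. This is a sketch under stated assumptions: pairwise contacts (Hi-C style) are stored as a symmetric contact matrix, and multi-way contacts (Pore-C style) as a locus-by-hyperedge incidence matrix. The function names and the four-locus example are hypothetical and are not drawn from any real dataset.

```python
import numpy as np


def contact_matrix(n_loci, pairwise_contacts):
    """Symmetric contact (adjacency) matrix counting pairwise contacts."""
    m = np.zeros((n_loci, n_loci), dtype=int)
    for i, j in pairwise_contacts:
        m[i, j] += 1
        if i != j:
            m[j, i] += 1
    return m


def incidence_matrix(n_loci, multiway_contacts):
    """Incidence matrix: rows are loci, columns are multi-way contacts."""
    h = np.zeros((n_loci, len(multiway_contacts)), dtype=int)
    for edge, loci in enumerate(multiway_contacts):
        for locus in loci:
            h[locus, edge] = 1
    return h


# Hypothetical toy data on 4 loci.
pairs = [(0, 1), (1, 2), (0, 1)]  # graph edges, one per observed pair
multi = [(0, 1, 2), (1, 3)]       # hyperedges, one per multi-way contact

A = contact_matrix(4, pairs)
H = incidence_matrix(4, multi)

print(A)
print(H)
```

A graph edge always joins exactly two loci, so a square matrix suffices; a hyperedge can join any number of loci, so each contact gets its own column.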
Our Challenge: Convert genome architecture into data and mathematics
Cell Cycle Dynamics: Single Cell Gene Expression During G1
Blue: Single Cells, Red: Mean Expression
Live Cell Imaging
Cells only
+MYOD1
-PRRX1
Normal Development
Differentiation
Embryonic Stem cell (ES cell)
Tissue Specific Stem cell (TSSC)
All cell types in the human body (estimated to be about 300)
Unidirectional process: Embryo into an adult human
Egg
Sperm
The human body is a finite and dynamical system
This is accomplished via autologous cell reprogramming
Cellular Reprogramming
Induced Pluripotent Stem Cell = Embryonic Stem Cell
Harold Weintraub (1945-1995): DIRECT reprogramming, 1989
1 Input = MyoD
Shinya Yamanaka: iPSC reprogramming (INDIRECT), 2006; Nobel Prize 2012
4 Inputs = Oct4, Sox2, Klf4, c-Myc
Transcription Factors
Controlling Information
modified mRNA
RNA interference
Number of Genes in the Human Genome = 20,000
Number of Transcription Factors = 1,800
Number of Master Regulators (subset of Transcription Factors) = 800
For simplicity, let us consider three genes in the human genome: G1, G2, and G3. G1 produces mRNA T1 and protein P1; G2 produces mRNA T2 and protein P2; G3 produces mRNA T3 and protein P3. P1 localizes to the nucleus and regulates transcription of G1 and G2. P2 also localizes to the nucleus, but regulates transcription of G3 only. Thus, we define G1 and G2 as transcription factors, whereas G3 is not. G1 is special in that it is a master regulator, which we define as a self-regulating transcription factor. Figure 1 illustrates this hierarchy. If all genes are classified using this hierarchy, one can construct a universal gene network that is cell-type invariant. We call this network the hardwired genome.
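The classification rule in this example can be sketched in a few lines. The code below encodes only the three-gene example from the paragraph above; the dictionary layout, the function name `classify`, and the label "other gene" are illustrative assumptions, not the actual Hardwired Genome implementation.

```python
# Regulatory edges from the three-gene example: the protein product of
# each key gene regulates transcription of the genes in its value set.
REGULATES = {
    "G1": {"G1", "G2"},  # P1 regulates G1 (itself) and G2
    "G2": {"G3"},        # P2 regulates G3 only
    "G3": set(),         # P3 regulates nothing
}


def classify(gene, regulates):
    """Classify a gene using the hierarchy described in the text."""
    targets = regulates.get(gene, set())
    if not targets:
        return "other gene"            # not a transcription factor
    if gene in targets:
        return "master regulator"      # self-regulating transcription factor
    return "transcription factor"


for gene in sorted(REGULATES):
    print(gene, "->", classify(gene, REGULATES))
```

Because the rule depends only on which genes a protein can regulate, and not on expression levels, the resulting network is the same for every cell type.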
Hardwired Genome (HWG)
[Figure 1: hierarchy of Master Regulator and Transcription Factors, with genes g1, g2, g3, transcripts t1, t2, t3, and proteins p1, p2, p3]
Hardwired Genome
The Hardwired Genome is a representation of all possible protein-protein interactions in the human genome, created jointly by iReprogram, Inc. and the University of Michigan. It is built on the Genome Reference Consortium Human Build 38 patch release 13 (GRCh38.p13).
Element | Count |
Proteins | 19,216 |
Transcription Factors | 1,800 |
Master Regulators | 710 |
Interactions | 11,750,266 |
Elements of the Hardwired Genome
Multi-way Interactions in the Human Genome
My Ultimate Goal: My Cells, My Cure!
PROBLEM
SOLUTION
Autologous cell reprogramming
Bone-marrow Transplant is the Treatment for the Treatment