Hardwired Genome (HWG)
HWG
For simplicity, consider three genes in the human genome: G1, G2, and G3. G1 is transcribed into mRNA T1, which is translated into protein P1. Likewise, G2 produces mRNA T2 and protein P2, and G3 produces mRNA T3 and protein P3. P1 localizes to the nucleus and regulates transcription of G1 and G2. P2 also localizes to the nucleus, but regulates transcription of G3 only. We therefore define G1 and G2 as transcription factors, whereas G3 is not. G1 is special in that it is a master regulator, defined here as a self-regulating transcription factor. Figure 1 illustrates this hierarchy. If all genes are classified using this hierarchy, one can construct a universal gene network that is cell type invariant. We call this network the hardwired genome.
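The classification rule in this three-gene example can be written down as a minimal sketch. The `REGULATES` mapping and the `classify` helper are hypothetical names introduced only for illustration; the edges are the ones stated in the text, and G3 is assumed to have no regulatory targets because it is not a transcription factor.

```python
from typing import Dict, Set

# Regulatory edges from the three-gene example: gene -> genes whose
# transcription its protein product regulates.
# P1 (from G1) regulates G1 and G2; P2 (from G2) regulates G3.
# Assumption: G3 has no targets, since the text says it is not a TF.
REGULATES: Dict[str, Set[str]] = {
    "G1": {"G1", "G2"},
    "G2": {"G3"},
    "G3": set(),
}


def classify(gene: str, regulates: Dict[str, Set[str]]) -> str:
    """Place a gene in the hierarchy: non-TF, TF, or master regulator TF."""
    targets = regulates.get(gene, set())
    if not targets:
        return "non-TF"               # regulates nothing
    if gene in targets:
        return "master regulator TF"  # self-regulating transcription factor
    return "TF"                       # regulates other genes only


classes = {gene: classify(gene, REGULATES) for gene in REGULATES}
print(classes)
```

Applying the same rule to every gene, given a genome-wide table of regulatory edges, would yield the classification on which the cell type invariant network is built.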
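[Figure: schematic of HWG(S), showing matrices A, B, and C with dimension labels m and n (m = 18646, n = 1800) and the cell type specific input S.]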
A-Matrix: protein-protein interactions; B-Matrix: TF-DNA binding sites and activity coefficients; C-Matrix: TF-TF interactions; S-Matrix: cell type specific data (RNA-seq, DNase I/ATAC-seq, …).
Indika Rajapakse, Steve Smale. Mathematics of the genome. Foundations of Computational Mathematics. 2017 Oct 1;17(5):1195-1217.
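As a rough illustration of how the A, B, and C matrices might be held in memory, the sketch below uses a plain dictionary-of-keys sparse layout. Several things here are assumptions rather than statements from the source: that m counts genes and n counts transcription factors, that A is m × m, B is m × n, and C is n × n, and the `HardwiredGenome` class itself, which is a hypothetical container. How S combines with these matrices to give HWG(S) is not specified above, so the sketch leaves it out.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Sparse storage: (row, col) -> weight; entries that are absent are zero.
Sparse = Dict[Tuple[int, int], float]


@dataclass
class HardwiredGenome:
    m: int  # assumed: number of genes
    n: int  # assumed: number of transcription factors
    A: Sparse = field(default_factory=dict)  # protein-protein interactions (assumed m x m)
    B: Sparse = field(default_factory=dict)  # TF-DNA binding sites and activity coefficients (assumed m x n)
    C: Sparse = field(default_factory=dict)  # TF-TF interactions (assumed n x n)

    def shapes(self) -> Dict[str, Tuple[int, int]]:
        """Return the assumed shape of each matrix."""
        return {"A": (self.m, self.m), "B": (self.m, self.n), "C": (self.n, self.n)}

    def set_entry(self, name: str, row: int, col: int, value: float) -> None:
        """Store one nonzero entry after checking it lies inside the matrix."""
        rows, cols = self.shapes()[name]
        if not (0 <= row < rows and 0 <= col < cols):
            raise IndexError(f"({row}, {col}) is outside {name} with shape {rows} x {cols}")
        getattr(self, name)[(row, col)] = value


hwg = HardwiredGenome(m=18646, n=1800)
hwg.set_entry("B", 0, 0, 1.0)  # hypothetical coefficient, for illustration only
print(hwg.shapes())
```

A sparse layout is used because a dense 18646 × 18646 array of 64-bit floats would occupy roughly 2.8 GB, while interaction matrices of this kind typically have few nonzero entries per row.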
HWG and Data
Data Source | Description |
 | Protein-protein interactions from 19,566 protein coding genes, derived from experimental data, computational predictions, and text mining |
FANTOM5 | High resolution RNA-seq data (CAGE-seq) from 2,000 samples of over 200 cell types |
GTEx | Tissue specific RNA-seq of 54 non-diseased tissue sites from almost 1,000 individuals |
 | 1,600 transcription factors and their binding motifs |
Roadmap Epigenomics | Chromatin accessibility through DNase-seq |
ENCODE | Chromatin accessibility through DNase-seq |
The Human Reference Interactome (HuRI) | 64,006 experimentally validated protein-protein interactions from 9,094 proteins |
4DNucleome Portal | Nucleomics data from almost 4,000 experiments covering over 1,500 experiment sets |
TRANSFAC | Manually curated database of 582 transcription factor binding sites, includes gene repression data |