Validation - STRING database [18]
Interaction evidence shown in the legend: experimentally determined, co-expression, protein homology, text mining
How stable is the consensus? To address this, we use a 5-fold subsampling setting and build a stable consensus network from the resulting solutions.
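One possible reading of the 5-fold subsampling setting is sketched below: the samples are split into five folds, and each subsample keeps four of them. This is an assumption for illustration, since the poster does not spell out how the filesets are built; the function name, the toy sample IDs and the seed are all hypothetical.

```python
import random


def leave_one_fold_out_subsamples(sample_ids, n_folds=5, seed=0):
    """Split sample IDs into n_folds folds and return n_folds subsamples.

    Each subsample contains every fold except one. This is an assumed
    reading of "5-fold subsampling", not the authors' documented procedure.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    # Round-robin assignment of shuffled IDs to folds.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    subsamples = []
    for left_out in range(n_folds):
        kept = [s for k, fold in enumerate(folds) if k != left_out for s in fold]
        subsamples.append(sorted(kept))
    return subsamples


# Hypothetical toy cohort of 20 samples.
toy_ids = [f"sample_{i:02d}" for i in range(20)]
filesets = leave_one_fold_out_subsamples(toy_ids)
for i, fileset in enumerate(filesets, start=1):
    print(f"Fileset {i}: {len(fileset)} samples")
```

In a case/control setting one would probably split cases and controls separately so that each subsample keeps a similar case fraction; that refinement is left out here.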
Introduction/Objectives
In this study, we explore how applying multiple network-based methods (HotNet2 [1], SigMod [2], Heinz [3], dmGWAS [4]) to many samples (at biobank scale) and variants (SNPs) can enhance gene identification and gene set overrepresentation in post-Genome-Wide Association Study (GWAS) analyses.
Our research addresses the challenges inherent in the high dimensionality of GWAS datasets, with the ultimate aim of improving gene discovery and our understanding of the biological mechanisms underlying polygenic diseases such as psoriasis.
Methods
Data
The UK Biobank (UKB) is a large-scale, population-based study that collected detailed information from approximately half a million participants in the United Kingdom. The data include biological samples as well as comprehensive health-related and demographic measures [5]. For this work, we used participants affected by psoriasis as cases and unaffected participants as controls. Genotypes come from SNP array data.
Results
Conclusions
Sample selection (flowchart):
- 488 377 participants
- After withdrawal + missing data: 488 159
- White British ancestry: 409 522
- Scotland limited access to PC data: 193 608
- Cases: 10 105
- Controls: 183 503
784 256 SNPs
QC (removal criteria):
- Minor allele frequency < 0.01
- Minor allele count < 100
- Genotype missingness > 0.1
- Hardy-Weinberg equilibrium p-value < 1e-15
- Sample missingness > 0.1
= 593 068 SNPs
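The thresholds above map onto standard PLINK 1.9 filtering flags. The sketch below only assembles a command line and does not run PLINK; the exact invocation used for the poster is not given, so the single-pass layout and the fileset names `ukb_psoriasis` and `ukb_psoriasis_qc` are assumptions.

```python
def plink_qc_command(bfile, out):
    """Build a PLINK 1.9 argument list applying the QC thresholds listed above.

    The thresholds come from the poster; running all filters in a single
    call is an illustrative assumption.
    """
    return [
        "plink",
        "--bfile", bfile,   # input .bed/.bim/.fam prefix
        "--maf", "0.01",    # drop variants with minor allele frequency < 0.01
        "--mac", "100",     # drop variants with minor allele count < 100
        "--geno", "0.1",    # drop variants with genotype missingness > 0.1
        "--hwe", "1e-15",   # drop variants with HWE test p-value < 1e-15
        "--mind", "0.1",    # drop samples with missingness > 0.1
        "--make-bed",
        "--out", out,
    ]


# Hypothetical fileset prefixes, for illustration only.
cmd = plink_qc_command("ukb_psoriasis", "ukb_psoriasis_qc")
print(" ".join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` on a machine where PLINK is installed.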
Psoriasis types: plaque, nail, guttate, inverse, pustular, erythrodermic
Analysis pipeline (flowchart):
- Input: SNP-array fileset with samples and phenotype (.fam), individual genotype (.bed) and variant calling (.bim)
- Quality control (PLINK v1.9)
- Association analysis, SNP p-values (PLINK v1.9)
- Annotation and gene analysis steps (MAGMA v1.10)
- PPI network (BioGRID)
- Network methods: Heinz, HotNet2, SigMod, dmGWAS
- Consensus network: genes in at least 3 methods
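The consensus rule (genes selected by at least 3 of the 4 methods) can be sketched as a simple vote count. The gene names and per-method solutions below are toy placeholders, not results from the study.

```python
from collections import Counter


def consensus_genes(solutions, min_methods=3):
    """Return genes that appear in at least min_methods method solutions."""
    counts = Counter(g for genes in solutions.values() for g in set(genes))
    return {g for g, n in counts.items() if n >= min_methods}


# Hypothetical toy solutions, one gene set per method.
toy_solutions = {
    "Heinz": {"GENE_A", "GENE_B", "GENE_C"},
    "HotNet2": {"GENE_A", "GENE_B", "GENE_D"},
    "SigMod": {"GENE_A", "GENE_C", "GENE_D"},
    "dmGWAS": {"GENE_A", "GENE_B", "GENE_E"},
}
consensus = consensus_genes(toy_solutions)
print(sorted(consensus))  # ['GENE_A', 'GENE_B']
```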
SNP-array (593 068 SNPs)
With a 50 kbp annotation window, 401 659 SNPs (67.73%) mapped to at least one gene.
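The mapping statistic (401 659 of 593 068 SNPs, i.e. 67.73%) can be illustrated with a minimal window-based annotation. The sketch assumes a symmetric 50 kbp window on both sides of each gene, which the poster does not state explicitly; all coordinates and identifiers are invented.

```python
WINDOW_BP = 50_000  # assumed symmetric window around each gene


def map_snps_to_genes(snps, genes, window=WINDOW_BP):
    """Map each SNP to the genes whose extended boundaries contain it.

    snps:  {snp_id: (chrom, position)}
    genes: {gene: (chrom, start, end)}
    """
    mapping = {}
    for snp_id, (chrom, pos) in snps.items():
        mapping[snp_id] = sorted(
            gene
            for gene, (g_chrom, start, end) in genes.items()
            if g_chrom == chrom and start - window <= pos <= end + window
        )
    return mapping


# Hypothetical toy annotation.
toy_genes = {
    "GENE_A": ("1", 100_000, 120_000),
    "GENE_B": ("1", 160_000, 180_000),
}
toy_snps = {
    "rs_toy1": ("1", 60_000),
    "rs_toy2": ("1", 140_000),
    "rs_toy3": ("1", 400_000),
    "rs_toy4": ("2", 110_000),
}
mapping = map_snps_to_genes(toy_snps, toy_genes)
mapped = sum(1 for hits in mapping.values() if hits)
print(f"{mapped} of {len(toy_snps)} SNPs mapped ({100 * mapped / len(toy_snps):.2f}%)")
```

A SNP can map to several genes when windows overlap, as `rs_toy2` does here, which is why the statistic is phrased as "at least one gene".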
Subsampling pipeline (flowchart):
- The data are split into folds 1 to 5, which are combined into Filesets 1 to 5. A fileset has .bed, .fam and .bim files.
- For each fileset: association analysis, SNP p-values (PLINK v1.9), then annotation and gene analysis steps (MAGMA v1.10)
- PPI network (BioGRID)
- Solutions (one label per fileset and method): 1H to 5H, 1N to 5N, 1S to 5S, 1D to 5D
- Stable consensus network: genes in 12 solutions
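The stable consensus step can be sketched as a selection-frequency count over the 20 solutions labeled 1H to 5D in the flowchart. Reading "genes in 12 solutions" as "at least 12 of the 20 solutions" is an assumption, and the gene sets below are toy placeholders.

```python
from collections import Counter


def stable_consensus(solutions, min_solutions=12):
    """Return {gene: count} for genes found in at least min_solutions solutions."""
    counts = Counter(g for genes in solutions.values() for g in set(genes))
    return {g: n for g, n in counts.items() if n >= min_solutions}


# Solution labels as in the flowchart: fileset number plus a method letter.
labels = [f"{i}{m}" for m in "HNSD" for i in range(1, 6)]

# Hypothetical toy solutions: GENE_A is always selected, GENE_B in exactly
# 12 solutions, GENE_C in only 11.
toy = {}
for idx, label in enumerate(labels):
    genes = {"GENE_A"}
    if idx < 12:
        genes.add("GENE_B")
    if idx < 11:
        genes.add("GENE_C")
    toy[label] = genes

stable = stable_consensus(toy)
print(stable)
```

Keeping the counts rather than only the gene names makes it easy to see how close each gene is to the threshold.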
Fig. 1
Fig. 2
Legend: psoriasis genes; Ran2019 [6]; Tsoi2017 [7]; MitoRibo genes; IL17 pathway; IL23 pathway; TNF pathway
The stable consensus network comprises 92 genes in 5 subnetworks (modules), which are enriched in different biological pathways implicated in psoriasis. Examples include the well-known TNF/IL-23/IL-17 signaling pathways (Fig. 2) [8], as well as human leukocyte antigen genes such as HLA-A [9] and presumably HLA-G [10], and CDK2 (cyclin-dependent kinase 2) [11].
Moreover, the mitochondrial ribosomal gene module (Fig. 1) is enriched in the Mitochondrial Translation (Initiation, Elongation, Termination) and Translation pathways, which might be linked with psoriatic arthritis [12]. Our approach also identified PSMA6 [13], PSMB8 [14], PSMB9 [15], HSPA1A [16] and PA2G4 [17] as being associated with psoriasis.
GSEA for subnetworks or modules (Enrichr)
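Module enrichment of this kind is commonly assessed with a one-sided hypergeometric (Fisher exact) test. The sketch below computes that tail probability from scratch as an illustration; it is not Enrichr's implementation, and the universe size, pathway size and overlap are invented numbers.

```python
from math import comb


def hypergeom_pvalue(n_universe, n_pathway, n_module, n_overlap):
    """P(X >= n_overlap) when n_module genes are drawn without replacement
    from a universe of n_universe genes containing n_pathway pathway genes."""
    total = comb(n_universe, n_module)
    upper = min(n_pathway, n_module)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: a 20-gene module sharing 5 genes with a 100-gene
# pathway, in a universe of 20 000 genes.
p_value = hypergeom_pvalue(20_000, 100, 20, 5)
print(f"p = {p_value:.3e}")
```

In practice the p-values for all tested pathways would also be corrected for multiple testing before a module is called enriched.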
In this study, we applied four network-based methods to SNP array data from the UK Biobank, focusing on the autoimmune disease psoriasis. Our approach identified known biological pathways associated with the disease, as well as genes more recently linked to psoriasis, such as PSMA6, PSMB8, HSPA1A and PSMB9. We addressed data heterogeneity and the instability of multiple solutions by choosing a stable consensus network after data subsampling; this framework could be suitable for studying other complex traits. Finally, the choice of PPI network could affect the output, because interactions vary across databases; this needs further evaluation in future work.
[1] LM. et al. Nature Genetics 47 (2015). 10.1038/ng.3168
[2] YL. et al. Bioinformatics 33 (2017). 10.1093/bioinformatics/btx004
[3] DM. et al. Bioinformatics 24 (2008). 10.1093/bioinformatics/btn161
[4] QW. et al. Bioinformatics 31 (2015). 10.1093/bioinformatics/btv150
[5] CB. et al. Nature 562 (2018). 10.1038/s41586-018-0579-z
[6] DR. et al. Precision Clinical Medicine 2 (2019). 10.1093/pcmedi/pbz011
[7] TL. et al. Nature Communications 8 (2017). 10.1038/ncomms15382
[8] GJ. et al. Signal Transduction and Targeted Therapy 8 (2023). 10.1038/s41392-023-01655-6
[9] CF. et al. Psoriasis (Auckland, N.Z.) 11 (2021). 10.2147/PTT.S258050
[10] MZ. et al. International Journal of Molecular Sciences 22 (2021). 10.3390/ijms222413348
[11] PH. et al. British Journal of Dermatology 182 (2020). 10.1111/bjd.18178
[12] AA. et al. Annals of the Rheumatic Diseases (2019). 10.1136/annrheumdis-2018-214158
[13] JB. et al. Life (Basel) 11 (2021). 10.3390/life11090887
[14] AT. et al. International Journal of Molecular Sciences 25 (2024). 10.3390/ijms25179192
[15] MZ. et al. Journal of Translational Medicine 23 (2025). 10.1186/s12967-024-06015-8
[16] LS. et al. Briefings in Bioinformatics 26 (2024). 10.1093/bib/bbaf032
[17] TR. et al. Journal of Investigative Dermatology 144 (2024).
[18] SD. et al. Nucleic Acids Research 51 (2023).
Psoriasis: A Case Study on Using Biological Networks for Gene Discovery
Giann Karlo Aguirre-Samboní★, Gwenaëlle Lemoine★, Julio Molineros ☆, Florian Massip★ and Chloé-Agathe Azencott★
★Centre for Computational Biology, Mines Paris, PSL University, Paris 75006, France
★Department U1331, Institut Curie, PSL University, Paris 75005, France
★INSERM, U1331, Paris 75005, France
☆Johnson and Johnson Innovative Medicine, 1400 McKean Rd, Spring House, PA 19002, USA
Affiliations
References
STRING interaction evidence legend: from curated databases, gene neighborhood, gene fusions, gene co-occurrence