Data Science and Visualization in Health
Networks and Graphs
André Santanchè
Laboratory of Information Systems – LIS
Institute of Computing – UNICAMP
August 2025
Graphs?
Graph - Mathematical Model
nodes
edges
d
a
b
c
Why Graphs?
Graphs and the Real-world
Challenge
Select one example of a real-world problem and represent it as a graph.
Define:
The Emergence of Network Maps
Political/Financial Networks
Mark Lombardi: tracked and mapped global financial fiascos in the 1980s and 1990s from public sources such as news articles.
Mark Lombardi, Industries Carlos Cardoen of Santiago, Chile, c. 1982-90 (2nd Version), 2000
Understanding Through Visualization
"[...] Department of Homeland Security came in to take a look. They said they found the work revelatory, not because the financial and political connections he mapped were new to them, but because Lombardi showed them an elegant way to array disparate information and make sense of things, which they thought might be useful to their security efforts.[...]"
Michael Kimmelman, "Webs Connecting the Power Brokers, the Money and the World", NY Times, November 14, 2003
Understanding Through Visualization
"I happened to be in the Drawing Center when the Lombardi show was being installed and several consultants to the Department of Homeland Security came in to take a look. They said they found the work revelatory, not because the financial and political connections he mapped were new to them, but because Lombardi showed them an elegant way to array disparate information and make sense of things, which they thought might be useful to their security efforts. I didn't know whether to find that response comforting or alarming, but I saw exactly what they meant."
Michael Kimmelman, "Webs Connecting the Power Brokers, the Money and the World", NY Times, November 14, 2003
How Wolves Change Rivers
→ Water → Beavers
→ Dams → Fish
Les Miserables
(Neo4j example)
Modeling the World as Graphs
Models
Data Storage
Data Management
Application
Application
Application
Conceptual
Model
Logic Model
Physical
Model
Models
Application
Application
Application
Conceptual
Model
member:
Doriana
book:
Dinolândia
loan
abstraction
made
loan
wrote
Dinolândia
domain of discourse
author:
Horário
authorship
Models
Data Management
Application
Application
Application
Conceptual
Model
Logic Model
The Bridges of Königsberg
Task
Draw a path that crosses all the bridges, but each bridge only once.
The Bridges of Königsberg
Bridges as a Graph
Conceptual
Model
Logic Model
Bridges as a Graph
Leonhard Euler in 1736
Simple
Principles
Complex
Systems
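Euler's result can be checked mechanically. The sketch below assumes the usual reading of the puzzle: four land masses (the labels A to D are illustrative, not from the slides) joined by seven bridges. It applies the degree-parity rule: a connected graph has a walk that uses every edge exactly once only if it has zero or two nodes of odd degree.

```python
from collections import Counter

# Seven bridges between four land masses (labels A-D are illustrative).
# A is the central island, B and C are the river banks, D is the eastern island.
bridges = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

def odd_degree_nodes(edges):
    """Return the sorted nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)

def has_euler_walk(edges):
    """Parity test only; it assumes the graph is connected."""
    return len(odd_degree_nodes(edges)) in (0, 2)

odd = odd_degree_nodes(bridges)
print(odd, has_euler_walk(bridges))
```

All four land masses have odd degree, so no such walk exists. The same two functions apply to any edge list, which is the point of moving from the drawing to the graph.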
A Homogeneous Graph
The Eight Cities
Images generated by Gemini.
The Eight Cities
The Eight Cities as a Graph
Challenge
Models
Data Storage
Data Management
Application
Application
Application
Conceptual
Model
Logic Model
Physical
Model
Mapping Logical to Physical
Logical
Conceptual
Graph
Stored as a Graph
mapping
Mapping Logical to Physical
Logical
Conceptual
Graph
Stored as a Table
mapping
Graph - Mathematical Model
nodes - V
edges - E
d
a
b
c
Undirected Graph - Mathematical Model
nodes - V
edges - E
d
a
b
c
Directed Graph - Mathematical Model
nodes - V
edges - E
d
a
b
c
Undirected Graph - Adjacency Matrix
d
a
b
c
| a | b | c | d |
a | 0 | 1 | 1 | 1 |
b | 1 | 0 | 0 | 1 |
c | 1 | 0 | 0 | 1 |
d | 1 | 1 | 1 | 0 |
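A minimal sketch of how this matrix can be built from the drawing, assuming the five undirected edges a-b, a-c, a-d, b-d and c-d read off the table above. The helper name `adjacency_matrix` is illustrative.

```python
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")]

def adjacency_matrix(nodes, edges, directed=False):
    """Build a 0/1 matrix; row = source node, column = target node."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            # An undirected edge is stored in both directions.
            matrix[index[v]][index[u]] = 1
    return matrix

undirected = adjacency_matrix(nodes, edges)
for name, row in zip(nodes, undirected):
    print(name, row)
```

Because every edge is written twice, the matrix of an undirected graph is always symmetric about its diagonal.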
Directed Graph - Adjacency Matrix
d
a
b
c
| a | b | c | d |
a | 0 | 1 | 1 | 0 |
b | 0 | 0 | 0 | 0 |
c | 0 | 0 | 0 | 0 |
d | 1 | 1 | 1 | 0 |
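In the directed case rows and columns no longer mean the same thing. As a small sketch using the matrix above, row sums give the out-degree of each node and column sums give the in-degree.

```python
nodes = ["a", "b", "c", "d"]
directed = [
    [0, 1, 1, 0],  # a -> b, a -> c
    [0, 0, 0, 0],  # b has no outgoing edges
    [0, 0, 0, 0],  # c has no outgoing edges
    [1, 1, 1, 0],  # d -> a, d -> b, d -> c
]

# Out-degree: how many edges leave the node (sum over its row).
out_degree = {n: sum(row) for n, row in zip(nodes, directed)}
# In-degree: how many edges arrive at the node (sum over its column).
in_degree = {n: sum(row[j] for row in directed) for j, n in enumerate(nodes)}

print(out_degree)
print(in_degree)
```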
Directed Graph - Edge List
d
a
b
c
source | target |
a | b |
a | c |
d | a |
d | b |
d | c |
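The edge list and the adjacency matrix carry the same information. A minimal sketch of the conversion, going through an adjacency list (a dictionary from each node to the nodes it points to); the variable names are illustrative.

```python
# Edge list rows copied from the table above: (source, target).
edge_list = [("a", "b"), ("a", "c"), ("d", "a"), ("d", "b"), ("d", "c")]

# Adjacency list: each node maps to the nodes it points to.
successors = {}
for source, target in edge_list:
    successors.setdefault(source, []).append(target)
    successors.setdefault(target, [])  # keep nodes that only receive edges

# Rebuild the adjacency matrix, ordering nodes alphabetically.
nodes = sorted(successors)
matrix = [[1 if t in successors[s] else 0 for t in nodes] for s in nodes]

print(successors)
print(matrix)
```

The edge list stores 5 rows where the matrix stores 16 cells, which is one reason edge lists are the common exchange format for sparse graphs.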
The Eight Cities as a Graph
Challenge
Logical to Physical Model
Rot Donnadd
Pid Mught
Thulk Lebbimp
Bouvossam Damme
Pirg Zall
Logical to Physical Model
Rot Donnadd
Pid Mught
Thulk Lebbimp
Bouvossam Damme
Pirg Zall
Logical
Graph
Stored as a Graph
mapping
Logical to Physical Model
Name |
Rot Donnadd |
Pid Mught |
Thulk Lebbimp |
Bouvossam Damme |
Pirg Zall |
Source | Target |
Rot Donnadd | Pid Mught |
Rot Donnadd | Thulk Lebbimp |
Pid Mught | Rot Donnadd |
Pid Mught | Pirg Zall |
Pid Mught | Bouvossam Damme |
Thulk Lebbimp | Rot Donnadd |
Thulk Lebbimp | Pirg Zall |
Bouvossam Damme | Pid Mught |
Bouvossam Damme | Pirg Zall |
Pirg Zall | Bouvossam Damme |
Pirg Zall | Pid Mught |
Pirg Zall | Thulk Lebbimp |
Logical
Graph
Stored as a Table
mapping
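Once the graph is stored as a table, it can be loaded back and queried. The sketch below assumes each row of the two-column table above is one direct connection. It checks that every connection appears in both directions and counts hops with a breadth-first search; the function name `hops` is illustrative.

```python
from collections import deque

# Rows of the two-column connection table, copied from the slide.
rows = [
    ("Rot Donnadd", "Pid Mught"), ("Rot Donnadd", "Thulk Lebbimp"),
    ("Pid Mught", "Rot Donnadd"), ("Pid Mught", "Pirg Zall"),
    ("Pid Mught", "Bouvossam Damme"), ("Thulk Lebbimp", "Rot Donnadd"),
    ("Thulk Lebbimp", "Pirg Zall"), ("Bouvossam Damme", "Pid Mught"),
    ("Bouvossam Damme", "Pirg Zall"), ("Pirg Zall", "Bouvossam Damme"),
    ("Pirg Zall", "Pid Mught"), ("Pirg Zall", "Thulk Lebbimp"),
]

neighbors = {}
for source, target in rows:
    neighbors.setdefault(source, set()).add(target)

# True when every row also appears reversed, i.e. the table encodes an undirected graph.
is_symmetric = all(s in neighbors.get(t, set()) for s, t in rows)
undirected_edges = {frozenset(r) for r in rows}

def hops(start, goal):
    """Breadth-first search: fewest connections between two nodes."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

print(is_symmetric, len(undirected_edges), hops("Rot Donnadd", "Bouvossam Damme"))
```

The 12 rows collapse to 6 undirected edges, so storing both directions doubles the table but makes lookups by source trivial.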
Cytoscape
Knowledge Graph
RAS
B-RAF
C-RAF
MEK
ERK
PLX4072
cell adhesion
cell-cell adhesion
cell-cell adhesion mediated by cadherin
cell-cell adhesion via plasma-membrane adhesion molecules
calcium-dependent cell-cell adhesion via plasma membrane cell adhesion molecules
cellular process
biological process
developmental process
cell morphogenesis
Phenomena Graph
Knowledge Graph
cell adhesion
cell-cell adhesion
cell-cell adhesion mediated by cadherin
cell-cell adhesion via plasma-membrane adhesion molecules
calcium-dependent cell-cell adhesion via plasma membrane cell adhesion molecules
cellular process
biological process
developmental process
cell morphogenesis
Machines talking to Machines
The Semantic Web: Machines talking to Machines
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The Semantic Web. Scientific American, 284(5), 28–37.
Case Study
Chronic Myeloid Leukemia (CML)
Philadelphia Chromosome
By Aryn89 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=37195209
Myeloid Stem Cell
By Cancer Research UK - Original email from CRUK, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=34334439
CML in a 4-year-old female
By J3D3 - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=47341095
Cell?
"Cell of Saint Teresa de Ávila in the Convent of Saint Joseph" (Wikipedia, 2022)
"... is the process by which a cell uses its plasma membrane to engulf a large particle"
Cell?
"Cell of Saint Teresa de Ávila in the Convent of Saint Joseph" (Wikipedia, 2022)
"... is the process by which a cell uses its plasma membrane to engulf a large particle" (Wikipedia, 2022)
Explicit Semantics
Human Perspective
Machine Pattern Recognition
(e.g., by machine learning)
Semantic Web
Formal Knowledge
WordNet
Ontology
“An ontology is a formal, explicit specification of a shared conceptualisation.” (Studer et al., 1998)
Conceptualization
Formal, Explicit Specification
Ontology Spectrum
(Welty et al., 1999)
Starting from a Concept
Chronic Myeloid Leukemia (CML)
Connecting Concepts
Chronic Myeloid Leukemia (CML)
Leukemia
is a
Knowledge Graph
Chronic Myeloid Leukemia (CML)
Leukemia
is a
Knowledge Graph
Chronic Myeloid Leukemia (CML)
Leukemia
is a
concept
relationship between concepts
Concept
nucleus
Related Concepts
nucleus
intracellular anatomical structure
part of
Knowledge Graph
nucleus
intracellular anatomical structure
part of
concept
relationship between concepts
Knowledge Graph
Tent of Miracles
Jorge Amado
author
Itabuna
birthPlace
DBPedia
resource | property | value |
Tent of Miracles | author | Jorge Amado |
Jorge Amado | birthPlace | Itabuna |
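Each row of this table is a triple, and a set of triples is already a graph. The sketch below stores the two triples as plain tuples and follows a chain of properties; the helper names `values` and `follow` are illustrative and are not part of DBpedia or any RDF library.

```python
# (resource, property, value) triples copied from the table above.
triples = [
    ("Tent of Miracles", "author", "Jorge Amado"),
    ("Jorge Amado", "birthPlace", "Itabuna"),
]

def values(resource, prop):
    """All values linked to a resource by the given property."""
    return [v for r, p, v in triples if r == resource and p == prop]

def follow(resource, *props):
    """Follow a chain of properties, one hop per property."""
    frontier = [resource]
    for prop in props:
        frontier = [v for r in frontier for v in values(r, prop)]
    return frontier

# Where was the author of "Tent of Miracles" born?
print(follow("Tent of Miracles", "author", "birthPlace"))
```

The answer is not stated in any single row; it comes from joining two triples on the shared node "Jorge Amado".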
Knowledge Graph
(Ji et al., 2022)
Case Study
Human Symptom-Disease Network
Zhou, X., Menche, J., Barabási, A.-L., & Sharma, A. (2014). Human symptoms–disease network. Nature Communications, 5(1), 4212. https://doi.org/10.1038/ncomms5212
Clustered Regions - Same Disease Category
Downloading Datasets
Diseases and Symptoms Table
disease | symptom | association |
Influenza | Fever | 136 |
Influenza | Headache | 17 |
Influenza | Fatigue | 5 |
Myocardial Infarction | Chest Pain | 1005 |
Myocardial Infarction | Fatigue | 74 |
Diabetes Complications | Obesity | 126 |
Diabetes Complications | Albuminuria | 45 |
Diabetes Complications | Acute Coronary Syndrome | 19 |
Diabetes Complications | Diarrhea | 7 |
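This table is an edge list of a two-mode network: diseases on one side, symptoms on the other. A minimal sketch of projecting it onto diseases, linking two diseases when they share at least one symptom. This is a plain set overlap chosen for illustration and is not necessarily the similarity measure used in the cited paper.

```python
from itertools import combinations

# (disease, symptom, association) rows copied from the table above.
rows = [
    ("Influenza", "Fever", 136),
    ("Influenza", "Headache", 17),
    ("Influenza", "Fatigue", 5),
    ("Myocardial Infarction", "Chest Pain", 1005),
    ("Myocardial Infarction", "Fatigue", 74),
    ("Diabetes Complications", "Obesity", 126),
    ("Diabetes Complications", "Albuminuria", 45),
    ("Diabetes Complications", "Acute Coronary Syndrome", 19),
    ("Diabetes Complications", "Diarrhea", 7),
]

# Group the symptoms of each disease (association counts are ignored here).
symptoms_of = {}
for disease, symptom, _count in rows:
    symptoms_of.setdefault(disease, set()).add(symptom)

# Disease-disease edges: pairs with at least one symptom in common.
shared = {}
for d1, d2 in combinations(sorted(symptoms_of), 2):
    common = symptoms_of[d1] & symptoms_of[d2]
    if common:
        shared[(d1, d2)] = sorted(common)

print(shared)
```

In this small sample only Influenza and Myocardial Infarction become linked, through Fatigue.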
Gene Ontology
mitochondrial DNA repair
GO:0043504
Ontologies
GO:0016055
Wnt signaling pathway
frizzled signaling pathway
Wnt signaling pathway
rdfs:label
obo:ExactSynonym
GO:0060071
Wnt signaling pathway, planar cell polarity pathway
taxon:9606
up:Q7Z3G6
up_core:organism
Prickle-like protein 2
rdfs:label
Homo sapiens
up_core:scientificName
rdfs:subClassOf
up_core:classifiedWith
Enrichment
Wang, Y., & Iha, H. (2023). The Novel Link between Gene Expression Profiles of Adult T-Cell Leukemia/Lymphoma Patients’ Peripheral Blood Lymphocytes and Ferroptosis Susceptibility. Genes, 14(11), 2005. https://doi.org/10.3390/GENES14112005
Chandak, P., Huang, K., & Zitnik, M. (2023). Building a knowledge graph to enable precision medicine. Scientific Data, 10(1), 1–16. https://doi.org/10.1038/s41597-023-01960-3
Overview of PrimeKG
Downloading Datasets
Task
Given this knowledge graph (PrimeKG), what kind of inferences can I make with it?
RAS
B-RAF
C-RAF
MEK
ERK
PLX4072
Phenomena Graph
Protein-Protein Interaction
Protein
Protein
Interaction
RAS
B-RAF
C-RAF
MEK
ERK
PLX4072
Case Study - Melanoma
Tables as Networks
Table to Graph
Nodes
Graph
Table
id/prop1 | prop2 | prop3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
Graph
Table
id/prop1 | prop2 | prop3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
value1 | value2 | value3 |
node
prop1: value1
prop2: value2
prop3: value3
node
prop1: value1
prop2: value2
prop3: value3
Graph
Table
#node | identifier | degree |
ABL1 | 9606.ENSP00000361423 | 20 |
AKT1 | 9606.ENSP00000451828 | 24 |
AKT2 | 9606.ENSP00000375892 | 20 |
AKT3 | 9606.ENSP00000500582 | 20 |
ARAF | 9606.ENSP00000290277 | 10 |
Graph
Table
#node name | identifier | degree |
ABL1 | 9606.ENSP00000361423 | 20 |
AKT1 | 9606.ENSP00000451828 | 24 |
… | … | … |
BRAF | 9606.ENSP00000419060 | 12 |
CBL | 9606.ENSP00000264033 | 17 |
:Protein
name: AKT1
identifier: 9606…28
:Protein
name: BRAF
identifier: 9606…60
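The row-to-node mapping can be sketched in a few lines. The example below assumes the node table is available as pipe-delimited text with the header shortened to `name`; the `label` key and the dictionary layout are illustrative choices, not the format used by any particular tool.

```python
import csv
import io

# Node table rows taken from the slide, as pipe-delimited text.
table = """name|identifier|degree
ABL1|9606.ENSP00000361423|20
AKT1|9606.ENSP00000451828|24
BRAF|9606.ENSP00000419060|12
CBL|9606.ENSP00000264033|17
"""

# One node per row: the first column identifies it, the others become properties.
nodes = {}
for row in csv.DictReader(io.StringIO(table), delimiter="|"):
    nodes[row["name"]] = {
        "label": "Protein",
        "name": row["name"],
        "identifier": row["identifier"],
        "degree": int(row["degree"]),
    }

print(nodes["AKT1"])
```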
Table to Graph
Edges
Graph
Table
origin | target | prop1 | prop2 | prop3 |
id1 | id2 | value1 | value2 | value3 |
id1 | id2 | value1 | value2 | value3 |
id1 | id2 | value1 | value2 | value3 |
id1 | id2 | value1 | value2 | value3 |
id1 | id2 | value1 | value2 | value3 |
node
node
edge
prop1: value1
prop2: value2
prop3: value3
Graph
Table
#node1 | node2 | experimentally determined interaction (experimental) | database annotated (database) |
AKT1 | BRAF | 0.631 | 0 |
AKT1 | PIK3R1 | 0.558 | 0.9 |
… | … | … | … |
ARAF | MAPK1 | 0.11 | 0.5 |
ARAF | KRAS | 0.745 | 0.9 |
edge
experimental: 0.631
database: 0
:Protein
:Protein
name: AKT1
identifier: 9606…28
name: BRAF
identifier: 9606…60
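Edges follow the same pattern as nodes: the first two columns name the endpoints and the remaining columns become edge properties. A minimal sketch using three rows from the table above; the `neighbors` helper and its threshold parameter are illustrative.

```python
# (node1, node2, experimental, database) rows copied from the table above.
edge_rows = [
    ("AKT1", "BRAF", 0.631, 0.0),
    ("ARAF", "MAPK1", 0.11, 0.5),
    ("ARAF", "KRAS", 0.745, 0.9),
]

# One edge per row; score columns are kept as edge properties.
edges = [
    {"source": a, "target": b, "experimental": exp, "database": db}
    for a, b, exp, db in edge_rows
]

def neighbors(protein, min_experimental=0.0):
    """Interaction partners of a protein, treating edges as undirected."""
    found = []
    for e in edges:
        if e["experimental"] < min_experimental:
            continue
        if e["source"] == protein:
            found.append(e["target"])
        elif e["target"] == protein:
            found.append(e["source"])
    return sorted(found)

print(neighbors("ARAF"))
print(neighbors("ARAF", min_experimental=0.5))
```

Keeping the scores on the edge is what makes filters such as "experimental evidence only" possible later in the analysis.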
Denormalization
#node1 | node1.identifier | node2 | node2.identifier | experimentally determined interaction (experimental) | database annotated (database) |
AKT1 | 9606.ENSP00000451828 | BRAF | 9606.ENSP00000419060 | 0.631 | 0 |
AKT1 | 9606.ENSP00000451828 | PIK3R1 | … | 0.558 | 0.9 |
… | … | … | … | … | … |
ARAF | … | MAPK1 | … | 0.11 | 0.5 |
ARAF | … | KRAS | … | 0.745 | 0.9 |
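Denormalization is a join: each edge row is widened with the properties of its two endpoints, copied from the node table. A minimal sketch, using only the identifiers that appear in full on the slides; identifiers that are elided there are left as `None` rather than guessed.

```python
# Node table: name -> identifier (only identifiers shown in full on the slides).
identifiers = {
    "AKT1": "9606.ENSP00000451828",
    "BRAF": "9606.ENSP00000419060",
    "ARAF": "9606.ENSP00000290277",
}

# Edge table: (node1, node2, experimental, database).
edge_rows = [
    ("AKT1", "BRAF", 0.631, 0.0),
    ("ARAF", "KRAS", 0.745, 0.9),
]

# Join: repeat the node properties inside every edge row.
denormalized = [
    (a, identifiers.get(a), b, identifiers.get(b), exp, db)
    for a, b, exp, db in edge_rows
]

for row in denormalized:
    print(row)
```

The wide table is convenient for tools that import a single file, at the cost of repeating each node's properties once per edge.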
Database of known and predicted protein-protein interactions
Search for Melanoma
Selecting Dataset
CML in KEGG
CML in STRING
Labels
Showing just Experimental Physical Interactions
CML in STRING (Experimental Physical Interactions)
Exporting
CytoScape
Melanoma - CytoScape
Task
Given this protein interaction graph, what kind of analyses can I perform with it?
RAS
B-RAF
C-RAF
MEK
ERK
PLX4072
Discovery
Roads Flow
Castelo, Campinas (OpenStreetMap, 2015)
Food Web - FishBase
Freshwater food web: Neo Martinez and Richard Williams.
(Cavoto et al., 2015)
Emergence
Knowledge Graph
RAS
B-RAF
C-RAF
MEK
ERK
PLX4072
cell adhesion
cell-cell adhesion
cell-cell adhesion mediated by cadherin
cell-cell adhesion via plasma-membrane adhesion molecules
calcium-dependent cell-cell adhesion via plasma membrane cell adhesion molecules
cellular process
biological process
developmental process
cell morphogenesis
Phenomena Graph
Discovery
Inference
RAS
B-RAF
C-RAF
MEK
ERK
PLX4072
cell adhesion
cell-cell adhesion
cell-cell adhesion mediated by cadherin
cell-cell adhesion via plasma-membrane adhesion molecules
calcium-dependent cell-cell adhesion via plasma membrane cell adhesion molecules
cellular process
biological process
developmental process
cell morphogenesis
Our Approach
Discovery
Inference
References
Acknowledgements
André Santanchè
License and Acknowledgements