1 of 77

An Introduction To

Compositional Public Health

February 19th, 2025

NYC Category Theory Seminar

2 of 77

  • Native of Upstate New York
  • BS in Biomedical Engineering from Georgia Tech
  • MS in Applied Mathematics @ Northeastern
    • Trainee Affiliate @ Roux Institute
    • Cardiooncology; Health Equity; Innovation
  • Head of JuliaHealth & Julia GSoC
    • Assist ~15,000 people

I Am Jacob Zelko! Nice To Meet You!

3 of 77

  • Dual role at GTRI & CDC
    • GTRI: Health Emerging and Advanced Technologies Division
    • CDC: Office of Science

  • Projects
    • Various CDC COVID19 responses
    • PI and Grantee Overseer
      • Transnational collaborations
      • Chronic Mental Illness

GTRI & CDC (c. 2021 - 2023)

4 of 77

Working Definition of CPH

“Compositional public health is an emerging field at the intersection of category theory, public health, and engineering which utilizes tools from applied category theory for applications across public health.”

Compositional Public Health (CPH)

5 of 77

  • Problems with strongly heterogeneous data
    • Exploring climate-related diseases like
      • Heat-related: myocardial infarction, stroke, etc.
      • Cold-related: SAD, musculoskeletal damage, etc.

  • Reuse of “knowledge discovery” pipelines
    • Imagine two processes (studies)
    • How could they compose? Simplify?

What Types of Questions?

6 of 77

Drug Development

Example Diagrams from Public Health

Causal Loop Diagrams

Stock-and-Flow Diagrams

7 of 77

Drug Development

Example Diagrams from Public Health [5]

8 of 77

Part 1: Motivation

  • Experiences
  • Public Health Work

Part 2: Background

  • CPH Examples
  • Applied Category Theory
  • ACT Tooling

Part 3: Demonstration

  • MS Overview
  • Categorical Data Science
  • Some Examples

Part 4: Conclusion

  • Open Problems
  • How to Get Involved
  • Next Steps

Structure of This Talk

9 of 77

Part 1: Motivation

10 of 77

Working Definition of CPH

“Compositional public health is an emerging field at the intersection of category theory, public health, and engineering which utilizes tools from applied category theory for applications across public health.”

Def: Compositional Public Health

11 of 77

  • Most projects I worked on:
    • Rapid turnaround
    • Mostly data science-y/descriptive projects

  • Tried to capture what was happening at high level
      • Policy level effects
      • Disease developments
      • National and subgroup effects

CDC COVID Response Pipelines

12 of 77

  • Majority of project pipelines felt “the same” but…
    • Could be a slight change in
      • Datasets
      • Project requirements
      • Relationships

  • On a ship – patching it while sinking

  • Structures robust enough to handle pipeline changes

Frustrations during Response

13 of 77

Part 2: Background

14 of 77

Working Definition of CPH

“Compositional public health is an emerging field at the intersection of category theory, public health, and engineering which utilizes tools from applied category theory for applications across public health.”

Def: Compositional Public Health

15 of 77

  • ACT approaches leveraging public health expertise

  • Often needing to wear multiple hats

Explosion of CPH Literature

[9] Baez, J., Li, X., Libkind, S., Osgood, N. D., & Patterson, E. (2022). Compositional modeling with stock and flow diagrams. arXiv preprint arXiv:2205.08373.

[10] Aduddell, R., Fairbanks, J., Kumar, A., Ocal, P. S., Patterson, E., & Shapiro, B. T. (2024). A compositional account of motifs, mechanisms, and dynamics in biochemical regulatory networks. Compositionality, 6.

[11] Libkind, S., Baas, A., Halter, M., Patterson, E., & Fairbanks, J. P. (2022). An algebraic framework for structured epidemic modelling. Philosophical Transactions of the Royal Society A, 380(2233), 20210309.

… and much more!

16 of 77

  • Rapid development of pandemic models
  • Updating model components: costly
  • Framework to capture domain-specific knowledge
  • Bridge ideas and the code that implements them

Framework for Epidemic Modeling [11]

17 of 77

  • Widely used in epidemiology (population dynamics)
  • Compositionality via decorated cospans
  • Separates syntax from semantics
  • StockFlow.jl software

Composing Stock and Flow Diagrams [9]
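To make the idea of composing stock-and-flow diagrams concrete, here is a minimal sketch in plain Python. It is not the StockFlow.jl API and only loosely imitates decorated cospans: two open models are glued along stocks they both expose, and shared stocks are identified by name. The class names (`Flow`, `OpenStockFlow`), the functions `compose` and `euler_step`, and all rate constants are assumptions made up for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

State = Dict[str, float]


@dataclass(frozen=True)
class Flow:
    name: str
    source: str                      # stock the flow drains (syntax)
    target: str                      # stock the flow fills (syntax)
    rate: Callable[[State], float]   # how fast it moves (semantics)


@dataclass(frozen=True)
class OpenStockFlow:
    stocks: FrozenSet[str]
    flows: Tuple[Flow, ...]
    interface: FrozenSet[str]        # stocks exposed for gluing


def compose(a: OpenStockFlow, b: OpenStockFlow) -> OpenStockFlow:
    """Glue two open models along the interface stocks they share."""
    shared = a.interface & b.interface
    if not shared:
        raise ValueError("no shared interface stocks to glue along")
    return OpenStockFlow(
        stocks=a.stocks | b.stocks,
        flows=a.flows + b.flows,
        interface=(a.interface | b.interface) - shared,
    )


def euler_step(model: OpenStockFlow, state: State, dt: float) -> State:
    """One forward-Euler step: every flow moves rate * dt between stocks."""
    new_state = dict(state)
    for flow in model.flows:
        amount = flow.rate(state) * dt
        new_state[flow.source] -= amount
        new_state[flow.target] += amount
    return new_state


# Two small open models sharing the stock "I" (illustrative rates only).
infection = OpenStockFlow(
    stocks=frozenset({"S", "I"}),
    flows=(Flow("infection", "S", "I", lambda s: 0.0003 * s["S"] * s["I"]),),
    interface=frozenset({"I"}),
)
recovery = OpenStockFlow(
    stocks=frozenset({"I", "R"}),
    flows=(Flow("recovery", "I", "R", lambda s: 0.1 * s["I"]),),
    interface=frozenset({"I"}),
)

sir = compose(infection, recovery)
state = {"S": 990.0, "I": 10.0, "R": 0.0}
for _ in range(100):
    state = euler_step(sir, state, dt=0.1)
```

The diagram (stocks, flows, interface) is kept apart from the rate functions, which is a rough stand-in for the syntax/semantics separation mentioned above. Because each flow only moves material between stocks, the total population stays constant during simulation.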

18 of 77

  • CatColab is a structure editor for categorical structures:
    • Edited content is a structured object

  • Exists between text and graphical editors:
    • Provides mathematical syntax by construction
    • Provides generality across structures

CatColab – Collaborative Modeling [12]

19 of 77

CatColab Demo

20 of 77

  • CT outside of maths

  • Current rise of ACT (2010s)

Applied Category Theory (ACT)

21 of 77

  • Schematic for data stored in a database

  • Describes database components such as:
    • Tables – how data is laid out
    • Attributes – how tables relate to one another

  • Does not contain the data

What Are Database Schemas?

22 of 77

Example of Schema and Instances

Instances:

Employees
Name       | Employee ID
Jose Perez | 1

Departments
Name        | Department ID
Mathematics | 1

Schema:

Employees
Name | Employee ID

Departments
Name | Department ID

23 of 77

  • Understood as a finite category

  • Captures interactions:
    • Between rows of database instances
    • Based on columns of tables

Database Schemas as Categories

Employees

Departments
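A minimal sketch of the "schema as a finite category" idea in plain Python may help. The schema lists objects (tables) and generating arrows; an instance assigns a set of row ids to each object and a function to each arrow. The arrow name `works_in` and the helper `is_valid_instance` are hypothetical and do not come from the slides or from any library.

```python
# Schema: objects are tables, arrows are foreign-key style columns.
schema_objects = {"Employee", "Department"}
schema_arrows = {"works_in": ("Employee", "Department")}  # hypothetical arrow

# Instance: one set of row ids per object, one function (dict) per arrow.
instance_sets = {"Employee": {1, 2}, "Department": {1}}
instance_maps = {"works_in": {1: 1, 2: 1}}


def is_valid_instance(objects, arrows, sets, maps):
    """Check that each arrow is sent to a total function between the right sets."""
    if set(sets) != set(objects):
        return False
    for name, (dom, cod) in arrows.items():
        fn = maps.get(name)
        if fn is None:
            return False
        if set(fn) != sets[dom]:              # defined on every row of the domain
            return False
        if not set(fn.values()) <= sets[cod]:  # lands inside the codomain
            return False
    return True


ok = is_valid_instance(schema_objects, schema_arrows, instance_sets, instance_maps)

# A broken instance: employee 2 points at a department that does not exist.
broken_maps = {"works_in": {1: 1, 2: 7}}
broken = is_valid_instance(schema_objects, schema_arrows, instance_sets, broken_maps)
```

The schema itself holds no data; only the instance does. The validity check is the database reading of "an instance is a functor from the schema to sets", restricted here to the generating arrows.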

24 of 77

Employees & Dept. Schema

25 of 77

Employees & Dept. & Proj Schema

26 of 77

Employees & Dept. Schema

27 of 77

Employees & Dept. & Proj Schema

28 of 77

  • “Bringing Compositionality to Technical Computing” [6]

  • Open source research software written in Julia

  • Transforms ACT ideas to code

AlgebraicJulia

29 of 77

  • Several software packages exist

  • Spotlight: Catlab.jl and ACSets.jl

AlgebraicJulia

30 of 77

  • “Framework for applied [...] category theory”

  • “Programming [...] interface for applications of category theory”

  • Emphasizes monoidal categories; widely applicable

Catlab.jl

31 of 77

  • ACSets: Instances of database schemas

  • Contains data attributes

  • Enables computation within a categorical framework

Catlab & Attributed C-Sets [9]

32 of 77

Instances of Databases in ACSets.jl

Employees
Name          | Employee ID
Emilia Guzman | 2

Departments
Name        | Department ID
Mathematics | 1
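As a rough illustration of what an attributed C-set stores, the following plain-Python sketch keeps row ids, arrows between tables, and data attributes in one structure. It is not the ACSets.jl API: the class `TinyACSet`, its methods, and the `works_in` arrow are assumptions invented for this example. The names and ids are the ones from the Employees and Departments tables in the slides.

```python
class TinyACSet:
    """Toy attributed C-set: parts per object, plus hom and attribute columns."""

    def __init__(self, objects, homs, attrs):
        self.homs = dict(homs)      # arrow name -> (domain object, codomain object)
        self.attrs = dict(attrs)    # attribute name -> domain object
        self.nparts = {ob: 0 for ob in objects}
        self.columns = {name: {} for name in list(homs) + list(attrs)}

    def add_part(self, ob, **values):
        """Add a row to table `ob` and return its id (ids start at 1)."""
        self.nparts[ob] += 1
        part = self.nparts[ob]
        for name, value in values.items():
            dom = self.homs[name][0] if name in self.homs else self.attrs[name]
            if dom != ob:
                raise ValueError(f"{name} is not a column of {ob}")
            if name in self.homs:
                cod = self.homs[name][1]
                if not 1 <= value <= self.nparts[cod]:
                    raise ValueError(f"{name} must point at an existing {cod}")
            self.columns[name][part] = value
        return part

    def subpart(self, part, name):
        """Look up one column value for one row."""
        return self.columns[name][part]

    def incident(self, value, name):
        """All rows whose column `name` equals `value` (a reverse lookup)."""
        return [p for p, v in sorted(self.columns[name].items()) if v == value]


db = TinyACSet(
    objects=["Employee", "Department"],
    homs={"works_in": ("Employee", "Department")},  # hypothetical arrow
    attrs={"employee_name": "Employee", "department_name": "Department"},
)
math_dept = db.add_part("Department", department_name="Mathematics")
jose = db.add_part("Employee", employee_name="Jose Perez", works_in=math_dept)
emilia = db.add_part("Employee", employee_name="Emilia Guzman", works_in=math_dept)
```

Arrows (`works_in`) are checked against existing rows, while attributes (`employee_name`) carry ordinary data. That split between combinatorial structure and attached data is the point of the "attributed" in ACSets.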

33 of 77

  • Epi diagrams as categories
    • Stock-and-Flow: StockFlow

  • ACSets can present epi diagrams

  • Framework for computation

ACSets.jl & CPH Applications [9]

34 of 77

  • Mixes public health domain needs with ACT approaches

  • Bridging mathematicians and public health practitioners

  • Ideally, no need for “hybrids” – only hybrid approaches

Ultimate Goal of CPH

“Compositional public health is an emerging field at the intersection of category theory, public health, and engineering which utilizes tools from applied category theory for applications across public health.”

35 of 77

Part 3: Demonstrations

36 of 77

  1. Explore categorical data science techniques in knowledge discovery involving heterogeneous data

  2. Prototype realistic public health ACT study

  3. Push boundaries of ACT and what research questions can be answered using SOTA tools

MS Thesis Goals

37 of 77

  • Using categorical computational structures for data science

  • More about integration of data assets

  • Provides a place “to do” data science*
    • *to some extent

Categorical Data Science

38 of 77

  • Dataset Selection (Done)

  • ACSet-ification of Data (Almost Done)

  • Exploring results (Started)

Approach

39 of 77

  1. Dataset Selections

40 of 77

  • IPUMS Data Services

  • Internationally available and free
    • IPUMS.jl exists!

  • Used Current Population Survey data

Census and Survey Microdata

41 of 77

IPUMS Demo

42 of 77

  • National Centers for Environmental Information

  • Nationally available, free, and COMPREHENSIVE!
    • NCEI.jl exists!

  • Used NOAA Monthly U.S. Climate Divisional Database (NClimDiv)

Weather and Climate Data

43 of 77

NCEI Demo (Skipped)

44 of 77

  • Synthea OMOP CDM Patient Medical Claims

  • Nationally available, free, and synthetic
    • Many OMOP Julia tools exist

  • Used a SQLite database of ~2,500 patients

Patient Data

45 of 77

SYNTHEA Demo

46 of 77

Remarks

  • Data selected intentionally to be strongly heterogeneous

  • Each data set represents its own problem space

  • Mock-ups of the real scenarios I am interested in

General Remarks about Datasets

47 of 77

Research Demo Draft

48 of 77

2. ACSet-ification of Data

49 of 77

ACSet-ification Demo

50 of 77

Conjunctive Queries on Junction Tables

Credit to Dr. Sean Wu and Matt Cuffaro here!
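For readers unfamiliar with the term, a conjunctive query is a conjunction of table lookups that share variables, and a junction table is what links two tables in a many-to-many relationship. The sketch below is a plain-Python illustration only: it is not the query code from the demo, and the table names and rows are made-up toy data.

```python
# Toy tables (made-up rows, for illustration only).
patients = [(1, "patient A"), (2, "patient B")]
conditions = [(10, "stroke"), (11, "myocardial infarction")]
# Junction table: one row per (patient id, condition id) link.
diagnoses = [(1, 10), (2, 10), (2, 11)]


def conjunctive_query(patients, diagnoses, conditions):
    """Q(pname, cname) :- Patient(p, pname), Diagnosis(p, c), Condition(c, cname).

    The shared variables p and c are what tie the three atoms together.
    """
    return sorted(
        (pname, cname)
        for (p, pname) in patients
        for (dp, dc) in diagnoses
        for (c, cname) in conditions
        if p == dp and dc == c
    )


result = conjunctive_query(patients, diagnoses, conditions)
```

The nested loops are the naive evaluation strategy; a real system would use indexes or joins, but the result is the same set of matches.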

51 of 77

Remarks

  • ACSet-ification is non-trivial

  • ACSet-ification requires domain expertise

  • Why ACSet-ify?

Reflections on ACSet-ification

52 of 77

Remarks

  • Ripe for exploration and innovation

  • Can play a role in simplification and improving pipelines

  • Problem domain space continues to grow
    • Applicability within traditional data science workflows
    • Larger than memory data handling

Thoughts on Categorical Data Science

53 of 77

Part 4: Conclusion

54 of 77

Working Definition of CPH

“Compositional public health is an emerging field at the intersection of category theory, public health, and engineering which utilizes tools from applied category theory for applications across public health.”

Compositional Public Health (CPH)

55 of 77

Papers and Open Questions

  • What do we need to enable multimodal health informatics?
    • Sheaves Are the Canonical Data Structure for Sensor Integration by Robinson

  • How can we use spatio-temporal sheaves to model epidemics?
    • Towards a Unified Theory of Time-Varying Data by Bumpus, et al.

  • What would a sheaf-based query language look like?
    • Algebraic Databases by Schultz, et al.

  • How do you model causal processes in ACT?
    • Causal Theories: A Categorical Perspective on Bayesian Networks by Brendan Fong

Papers and Open Questions in CPH

56 of 77

How To Get Involved

Join CatColab!

Join AlgebraicJulia!

Get Involved!

57 of 77

Resources

  • ACT4ED @ MIT (Textbook, course)

  • CMPT856 Categorical mathematics for compositional modeling @ Uni. Saskatchewan (YouTube, course)

  • AlgebraicJulia & Topos Inst. Blog

Some Resources

58 of 77

Thoughts

  • Applying Applied Category Theory is hard - Sean Wu

  • Compositional Public Health is just one application

  • Wish list study areas as an abstractionist:
    • Double Category Theory
    • Sheaf Theory
    • Topos Theory

Final Thoughts

59 of 77

  • Write paper on CPH with Prof. Nathaniel Osgood

  • Coordinate CPH workshop in Canada
    • Banff International Research Station

  • Operationalize AlgebraicJulia further

Next Steps

60 of 77

  • ACT 2025 proposal submission

  • Finish MS thesis and pursue PhD exploring potential

  • Explore collaborations, mentoring, and contributions

Next Steps

61 of 77

  • Prof. Nathaniel Osgood
  • Matt Cuffaro
  • Dr. Sean Lawrence Wu
  • Lea Luchterhand
  • Dr. Evan Patterson
  • Prof. James Fairbanks
  • Marius Furter
  • Yujun Huang
  • Prof. Emilio Minichiello
  • Dr. Kris Brown
  • Dr. Sophie Libkind
  • Owen Lynch
  • Megan Denham
  • Jon Duke, MD
  • Prof. Justin Manjourides
  • Dr. Kevin Carlson
  • Prof. Noson Yanofsky

Acknowledgements

62 of 77

“‘Your system is just a component in another person’s system’

- Gioele Zardini’”

- Jacob S. Zelko

Questions?

64 of 77

References

[1] Gatseva, P. D., & Argirova, M. (2011). Public health: the science of promoting health. Journal of Public Health, 19, 205-206.

[2] Leinster, T. (2014). Basic category theory (Vol. 143). Cambridge University Press.

[3] https://ncatlab.org/nlab/show/HomePage

[4] Schultz, P., Spivak, D. I., Vasilakopoulou, C., & Wisnesky, R. (2016). Algebraic databases. arXiv preprint arXiv:1602.03501.

[5] Patterson, E. (2022). AlgebraicJulia: A compositional approach to technical computing.

[6] Lynch, O., & Fairbanks, J. (2023). Computational category theory in applied mathematics.

[7] Patterson, E. (2020). Graphs and C-sets I: What is a graph?

[8] Patterson, E., Lynch, O., & Fairbanks, J. (2022). Categorical data structures for technical computing. Compositionality, 4.

[12] CatColab. Patterson, E. (2024). Toward collaborative modeling with categorical logics. Topos Institute Berkeley Seminar.

References

65 of 77

Supporting Slides

66 of 77

Category Theory

67 of 77

General Idea:

“Category theory takes a bird’s eye view of mathematics. From high in the sky, details become invisible, but we can spot patterns that were impossible to detect from ground level.” - Tom Leinster, Basic Category Theory [2]

“How things relate to things”

Quick Summary of Category Theory

68 of 77

  • Fundamental data structure is a category equipped with:
    • Collection of Objects
    • Morphisms (Arrows):
      • f: A → B
    • Notation: C

Basics of Categories – Definition

69 of 77

  • Composition: g ◦ f : A → C

  • Associativity: h ◦ (g ◦ f ) = (h ◦ g) ◦ f

  • Unit: f ◦ 1_A = f = 1_B ◦ f

  • Identity: 1_A : A → A

Basics of Categories – Properties

  • f : A → B
  • g : B → C
  • h : C → D
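The laws above can be spot-checked in plain Python by treating types as objects and functions as arrows. The functions `f`, `g`, `h` below are arbitrary choices for this sketch, and checking a handful of sample inputs illustrates the laws rather than proving them.

```python
def compose(g, f):
    """Return g ∘ f, i.e. apply f first, then g."""
    return lambda x: g(f(x))


def identity(x):
    return x


f = lambda n: n + 1      # f : int -> int
g = lambda n: str(n)     # g : int -> str
h = lambda s: len(s)     # h : str -> int

samples = [0, 9, 99, 12345]

# Associativity: h ∘ (g ∘ f) == (h ∘ g) ∘ f
left = compose(h, compose(g, f))
right = compose(compose(h, g), f)
associative = all(left(x) == right(x) for x in samples)

# Unit laws: f ∘ 1_A == f == 1_B ∘ f
unital = all(
    compose(f, identity)(x) == f(x) == compose(identity, f)(x) for x in samples
)
```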

70 of 77

  • Functor: mapping of objects to objects, arrows to arrows
    • F: C → D

  • Properties:
    • F(f : A → B) = F(f) : F(A) → F(B)
    • F(g ◦ f) = F(g) ◦ F(f)
    • F(1_A) = 1_F(A)

Basics of Categories – Functors
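A familiar example of a functor from programming is "list of": it sends a type A to lists of A, and a function f to "apply f to every element". The sketch below spot-checks the two functor laws on sample inputs. The names `fmap`, `f`, and `g` are choices made for this illustration.

```python
def compose(g, f):
    """Return g ∘ f, i.e. apply f first, then g."""
    return lambda x: g(f(x))


def identity(x):
    return x


def fmap(f):
    """Action of the list functor on an arrow f: apply f elementwise."""
    return lambda xs: [f(x) for x in xs]


f = lambda n: n + 1
g = lambda n: n * 2

samples = [[], [0], [1, 2, 3]]

# F(g ∘ f) == F(g) ∘ F(f)
preserves_composition = all(
    fmap(compose(g, f))(xs) == compose(fmap(g), fmap(f))(xs) for xs in samples
)

# F(1_A) == 1_F(A)
preserves_identity = all(fmap(identity)(xs) == identity(xs) for xs in samples)
```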

71 of 77

ACSets Details

72 of 77

  • Given a small category C, a C-set is a functor X: C → Set

  • A C-set X consists of:
    • A set X(c) for each object c in C
    • A function X(f): X(c) → X(d) for each morphism f: c → d in C

C-Sets [7]
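The standard first example is a directed graph: take C to be the category with two objects, E and V, and two parallel arrows src, tgt: E → V. A C-set is then a set of edges, a set of vertices, and two functions between them. The sketch below encodes this in plain Python; the dictionary layout and the helper names `is_c_set` and `out_neighbors` are assumptions for this illustration, not Catlab.jl code.

```python
# A C-set for C = (E ⇉ V): the path graph 1 -> 2 -> 3.
graph = {
    "V": {1, 2, 3},          # X(V): vertices
    "E": {1, 2},             # X(E): edges
    "src": {1: 1, 2: 2},     # X(src): X(E) -> X(V)
    "tgt": {1: 2, 2: 3},     # X(tgt): X(E) -> X(V)
}


def is_c_set(x):
    """Check that src and tgt are total functions from edges to vertices."""
    for arrow in ("src", "tgt"):
        fn = x[arrow]
        if set(fn) != x["E"]:
            return False
        if not set(fn.values()) <= x["V"]:
            return False
    return True


def out_neighbors(x, v):
    """Vertices reachable from v along a single edge."""
    return sorted(x["tgt"][e] for e in x["E"] if x["src"][e] == v)


valid = is_c_set(graph)

# An edge whose target is not a vertex breaks the functor condition.
dangling = dict(graph, tgt={1: 2, 2: 9})
invalid = is_c_set(dangling)
```

Swapping in a different small category C gives other combinatorial structures with the same pattern of "one set per object, one function per arrow".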

73 of 77

Special Categories

74 of 77

Category of StockFlow

75 of 77

Category of Whole-Grain Petri Nets

76 of 77

Category: StockFlow

77 of 77

Category: Whole-Grain Petri Nets