1 of 51

Study Design and Analysis in Epidemiology: Where does modelling fit?

MMED 2025

Carl

Earlier contributors: Cari van Schalkwyk & Steve Bellan

Slide Set Citation: DOI:10.6084/m9.figshare.5038784

The ICI3D Figshare Collection


2 of 51

Goals

  • Define Epidemiology
  • Discuss metrics of epidemiology
    • Measures of disease
    • Measures of effect
  • Describe basic study designs used in epidemiology
  • Discuss error, bias and confounding
  • Compare and contrast statistical models and dynamic models


3 of 51

Defining Epidemiology

“The study of the distribution and determinants of health-related states and events in populations, and the application of this study to control health problems.”

John M Last, Dictionary of Epidemiology

4 of 51

Varieties of Infectious Disease Epidemiology

  • Risk Factors & Intervention Epidemiology

Risk Factor: A characteristic that is correlated with a measure of disease.

  • Often used synonymously with covariate.
  • Protective factors: Risk factors that are negatively associated with disease

5 of 51

Varieties of Infectious Disease Epidemiology

  • Risk Factors & Intervention
  • Outbreak
  • Clinical
  • Surveillance

6 of 51

How does mathematical modeling fit?

  • Subfield: Linking pattern with process across scales
  • Methodologies used in other epi subfields

Importance of knowledge breadth

8 of 51

Measures of Disease

  • Prevalence:
    • Proportion of the population with infection or disease at a given time point

  • Incidence:
    • The number of new cases that occur within a given population and period of time
    • Sometimes stated as a raw number of cases per time window, such as 1000 cases of influenza in a month
    • For comparison of incidence between populations (or within a population that is changing in size), incidence should be standardized (e.g., on a per capita basis or per 100,000 population per time period)

  • Survival (time to event, e.g., death)
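A minimal sketch of the standardization step described above. The 1000 influenza cases in a month come from the slide; the two population sizes are hypothetical values chosen only to show why raw counts are hard to compare.

```python
def incidence_per_100k(new_cases: int, population: int) -> float:
    """New cases per 100,000 population over the observation window."""
    return new_cases / population * 100_000

# 1000 cases in one month (from the slide) in two hypothetical populations.
small_town = incidence_per_100k(1000, 50_000)      # assumed population size
large_city = incidence_per_100k(1000, 2_000_000)   # assumed population size

print(f"Small town: {small_town:.0f} per 100,000 per month")
print(f"Large city: {large_city:.0f} per 100,000 per month")
```

The same raw count corresponds to very different per capita incidence once population size is taken into account.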


9 of 51

Measures of Covariates (risk factors)

  • Binary: smoker, circumcised
  • Nominal/Categorical: geographic region
  • Continuous: birth weight, T-cell count
  • Ordinal: education, socioeconomic status (SES)

10 of 51

Measures of Effect

  • How do you measure the effect of a risk factor on a disease?

Example

How could you measure whether circumcision influences the risk of HIV infection?

11 of 51

Measures of Effect

    • Compare measure of disease across levels/values of risk factors
    • Relative Risk - magnitude of association between risk factor & disease

    • Ratio of rates or proportions
        • Prevalence Ratio
        • Incidence rate Ratio
        • Odds Ratio

    • Attributable risk – public health impact of risk factor on disease
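One common way to write these two measures, offered as a sketch rather than the slide's own notation. The symbols R_e and R_0 (the measure of disease in the exposed and unexposed groups) are assumptions introduced here, and attributable risk is taken in its difference form, which is only one of several definitions in use.

```latex
\[
\mathrm{RR} = \frac{R_e}{R_0},
\qquad
\mathrm{AR} = R_e - R_0
\]
```

With no association between risk factor and disease, RR = 1 and AR = 0.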

13 of 51

Epidemiological Studies

  • Descriptive Epidemiology
    • Baseline data on distribution of disease
    • Surveillance
  • Analytic Epidemiology – Measure Effect
    • Ecological Studies
    • Cross-sectional Studies
    • Retrospective Case-Control Studies
    • Prospective Cohort Studies
    • Randomized Controlled Trials

  • Mathematical epidemiology
    • Mechanistically understand how the effects of risk factors lead to the distribution of disease

[Slide labels: Observational; Experimental; Most studies are both]

14 of 51

Ecological Studies

  • Measurements made at population rather than individual level.
  • Weaker inference, but easier to gather data.

a.k.a. correlational studies

15 of 51


Bousquet et al (2021) Cabbage and fermented vegetables: From death rate heterogeneity in countries to candidates for mitigation strategies of severe COVID-19. Allergy DOI: 10.1111/all.14549

16 of 51

Example

How could you measure whether circumcision influences the risk of HIV infection?

17 of 51

Ecological study: HIV prevalence (UNAIDS) vs circumcision prevalence (DHS)

[Figure; axes: circumcision prevalence, HIV prevalence]

Williams et al. The Potential Impact of Male Circumcision on HIV in Sub-Saharan Africa. Plos Medicine 2006

18 of 51

Cross-Sectional Studies

  • Snapshot of diseases & risk factors.
  • Cannot establish temporal relationship.
  • Relatively cheap & easy.
  • Population must be large to study rare disease.
  • Not great for diseases of short duration. Why?

a.k.a. surveys, prevalence studies

19 of 51

Relative Risk: Prevalence Ratio

Contingency table or 2x2 table:

                   Disease   No Disease   Total (Margins)
Exposed            a         b            a+b
Not exposed        c         d            c+d
Total (Margins)    a+c       b+d          a+b+c+d

Prevalence Ratio (PR): prevalence in exposed population divided by prevalence in unexposed population.

\[ \mathrm{PR} = \frac{a/(a+b)}{c/(c+d)} \]

PR < 1: exposure correlates with reduced risk of disease

PR > 1: exposure correlates with increased risk of disease

20 of 51

Example

How could you measure whether circumcision influences the risk of HIV infection?

21 of 51

Cross-sectional study: HIV prevalence (DHS) vs circumcision prevalence (DHS)

Njeuhmeli et al. Voluntary Medical Male Circumcision: Modeling the Impact and Cost of Expanding Male Circumcision for HIV Prevention in Eastern and Southern Africa. Plos Medicine 2011

SA 2016 DHS:

                   HIV+   HIV-   Total
Circumcised        143    1121   1264
Not Circumcised    172    760    932
Total              315    1881   2196
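As a sketch, the prevalence ratio for the counts above can be computed directly from the 2x2 layout, treating circumcision as the exposure. The function name `prevalence_ratio` is a helper introduced here for illustration.

```python
def prevalence_ratio(a: int, b: int, c: int, d: int) -> float:
    """PR for a 2x2 table: a, b = exposed with/without disease; c, d = unexposed."""
    prevalence_exposed = a / (a + b)
    prevalence_unexposed = c / (c + d)
    return prevalence_exposed / prevalence_unexposed

# Counts from the table above: circumcised men are the "exposed" group.
pr = prevalence_ratio(a=143, b=1121, c=172, d=760)
print(f"Prevalence ratio = {pr:.2f}")
```

A value below 1 means HIV prevalence was lower among circumcised men in this survey; being cross-sectional, it says nothing about temporal order.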

 

22 of 51

Case-Control Studies

  • Compare diseased individuals to chosen controls.
    • Quality of study depends entirely on how controls are chosen.

  • Good for rare diseases.

  • Relatively cheap & quick.


23 of 51

Case Control Studies: Odds Ratios

                   Disease   No Disease   Total (Margins)
Exposed            a         b            a+b
Not exposed        c         d            c+d
Total (Margins)    a+c       b+d          a+b+c+d

Controls: Number chosen by researcher.

Odds ratio (OR): the odds of exposure among the diseased (cases) divided by the odds of exposure among the non-diseased (controls).

\[ \mathrm{OR} = \frac{a/c}{b/d} = \frac{ad}{bc} \]

OR < 1 means exposure correlates with reduced risk of disease

OR > 1 means exposure correlates with increased risk of disease

24 of 51

Example

How could you measure whether circumcision influences the risk of HIV infection?

25 of 51

Case-control study:

  • Sassan-Morokro et al, High rates of sexual contact with female sex workers, sexually transmitted diseases, and condom neglect among HIV-infected and uninfected men with tuberculosis in Abidjan, Cote d'Ivoire, JAIDS, 1996

                   HIV+   HIV-   Total
Circumcised        15     19     34
Not Circumcised    475    220    695
Total              490    239    729

                   HIV+   HIV-   Total
Circumcised        48     121    169
Not Circumcised    101    272    373
Total              149    393    542

  • Quigley et al. Sexual behaviour patterns and other risk factors for HIV infection in rural Tanzania: a case–control study, AIDS, 1997
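A short sketch of the odds ratio calculation for the two case-control tables on this slide, in the order they appear. The helper name `odds_ratio` is introduced here for illustration, and circumcision is treated as the exposure.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR for a 2x2 table: (a/c) / (b/d), equivalently (a*d) / (b*c)."""
    return (a * d) / (b * c)

# a, b = circumcised HIV+/HIV-; c, d = not circumcised HIV+/HIV-.
first_table = odds_ratio(a=15, b=19, c=475, d=220)
second_table = odds_ratio(a=48, b=121, c=101, d=272)

print(f"First table:  OR = {first_table:.2f}")
print(f"Second table: OR = {second_table:.2f}")
```

The two studies give different answers: the first OR is well below 1 and the second is close to 1, a reminder that results depend heavily on how cases and controls were chosen.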

26 of 51

Cohort Studies

  • Follow a selected population through time
    • Establishes temporal relationships
    • Can measure incidence
  • Takes lots of resources, money, & time!
  • Poor design for rare diseases.

a.k.a. prospective studies, longitudinal studies, follow-up studies

28 of 51

Relative Risk: Cumulative Incidence Ratio

                   Disease   No Disease   Total (Margins)
Exposed            a         b            a+b
Not exposed        c         d            c+d
Total (Margins)    a+c       b+d          a+b+c+d

Cumulative Incidence Ratio (CIR): cumulative incidence in exposed population divided by cumulative incidence in unexposed population.

\[ \mathrm{CIR} = \frac{a/(a+b)}{c/(c+d)} \]

CIR < 1: exposure correlates with reduced risk of disease

CIR > 1: exposure correlates with increased risk of disease

29 of 51

Cohort Data and Person-Time

[Figure: cohort follow-up timelines. X marks occurrence of disease X; O marks death. Labels: time joining the study, immunity, death.]

Cumulative incidence = 5 / 12 = 0.42

30 of 51

Cohort Data and Person-Time

[Figure: cohort follow-up timelines. X marks occurrence of disease X; O marks death.]

Incidence rate

  Number of disease occurrences = 5
  Total person-time (X confers immunity) = 26 years
  Incidence rate = 5/26 = 0.19 per person-year

31 of 51

Relative Risk: Incidence Rate Ratios

                   Disease   No Disease   Total (Margins)
Exposed            a         -            PYe
Not exposed        c         -            PY0
Total (Margins)    a+c       -            PYe + PY0

PYe and PY0 are the person-years of follow-up in the exposed and unexposed populations.

Incidence Rate Ratio (IRR): the ratio of the incidence rate of the exposed population to that of the unexposed population.

\[ \mathrm{IRR} = \frac{a/\mathrm{PY}_e}{c/\mathrm{PY}_0} \]

IRR < 1 means exposure correlates with reduced risk of disease

IRR > 1 means exposure correlates with increased risk of disease

32 of 51

Example

How could you measure whether circumcision influences the risk of HIV infection?

33 of 51

Cohort study:

Cameron et al. Female to male transmission of human immunodeficiency virus type 1: risk factors for seroconversion in men. The Lancet 1989

                   HIV+   HIV-   Total Pwks (person-weeks)
Circumcised        6      -      2966
Not circumcised    18     -      1016
Total (Margins)    24     -      3982

  • Nairobi, Kenya 1987
  • 293 clients of female sex workers (73% circumcised)
  • Mean follow-up of 14 weeks
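A sketch of the incidence rate ratio for the cohort counts above, using person-weeks as the denominator. The function name and the per-100-person-weeks scaling are choices made here for readability.

```python
def incidence_rate(cases: int, person_time: float) -> float:
    """Events per unit of person-time at risk."""
    return cases / person_time

# Counts and person-weeks from the cohort table above.
rate_circumcised = incidence_rate(6, 2966)
rate_uncircumcised = incidence_rate(18, 1016)
irr = rate_circumcised / rate_uncircumcised

print(f"Circumcised:     {rate_circumcised * 100:.2f} per 100 person-weeks")
print(f"Not circumcised: {rate_uncircumcised * 100:.2f} per 100 person-weeks")
print(f"Incidence rate ratio = {irr:.2f}")
```

Because the units of person-time cancel in the ratio, the IRR is the same whether follow-up is counted in weeks or years.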

34 of 51

Randomised Controlled Trials

  • Experimental or Intervention Studies
  • Establishes temporal relationships
  • Addresses confounding (more to come)
  • Causal evidence
  • Another in-depth lecture on this follows!

35 of 51

Example

How could you measure whether circumcision influences the risk of HIV infection?

36 of 51

Randomised Controlled Trial:

Auvert et al. Randomized, Controlled Intervention Trial of Male Circumcision for Reduction of HIV Infection Risk: The ANRS 1265 Trial. Plos Medicine 2005

                   HIV+   HIV-   Total PYs
Circumcised        20     -      2,354
Not circumcised    49     -      2,339
Total (Margins)    69     -      4,693

  • Johannesburg, SA 2003
  • 3274 men aged 18-24
  • Half circumcised at 0 months, other half offered at end

38 of 51

Random Error

  • How many people must be in a study for the measure of effect to be believable?

  • Statistical Approach: Assign probabilities to our findings being a product of random error rather than a real phenomenon.

Estimate = Truth + Bias + Random error
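One way to quantify random error, sketched with the counts from the Randomised Controlled Trial slide (20 infections in 2,354 person-years vs 49 in 2,339). The large-sample (Wald) interval on the log scale is an assumption made here; the slides do not specify a method, and the function name is hypothetical.

```python
import math

def rate_ratio_ci(cases_exp, time_exp, cases_unexp, time_unexp, z=1.96):
    """Incidence rate ratio with an approximate (Wald, log-scale) confidence interval."""
    irr = (cases_exp / time_exp) / (cases_unexp / time_unexp)
    # Standard error of log(IRR) depends only on the event counts.
    se_log = math.sqrt(1 / cases_exp + 1 / cases_unexp)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper

irr, lower, upper = rate_ratio_ci(20, 2354, 49, 2339)
print(f"IRR = {irr:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

The interval lies entirely below 1, so random error alone is an unlikely explanation for the lower incidence in the circumcised arm. With fewer events the same ratio would come with a much wider interval.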

39 of 51

Bias

Difference between observed value and true value due to all causes other than random error.

Bias does not go away with greater sample size!

Bias must be dealt with during study design!

40 of 51

Selection Bias

Error due to systematic differences between those who take part in the study and those who do not.

John Last, Dictionary of Epidemiology

Information Bias

A flaw in measuring exposure or outcome data that results in different quality (accuracy) of information between comparison groups.

John Last, Dictionary of Epidemiology

41 of 51

Survivor Bias


42 of 51


Non-response Bias

43 of 51

Confounding

                 HIV Status
Literacy         HIV+   HIV-
Literate         660    340
Illiterate       180    820

What if some of the study population were much younger than others?

44 of 51

Confounding

Pooled           HIV+   HIV-
Literate         660    340
Illiterate       180    820

6-15 years old   HIV+   HIV-
Literate         30     270
Illiterate       90     810

16-24 years old  HIV+   HIV-
Literate         630    70
Illiterate       90     10

6-15 year olds: Literacy = 300/1200 = 25%

16-24 year olds: Literacy = 700/800 = 87.5%
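A sketch of what the three tables above imply, comparing the pooled prevalence ratio with the age-specific ones. Literacy is treated as the exposure; the helper name `prevalence_ratio` is introduced here for illustration.

```python
def prevalence_ratio(a: int, b: int, c: int, d: int) -> float:
    """PR for a 2x2 table: a, b = exposed with/without disease; c, d = unexposed."""
    return (a / (a + b)) / (c / (c + d))

# (literate HIV+, literate HIV-, illiterate HIV+, illiterate HIV-)
tables = {
    "Pooled": (660, 340, 180, 820),
    "6-15 years old": (30, 270, 90, 810),
    "16-24 years old": (630, 70, 90, 10),
}

ratios = {name: prevalence_ratio(*counts) for name, counts in tables.items()}
for name, pr in ratios.items():
    print(f"{name}: PR = {pr:.2f}")
```

Pooled, literacy appears strongly associated with HIV, yet within each age group the ratio is 1. The apparent effect comes entirely from older participants being both more literate and more often HIV+.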

 

 

 

45 of 51

Confounding

                 HIV Status
Literacy         HIV+   HIV-
Literate         660    340
Illiterate       180    820

[Diagram: Age linked to both Literacy and HIV Status]

CONFOUNDING

46 of 51

How do you deal with confounding?

  • Study Design
    • Randomize participants to treatment groups (RCTs)
    • Restrict participants to a single level of the confounder

  • Study Analysis
    • Multivariable modeling estimates the effect of one risk factor while statistically adjusting for the confounder
    • Stratification conducts a separate analysis for each level of the confounder
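After stratifying, the stratum-specific results are often combined into one adjusted estimate. The sketch below uses the Mantel-Haenszel risk ratio, a standard choice that the slides do not name, so treat it as an assumption. It is applied to the age-stratified literacy and HIV tables from the confounding example.

```python
def mantel_haenszel_ratio(strata):
    """Mantel-Haenszel pooled ratio over 2x2 tables given as (a, b, c, d).

    a, b = exposed with/without disease; c, d = unexposed with/without disease.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        numerator += a * (c + d) / total
        denominator += c * (a + b) / total
    return numerator / denominator

age_strata = [
    (30, 270, 90, 810),   # 6-15 years old
    (630, 70, 90, 10),    # 16-24 years old
]
crude_table = [(660, 340, 180, 820)]  # pooled table, treated as one stratum

adjusted = mantel_haenszel_ratio(age_strata)
crude = mantel_haenszel_ratio(crude_table)
print(f"Crude ratio = {crude:.2f}, age-adjusted ratio = {adjusted:.2f}")
```

With a single stratum the formula reduces to the ordinary ratio, which makes the crude and adjusted values directly comparable.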


48 of 51

Statistical Models

  • Assume independence of individuals
  • Account for random error and bias in data to assess relationships and find correlations that may imply causality

Dynamical Models

  • Explicitly focus on inter-dependence of individuals (X transmitted disease to Y)
  • Explicitly model (hypothetically ‘declare’) specific mechanisms, to explore their interactions

By developing dynamic models in a probabilistic framework, we can account for dependence, random error, and bias while linking patterns at multiple scales.

49 of 51

Questions in Epidemiology

Statistical Models

  • Is male circumcision effective against HIV acquisition?

Dynamic Models

  • By how much would population-level HIV incidence decrease if we circumcise 50% of all men?
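A deliberately simple sketch of why the second question needs a dynamic model. Everything here is a toy assumption: a homogeneously mixing SIR model (not a realistic description of HIV), an arbitrary transmission rate, and an efficacy of 0.6 chosen to be roughly 1 minus the rate ratio in the trial table. The only point is that infections depend on other infections, so the population-level effect need not equal coverage times efficacy.

```python
def cumulative_infected(beta: float, gamma: float = 0.1, i0: float = 0.001,
                        dt: float = 0.1, days: float = 2000.0) -> float:
    """Fraction ever infected in a toy SIR model, integrated with Euler steps."""
    s, i = 1.0 - i0, i0
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt   # depends on who is already infected
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
    return 1.0 - s

BETA = 0.2        # assumed transmission rate per day (R0 = BETA / 0.1 = 2)
COVERAGE = 0.5    # fraction circumcised, from the question above
EFFICACY = 0.6    # assumed reduction in susceptibility

baseline = cumulative_infected(BETA)
with_program = cumulative_infected(BETA * (1 - COVERAGE * EFFICACY))
relative_reduction = 1 - with_program / baseline

print(f"Direct effect (coverage x efficacy): {COVERAGE * EFFICACY:.0%}")
print(f"Reduction in cumulative infections:  {relative_reduction:.0%}")
```

In this toy setting the reduction in infections exceeds the direct effect, because each prevented infection also prevents onward transmission. A statistical model that assumes independent individuals cannot capture that.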

50 of 51

Summary

  • Epidemiology seeks to measure and understand the causes of disease
  • Epidemiologists conduct observational and experimental studies to obtain data
    • Incidence and prevalence
    • RRs, ORs, attributable risk
  • Epidemiological studies are subject to bias, confounding, and variability
  • Traditionally, focus on discovering correlations that may lead to identification of causal factors
  • Dynamic models, informed via epidemiologic studies, can answer questions traditional studies can’t


51 of 51

Title: Study Design and Analysis in Epidemiology: Where does modelling fit?

https://figshare.com/collections/International_Clinics_on_Infectious_Disease_Dynamics_and_Data/3788224

This presentation is made available through a Creative Commons Attribution-Noncommercial license. Details of the license and permitted uses are available at http://creativecommons.org/licenses/by/3.0/

Attribution:

Clinic on the Meaningful Modeling of Epidemiological Data

Source URL:

For further information please contact figshare@ici3d.org.

© 2014-2021 International Clinics on Infectious Disease Dynamics and Data