1 of 31

Federal Data Training Call: Cases and Deaths

February 25, 2021

2 of 31

Welcome to Our Second Data Training!

  • We’ll introduce you to two federal datasets and tell you what we’ve learned about them.
  • You can put questions into chat anytime, and they’ll go to the moderators.
  • We’ll answer as many questions as we can at the end of the call.
  • These are peer trainings—we’re not part of the government! Please contact us for coherent on-the-record quotes.

3 of 31

The Data We Have Offered at CTP

Our three main datasets cover:

  1. testing, hospitalization, and outcomes data,
  2. long-term care facility cases and deaths, and
  3. race and ethnicity demographic data

4 of 31

Federal Cases Data

5 of 31

Case Data Background

  • CDC was already receiving case reports from states before COVID-19
  • Total, confirmed, and probable cases (the simplified version)
    • Confirmed = positive molecular amplification test (commonly known as a PCR test)
    • Probable = antigen positive test, specific symptoms plus exposure, or a death certificate that lists COVID-19 as cause of death
  • Aggregate vs. line-level data

Aggregate Data:

  • Totals
  • Daily
  • Up to date

Line-level Data:

  • Line per case
  • Monthly
  • Time consuming
  • Additional demographic info
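
To make the aggregate vs. line-level distinction concrete, here is a minimal sketch of how one-line-per-case records collapse into the kind of totals an aggregate dataset publishes. The records and the field names `state` and `status` are hypothetical and are not the CDC's actual schema.

```python
from collections import Counter

# Hypothetical line-level records: one dict per case.
# Field names are illustrative only, not the CDC's schema.
line_level = [
    {"state": "NY", "status": "confirmed"},
    {"state": "NY", "status": "probable"},
    {"state": "NY", "status": "confirmed"},
    {"state": "TX", "status": "confirmed"},
]

def aggregate(records):
    """Collapse one-line-per-case records into per-state counts."""
    totals = {}
    for rec in records:
        counts = totals.setdefault(rec["state"], Counter())
        counts[rec["status"]] += 1   # confirmed or probable
        counts["total"] += 1         # total = confirmed + probable
    return totals

totals = aggregate(line_level)
for state, counts in sorted(totals.items()):
    print(state, dict(counts))
```

Going the other way is not possible: once cases are summed, the per-case demographic detail is gone, which is why the line-level files carry the additional information.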

6 of 31

Where To Find It, and How to Use it

Data.CDC.gov

  • Available as CSV, XML, API and more
  • Possible to filter, sort, and visualize the data before download
  • Updates twice daily

https://data.cdc.gov/Case-Surveillance/United-States-COVID-19-Cases-and-Deaths-by-State-o/9mfq-cb36
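
For API access, a query URL can be assembled along these lines. This is a sketch only: the dataset ID comes from the link above, but the `/resource/<id>.json` path and the `$where`, `$order`, and `$limit` parameters are the usual Socrata conventions and are assumed to apply here, and the column names `state` and `submission_date` are assumptions about the schema. The code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Dataset ID taken from the Data.CDC.gov link; the endpoint pattern is assumed.
BASE = "https://data.cdc.gov/resource/9mfq-cb36.json"

def build_query(state, limit=10):
    """Return a URL asking for the most recent rows for one jurisdiction.

    The column names `state` and `submission_date` are assumptions.
    """
    params = {
        "$where": f"state = '{state}'",
        "$order": "submission_date DESC",
        "$limit": limit,
    }
    return f"{BASE}?{urlencode(params)}"

url = build_query("NY", limit=5)
print(url)
```

Opening the resulting URL in a browser, or passing it to `urllib.request.urlopen`, should return JSON rows if the assumptions hold.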

7 of 31

Additional Places to Find it

HealthData.gov: available for download as csv, rdf, json, xsl https://healthdata.gov/dataset/united-states-covid-19-cases-and-deaths-state-over-time

Featured In:

CDC Data Tracker: updated daily https://covid.cdc.gov/covid-data-tracker/#cases_casesper100klast7days

Community Profile Reports: released daily https://beta.healthdata.gov/National/COVID-19-Community-Profile-Report/gqxm-d9w9

8 of 31

What exactly is in the dataset?

9 of 31

CTP compared to the CDC data - US Total Cases

10 of 31

CTP compared to the CDC data - Total Cases by jurisdiction

11 of 31

Confirmed and Probable Case Reporting

12 of 31

What’s missing?

  • Data notes - Currently available in the downloadable Excel spreadsheet version of the Community Profile Reports (CPR)
  • Revision history - both Data.CDC.gov and HealthData.gov have this functionality available

13 of 31

The Upshot

  • The differences we’ve outlined are not meaningful blockers to using the dataset as it is today
  • It could be even better with:
    • Data notes included in the dataset
    • Increased collaboration between the federal and jurisdictional governments
  • The “United States COVID-19 Cases and Deaths by State over Time” dataset is an excellent replacement for state-reported data

Thank you

14 of 31

Federal Deaths Data

15 of 31

What datasets are available at the federal level?

CDC COVID Data Tracker

National Center for Health Statistics (NCHS)

16 of 31

Comparing with CTP data

17 of 31

CDC COVID Data Tracker

18 of 31

Overview

  • This is the same dataset containing cases information

  • Information is sourced from 60 jurisdictions, which report updated figures to the CDC each day

  • Data very closely matches information on state dashboards and CTP’s data

  • Generally, deaths are reported in confirmed, probable, and total categories.

19 of 31

What’s in the dataset? Six fields for deaths

20 of 31

Details & Caveats

  • Current reporting breakdown:
    • Confirmed and probable, as separate categories: 33 jurisdictions
    • Confirmed only: 3 jurisdictions
    • Total (appears to lump confirmed & probable together): 24 jurisdictions

  • However, deaths definitions aren’t standardized across states. Varying state criteria for categorizing deaths mean that direct comparisons can’t be made.

  • Deaths are submitted by date of report. Figures are subject to revision upon review.

  • This dataset is best suited for a snapshot-level view of the pandemic.
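
One way to see which reporting pattern a jurisdiction follows is to check which death fields it fills in. The sketch below assumes fields named `tot_death`, `conf_death`, and `prob_death`, and uses made-up jurisdictions and numbers. It is not how the breakdown above was produced.

```python
def reporting_category(row):
    """Guess a jurisdiction's reporting pattern from which fields are filled."""
    has_conf = row.get("conf_death") is not None
    has_prob = row.get("prob_death") is not None
    if has_conf and has_prob:
        return "confirmed and probable"
    if has_conf:
        return "confirmed only"
    return "total only"

# Hypothetical rows; the field names are assumptions about the schema.
rows = [
    {"state": "AA", "tot_death": 120, "conf_death": 100, "prob_death": 20},
    {"state": "BB", "tot_death": 80, "conf_death": 80, "prob_death": None},
    {"state": "CC", "tot_death": 50, "conf_death": None, "prob_death": None},
]
categories = {r["state"]: reporting_category(r) for r in rows}

# Sanity check on the slide's breakdown: it should cover every jurisdiction.
breakdown = {"confirmed and probable": 33, "confirmed only": 3, "total only": 24}
jurisdictions_covered = sum(breakdown.values())
print(categories, jurisdictions_covered)
```

Even when two jurisdictions land in the same category, their definitions of "confirmed" and "probable" may differ, so the category says how a figure is reported, not that the figures are comparable.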

21 of 31

NCHS Data

22 of 31

NCHS Data

  • The NCHS takes in all death certificates that states submit to the National Vital Statistics System (NVSS) through the Vital Statistics Cooperative Program, then compiles them into count tables.
  • The NCHS officially counts a death as attributed to COVID-19 when COVID-19 is listed either as an underlying cause of death (Part I of a death certificate) or as contributing to the cause of death (Part II of a death certificate).
  • States submit their death certificates on different schedules, which creates substantive lags. Some submit daily; others might submit weekly or monthly.
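
The counting rule above can be sketched as a simple check on each certificate. The certificate layout here (`part_1` and `part_2` lists of ICD-10 codes) is hypothetical, and the example records are invented. U07.1 is the ICD-10 code for COVID-19.

```python
COVID_CODE = "U07.1"  # ICD-10 code for COVID-19

def counts_as_covid_death(cert):
    """True if COVID-19 appears in Part I or Part II of the certificate."""
    in_part_1 = COVID_CODE in cert.get("part_1", [])
    in_part_2 = COVID_CODE in cert.get("part_2", [])
    return in_part_1 or in_part_2

# Hypothetical certificates; the dict layout is an assumption for illustration.
certificates = [
    {"part_1": ["U07.1", "J18.9"], "part_2": ["E11.9"]},  # COVID-19 in Part I
    {"part_1": ["I25.1"], "part_2": ["U07.1"]},           # COVID-19 in Part II
    {"part_1": ["I25.1"], "part_2": ["E11.9"]},           # COVID-19 not listed
]
covid_deaths = sum(counts_as_covid_death(c) for c in certificates)
print(covid_deaths)
```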

Updates: Daily, but data is provided by Week

23 of 31

NCHS Data

  • Provisional COVID-19 Death Counts by Week Ending Date and State

https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Week-Ending-D/r8kw-7aab

Updates: Daily, but data is provided by Week

  • Provisional COVID-19 Death Counts in the United States by County

https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-in-the-United-St/kn79-hsxy
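
Because the weekly file is keyed by week ending date, daily figures from another source need to be rolled up to the same weeks before comparing. A minimal sketch, assuming NCHS weeks end on Saturday and using a hypothetical helper name:

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday(): Monday is 0, Saturday is 5

def week_ending(day):
    """Return the Saturday that closes the week containing `day`.

    Assumes weeks run Sunday through Saturday.
    """
    days_ahead = (SATURDAY - day.weekday()) % 7
    return day + timedelta(days=days_ahead)

# The date of this training call falls in the week ending 2021-02-27.
print(week_ending(date(2021, 2, 25)))
```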

24 of 31

Caveats when using NCHS data

  • NCHS does not require laboratory confirmation to count a death as due to COVID-19.
  • The terms “confirmed” and “probable” used in NCHS communications refer to how COVID-19 is listed on the death certificates, not to the COVID-19 case definition provided by the CSTE. There is no neat mapping between COVID-19 case data and NCHS death data.

25 of 31

Caveats when using NCHS data

  • Data from NCHS are labeled as “provisional” due to delays in the process of submitting a death certificate. The NCHS cautions against comparing recent data across states, since states submit their death certificates at different rates.
  • NCHS data is organized by date of death, while data from The COVID Tracking Project reflects deaths by date of report. Unlike COVID Tracking Project data, historical NCHS data does not exhibit a reporting-associated “death lag”.
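
The difference between the two date conventions shows up when the same deaths are tallied both ways. The records below are invented for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical deaths: the day each occurred and the day it was reported.
deaths = [
    {"died": date(2021, 2, 1), "reported": date(2021, 2, 3)},
    {"died": date(2021, 2, 1), "reported": date(2021, 2, 10)},
    {"died": date(2021, 2, 2), "reported": date(2021, 2, 3)},
]

# Tally by date of death (the NCHS convention).
by_death_date = Counter(d["died"] for d in deaths)
# Tally by date of report (the CTP convention).
by_report_date = Counter(d["reported"] for d in deaths)

for label, tally in (("by death date", by_death_date),
                     ("by report date", by_report_date)):
    print(label, {day.isoformat(): n for day, n in sorted(tally.items())})
```

Both tallies add up to the same total, but the daily curves differ: counting by report date shifts deaths later, and counting by death date means recent days keep growing as late certificates arrive.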

26 of 31

Comparing with CTP data

27 of 31

Plus more!

  • All-cause mortality and excess deaths
  • Demographic breakdown by race and ethnicity, age, and sex
  • Pre-existing conditions
  • Historical death counts
  • Deaths due to Pneumonia, Influenza, and COVID-19

28 of 31

COVID Data Tracker vs NCHS

29 of 31

Our CDC wish list

Documentation

CDC COVID Data Tracker

  • CDC should note that data is unstandardized, and provide definitions of how states use terms (e.g. confirmed, probable)
  • Centralize documentation currently spread across website

NCHS

  • Centralize documentation & label more clearly

Provide an introduction to the two datasets to help users navigate them

30 of 31

CTP Written Resources

Cases 101: bit.ly/c19cases

Deaths 101: bit.ly/c19death101

Confirmed vs. probable deaths: bit.ly/confvprob

Recovered metric: bit.ly/c19recovered

31 of 31

Q&A