1 of 59

OpenHIE

OpenHIE Academy Course 320

Identity Management

February 2019

2 of 59

3 of 59

Figure: record linkage workflow (Database A, Database B → Cleaning and Standardisation → Blocking/Indexing → Record Pair Comparison → Similarity Vector Classification → Matches / Non-Matches / Possible Matches → Clerical Review → Evaluation)

4 of 59

Course Instructors and Contributors

  • Priyanka Sanikommu, MS Health Informatics, SOIC, IUPUI.
  • Jennifer Shivers, Regenstrief Institute, Global Health Advisor.
  • Dr. Shaun Grannis, Vice President Data and Analytics at Regenstrief Institute, Inc.
  • Dr. Richard Stanley, IntraHealth International

5 of 59

Key Audience(s)

  • Health IT Professionals (technologists, implementation leads and business analysts)
  • MOH / HIT Leadership (policy makers and decision makers)
  • Anyone interested in patient identity linking or matching

6 of 59

Course Learning Objectives

By the end of this course, learners will be able to:

  1. Understand the types of challenges that identity management and client record linking and client registries can address (Value of having client linking / client registry)
  2. Understand definitions related to Identity Management and Client Registries
  3. Understand basic features of a client registry
  4. Discuss the different matching algorithms typically used in matching or linking patient records from different systems
  5. Understand typical transactions that a Client Registry should support
  6. Provide examples of Client registry tools
  7. Understand some frequent implementation issues

7 of 59

Section 2 - Identity Management Overview

8 of 59

Patient Identity: History

“… Each person in the world creates a book of life. The book starts with birth and ends with death. Its pages are made up of all the principal events in life. Record linkage is the name given to the process of assembling the pages of this book into one volume. The person retains the same identity throughout the book. Except for advancing age, he is that same person …”

- Dunn, 1946

9 of 59

Client Registry Definition

Sometimes called a Master Patient Index (MPI).

A client registry (CR) receives patient demographic data from multiple sources and matches and links the records that belong to the same patient. It supports the ability to manage patients, monitor outcomes, and conduct case-based surveillance by facilitating information exchange. A client registry:

  • supports the exchange of patient information between disparate systems.
  • holds patient identifiers and may include patient demographic information.
  • links identities across systems.

10 of 59

Patient Matching: Synonyms and Definition

  • “Patient Matching” → “Patient Linkage”
  • “Record Matching” → “Record Linkage”
  • “Identity Management”

Definition: identify records that represent the same entity (patient).

  • Entities are typically individual persons, but can be facilities, health workers, organizations, etc.
  • Records contain fields describing the entity.
  • These fields can include “unique” IDs, names, birth dates, addresses, sex, parents’ names, tribe, telephone numbers, etc.

11 of 59

Client Registry ≠ Shared Health Record

  • A Client Registry provides a unique identifier; it does not implement a Shared Health Record.
  • A shared health record (SHR) may be created for patients using identifiers in the Client Registry.
  • There may exist separate governance, privacy, security, and other requirements for a Shared Health Record.

12 of 59

Basic features of CR

  • Entity matching: identifies duplicate patients
    • Configurable algorithms allow matching schemes to be tailored to the needs of the context
  • Assigns and looks up unique patient identifiers
  • Maintains a central registry of patients and their demographic data
  • Allows connections from diverse point-of-service systems, such as electronic medical record (EMR) systems
  • Links patient records and breaks links

13 of 59

What is in a Client Registry?

What may be stored

  • Patient, system identifiers
  • Demographic information
    • Date of birth
    • Place of birth
    • Patient Names
    • Gender
    • Addresses
    • Marital Status
    • Telephone / email addresses
    • Multiple birth status/order
    • Death status / date
  • Relationships
    • Mother/father/next of kin
  • Facility the patient demographics came from

What is not stored

  • Patient conditions
    • E.g. HIV Status
    • Diagnoses / Chronic Conditions
    • Allergies
  • Encounter data
    • Visit information
    • Appointments
    • Physician information
  • Facility information
    • Beyond the facility a patient is assigned to

This slide shows some of the key attributes that may be included in a client registry. A Client Registry may hold some facility information, because it needs to record where each piece of demographic information came from and to distinguish between different sources of identity.

Some of the attributes that might be in a shared health record are shown on the right side of the slide. This is to help you understand the difference between a client registry and a shared health record: the client registry focuses on attributes that help identify the patient, whereas the shared health record focuses on creating a longitudinal record of the patient’s care.

14 of 59

Section 3 - How does a CR Work?

15 of 59

Types of challenges that a client registry and client record linkage can address:

  1. Deduplication (within a single system)
  2. Linkage across systems
    • To link a patient’s records across EMR / community systems. For example, matching a patient whose name has been recorded differently in different hospital registers.
    • To link lab results for a patient with their EMR data
    • To tabulate health system metrics across systems

16 of 59

Silent Transfers

January - ART Clinic

March - ART Clinic

17 of 59

Multiple Facilities / Client Identities

Sources: ANC Clinic, ART Clinic, Hospital

ID    FN        GN        DOB       NIN
456   BLESSING  MAKUMBE   09-09-84  CF-000-123-456
2001  MAKUMBE   BLESSING  09-08-84  CF- 000-123-456

ID                   FAM      GIV    WARD
340393-302921-1974S  MAKUMBE  BLESS  Kaloleni

18 of 59

Multiple Identities and Monitoring

Sources: ANC Clinic, ART Clinic, Hospital, Data Manager

ID    FN        GN        DOB       NIN
456   BLESSING  MAKUMBE   09-09-84  CF-000-123-456
2001  MAKUMBE   BLESSING  09-08-84  CF- 000-123-456

ID                   FAM      GIV    WARD
340393-302921-1974S  MAKUMBE  BLESS  Kaloleni

District Data Manager: duplicate identities in the monitoring system

19 of 59

Multiple Identities and Monitoring

Sources: ANC Clinic, ART Clinic, Hospital

Linked CR - ID 2345

ID    FN        GN        DOB       NIN
456   BLESSING  MAKUMBE   09-09-84  CF-000-123-456
2001  MAKUMBE   BLESSING  09-08-84  CF- 000-123-456

ID                   FAM      GIV    WARD
340393-302921-1974S  MAKUMBE  BLESS  Kaloleni

District Data Manager: unique identity for the same person

20 of 59

Uniquely Identify Patients in Labs

Sources: Clinic EMR, ART lab

ID   FN        GN       DOB       NIN
456  BLESSING  MAKUMBE  09-09-74  CF-000-123-456
222  BLESSING  M        09-09-74  CF-000-123-456

Two records are not linked

21 of 59

Use cases

Sources: Vaccine clinic 1, Vaccine clinic 2, Vaccine clinic 3

ID   FN        GN       DOB       NIN
456  Blessing  MAKUMBE  09-09-74  CF-000-123-456
248  BLESSING  Aleya    07-10-99  CF-000-123-456

Not registered - Free vaccination program

22 of 59

Identity Management Motivation

  • Clinical information is fragmented across many independent databases using different identifiers
  • Fragmented data creates challenges for uses such as:
    • Treatment, Payment and Operations (TPO)
    • Public Health/Administrative Reporting
    • Outcomes management
    • Vital status determination
    • Research

23 of 59

Section 4 - Patient Matching

24 of 59

Barriers to Accurate Patient Matching

  • Recording Errors
    • Phonetic (“Shaun”, “Sean”, “Shawn”)
    • Typographical (“Smith” → “Snith”, “07” → “01”)
  • Changing Identifiers
    • Last Name (Marriage)
    • Geographic location (home address, etc.)
  • Sharing Identifiers (country-based identification numbers, e.g., SSN in the USA)
  • Identifiers Limited or Unavailable (missing)

There may be other factors unique to your context.

25 of 59

Ideal Identifier Characteristics

  • Unique (e.g., fingerprint, iris, DNA, national ID)
  • Ubiquitous (e.g., name, DOB, sex, eye color)
  • Unchanging (e.g., DOB, sex, given name, DNA)
  • Uncomplicated (e.g., name, DOB, sex)
  • Uncontroversial (e.g., avoid sensitive data)
  • Easily and Inexpensively Accessible

No identifier meets all of these characteristics

26 of 59

Potential Identification solutions

  • Universal patient identifier (UPI)
    • Pros:
      • Can simplify matching and increase matching accuracy
    • Cons:
      • Recording errors
      • Sharing IDs
      • Lost IDs
      • Controversial (in some contexts)

Ideal identifier characteristics:

  • Unique
  • +/- Ubiquitous
  • Unchanging
  • +/- Uncomplicated
  • ❌ Uncontroversial
  • +/- Easy/Inexpensive access

27 of 59

Potential Identification Solutions

  • Biometric Identifiers
    • Examples: fingerprint, voice, retinal, vein scan, facial recognition
    • Pros:
      • Highly specific to an individual (unique?)
    • Cons:
      • Require proprietary hardware for all data generators
      • Privacy/Security concerns

Ideal identifier characteristics:

  • Unique
  • Ubiquitous
  • Unchanging
  • +/- Uncomplicated
  • ❌ Uncontroversial
  • ❌ Easy/Inexpensive access

28 of 59

Potential Identification Solutions

  • Patient matching algorithms:
    • Establish identity using multiple patient attributes
    • Pros:
      • Leverages existing data
      • Does not require biometrics or UPI
    • Cons:
      • Imperfect: false positives/negatives
      • Accuracy dependent on data quality

Ideal identifier characteristics:

  • +/- Unique
  • Ubiquitous
  • +/- Unchanging
  • Uncomplicated
  • ❌ Uncontroversial
  • Easy/Inexpensive access

29 of 59

Patient Matching Terminology

  • True match / True link / True positive: truly matching records declared to be the same entity
  • False match / False link / False positive: truly non-matching records declared to be the same entity
  • True non-match / True non-link / True negative: truly non-matching records not declared to be the same entity
  • False non-match / False non-link / False negative: truly matching records not declared to be the same entity
  • Further matching terminology includes potential pairs/links, blocking/grouping, and field agreement weight/score.

30 of 59

Patient Matching Terminology

                             “Truth”
Matching System Declaration  True Match             True Non-Match
Match                        True Match (TM)        False Match (FM)
Non-Match                    False Non-Match (FNM)  True Non-Match (TNM)

  • “Pos Predictive Value” or “Precision” = TM / (TM + FM)
  • “Neg Predictive Value” = TNM / (TNM + FNM)
  • “Sensitivity” or “Recall” = TM / (TM + FNM)
  • “Specificity” = TNM / (TNM + FM)
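The four accuracy measures on this slide can be computed directly from the counts of true matches, false matches, true non-matches, and false non-matches. The sketch below is illustrative only: the function name and the example counts are assumptions, not figures from the course.

```python
def matching_metrics(tm, fm, tnm, fnm):
    """Compute matching accuracy measures from record-pair counts.

    tm  = true matches        fm  = false matches
    tnm = true non-matches    fnm = false non-matches
    """
    def ratio(numerator, denominator):
        # Return None when a measure is undefined (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "precision": ratio(tm, tm + fm),        # positive predictive value
        "npv": ratio(tnm, tnm + fnm),           # negative predictive value
        "recall": ratio(tm, tm + fnm),          # sensitivity
        "specificity": ratio(tnm, tnm + fm),
    }


# Made-up counts for illustration only.
metrics = matching_metrics(tm=90, fm=10, tnm=880, fnm=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and recall are usually the most informative pair for patient matching, because true non-matches vastly outnumber true matches and push specificity and negative predictive value close to 1.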

31 of 59

Patient Matching Methodologies

Statistical Matching

Deterministic

Probabilistic

Increasing Complexity

Machine Learning

Hybrid Solutions


32 of 59

Deterministic Matching

  • ‘Rules-based’ or ‘Heuristic’
  • Accuracy is highly dependent on presence of discriminating identifiers (national or local ID, etc)
  • Rule-based, e.g., declare a match if there is exact agreement on:
    • National ID + DOB
    • Full Name + Address
    • etc.
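A rule set like the one above can be sketched in a few lines. This is a minimal illustration, assuming records are plain dictionaries; the field names (national_id, dob, full_name, address) and the normalisation step are hypothetical choices, not part of any particular client registry product.

```python
# Each rule is a tuple of fields that must ALL agree exactly.
RULES = [
    ("national_id", "dob"),
    ("full_name", "address"),
]


def normalise(value):
    """Upper-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(str(value).upper().split()) if value else ""


def deterministic_match(record_a, record_b, rules=RULES):
    """Declare a match if any rule has every field present and exactly equal."""
    for rule in rules:
        values_a = [normalise(record_a.get(field)) for field in rule]
        values_b = [normalise(record_b.get(field)) for field in rule]
        if all(values_a) and values_a == values_b:
            return True
    return False


# Example values taken from the earlier "Multiple Facilities" slide.
rec_a = {"national_id": "CF-000-123-456", "dob": "09-09-84", "full_name": "BLESSING MAKUMBE"}
rec_b = {"national_id": "CF-000-123-456", "dob": "09-08-84", "full_name": "MAKUMBE BLESSING"}

print(deterministic_match(rec_a, dict(rec_a)))  # identical records agree on rule 1
print(deterministic_match(rec_a, rec_b))        # a one-digit DOB difference defeats rule 1
```

The second call shows why deterministic accuracy depends so heavily on data quality: a single recording error in one field is enough to miss the match.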

33 of 59

Patient Matching Methodologies

Deterministic/Heuristic

  • Rapid Implementation
  • Simple calculations
  • Relies on accurate and consistent data
  • May not generalize well to other data sets

Probabilistic

  • Complex implementation
  • Computationally intensive
  • More forgiving of data errors
  • Algorithms adapt to data being linked

34 of 59

Fuzzy Match

  • Non-exact agreement, allows for errors:
    • “If last name agrees on first 6 characters then declare agreement”
    • “If birth date is within 1 month, then declare agreement”
  • To loosen agreement, string comparators or phonetic transformation functions may be used:
    • Soundex - Phonetic
    • NYSIIS - Phonetic
    • Levenshtein Edit Distance - Comparator
    • Jaro-Winkler - Comparator
    • Longest Common Subsequence - Comparator
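To make two of the listed functions concrete, here is a minimal sketch of the Levenshtein edit distance (a comparator) and American Soundex (a phonetic transformation). It is written from the standard textbook definitions for illustration; a production system would normally rely on a tested library rather than this code.

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(t) + 1))
    for i, char_s in enumerate(s, start=1):
        current = [i]
        for j, char_t in enumerate(t, start=1):
            cost = 0 if char_s == char_t else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Soundex digit for each consonant group; vowels, H, W and Y have no digit.
SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                           "4": "L", "5": "MN", "6": "R"}.items()
    for letter in letters
}


def soundex(name):
    """Four-character American Soundex code, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters with the same code
        code = SOUNDEX_CODES.get(letter, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the previous code
    return (result + "000")[:4]


print(levenshtein("SMITH", "SNITH"))
print(soundex("Shaun"), soundex("Sean"), soundex("Shawn"))
```

The examples reuse the recording errors from the "Barriers to Accurate Patient Matching" slide: the typographical error is one edit apart, and the three phonetic spellings share one Soundex code.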

35 of 59

Probabilistic/Machine Learning

  • Implements a statistical model for matching
  • A common model is the Fellegi-Sunter maximum likelihood model
  • Model parameters are established using machine learning algorithms such as expectation-maximization (EM), or by bootstrap review
  • Maximum entropy models are also used
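In the Fellegi-Sunter approach, each field contributes a weight based on two probabilities: m, the probability that the field agrees when two records truly match, and u, the probability that it agrees when they do not. The sketch below shows the usual log-ratio scoring. The m and u values are invented for illustration; in practice they would be estimated from the data, for example with EM.

```python
import math

# Hypothetical (m, u) probabilities per field; not estimated from real data.
PARAMS = {
    "family_name": (0.95, 0.01),
    "given_name": (0.90, 0.02),
    "dob": (0.97, 0.003),
    "sex": (0.99, 0.5),
}


def field_weight(m, u, agrees):
    """Agreement weight log2(m/u), or disagreement weight log2((1-m)/(1-u))."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_score(agreement, params=PARAMS):
    """Sum the field weights for one record pair.

    `agreement` maps field name -> True/False (did the field agree?).
    """
    return sum(field_weight(*params[field], agrees)
               for field, agrees in agreement.items())


all_agree = match_score({"family_name": True, "given_name": True, "dob": True, "sex": True})
dob_differs = match_score({"family_name": True, "given_name": True, "dob": False, "sex": True})
print(round(all_agree, 2), round(dob_differs, 2))
```

Note how a field such as sex, which agrees by chance half the time, adds very little weight on agreement, while a highly discriminating field such as date of birth moves the score substantially in either direction.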

36 of 59

Probabilistic Linkage Overview: Human Review Thresholds
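A common way to apply review thresholds is to compare each pair's match score with a lower and an upper cut-off: pairs above the upper threshold are declared matches, pairs below the lower threshold are declared non-matches, and pairs in between are sent for clerical (human) review. The sketch below assumes that scheme; the function name and threshold values are hypothetical and would be tuned during evaluation.

```python
def classify_pair(score, lower_threshold, upper_threshold):
    """Classify a record pair by its match score using two thresholds."""
    if lower_threshold > upper_threshold:
        raise ValueError("lower_threshold must not exceed upper_threshold")
    if score >= upper_threshold:
        return "match"
    if score < lower_threshold:
        return "non-match"
    return "possible match"  # queued for clerical review


# Illustrative thresholds only.
LOWER, UPPER = 0.0, 10.0
for score in (18.4, 4.2, -7.5):
    print(score, classify_pair(score, LOWER, UPPER))
```

Moving the two thresholds closer together reduces the clerical review workload at the cost of more false matches and false non-matches; setting them equal removes human review entirely.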

37 of 59

Section 5 - Architecture

38 of 59

Central Registry

  • Contains patient identifiers with pointers or links to point-of-care sources.
  • No clinical data is contained in the repository.
  • Contributing data sources send patient demographics; matching can be performed in real time or near real time.

Name            Birth Date   Sex  Source
Smith, Jane     12-Oct-1978  F    Vaccine Clinic
Jones, Fred L   07-Feb-1982  M    Hospital A
Smith, Jayne    12-Oct-1980  F    Clinic B
Williams, Mary  20-Dec-1985  F    Clinic A
Mary, Williams  20-Dec-1986       Hospital B
Jones, Freddy   01-Feb-1988  M    Clinic A

39 of 59

Patient Registry

Clinic A

Jill Receives Immunizations @ Vaccine Clinic

Data recorded in an immunization registry

Jill Receives Immunizations and other care (measurements, labs, diagnoses, etc) @ Clinical Practice

Data recorded in an EMR

Immunization Registry

40 of 59

Client Registry

Immunization Registry

  Patient ID: 123LMNOP
  Name: Minnie Mouse
  DOB: 01/01/2011
  PAN: N/A
  Address: S.V.Road
  City: Disney
  State: Land
  ZIP: 400 056

Clinic A

  Patient ID: 6789XYZ
  Name: Minnie Mouse
  DOB: 01/01/2011
  PAN: 123-45-6789
  Address: S.V.Road
  City: Disney
  State: Land
  ZIP: 400 056

Client Registry / Central Patient Index

  Global ID: 45678
  Name: Minnie Mouse
  Lots of Demographics..
  MRF1 ID: OU81247
  MRF2 ID: 4564356
  IMM REG ID: 123LMNOP
  CLINIC A ID: 6789XYZ
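The cross-reference at the heart of this slide can be pictured as a two-way lookup between each source system's local ID and the registry's global ID. The following is a minimal in-memory sketch using the example values from the slide; the class and method names are hypothetical, and a real client registry would add persistence, auditing, and access control.

```python
from collections import defaultdict


class CrossReference:
    """Toy two-way index between (source, local ID) pairs and global IDs."""

    def __init__(self):
        self._global_of = {}                 # (source, local_id) -> global_id
        self._locals_of = defaultdict(set)   # global_id -> {(source, local_id)}

    def link(self, global_id, source, local_id):
        """Attach a source identifier to a global identity (re-linking if needed)."""
        self.unlink(source, local_id)
        self._global_of[(source, local_id)] = global_id
        self._locals_of[global_id].add((source, local_id))

    def unlink(self, source, local_id):
        """Break the link for a source identifier, if one exists."""
        global_id = self._global_of.pop((source, local_id), None)
        if global_id is not None:
            self._locals_of[global_id].discard((source, local_id))

    def global_id(self, source, local_id):
        return self._global_of.get((source, local_id))

    def local_ids(self, global_id):
        return sorted(self._locals_of.get(global_id, ()))


xref = CrossReference()
xref.link("45678", "IMM REG", "123LMNOP")
xref.link("45678", "CLINIC A", "6789XYZ")

print(xref.global_id("CLINIC A", "6789XYZ"))
print(xref.local_ids("45678"))
```

With this structure, a system that only knows its own local ID can find the global identity and, from there, the identifiers used for the same person in every other connected system.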

41 of 59

Client Registry

Clinic B

Immunization Registry

Hospital A

Hospital B

Clinic C

Central Patient Registry

Clinic A

42 of 59

Section 6 - Governance and Implementation Considerations

43 of 59

Matching Algorithm Implementation Recommendations

  • There is a spectrum of matching options, from simple matching (exact matches on key fields) to complex matching (configuring an algorithm that establishes weights for particular data values).
  • An overly simplistic algorithm may result in a higher percentage of duplicate records.

44 of 59

Matching Algorithm Implementation

Recommended Process:

  1. Perform initial data analysis to identify candidate fields for matching and blocking
  2. Validate candidate fields
  3. Identify blocking schemes
  4. Configure / parameterize matching algorithm (deterministic or probabilistic)
  5. Evaluate results and tune the algorithms.
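Blocking (step 3) limits comparisons to record pairs that share a cheap-to-compute key, instead of comparing every record with every other record. The sketch below assumes one possible scheme, the first three letters of the family name plus sex. Both the scheme and the field names are illustrative assumptions; the sample names echo the Central Registry slide.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    """Hypothetical blocking scheme: family-name prefix plus sex."""
    return (record["family_name"][:3].upper(), record["sex"])


def candidate_pairs(records, key=blocking_key):
    """Return ID pairs for records that fall in the same block."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key(record)].append(record["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(ids, 2))
    return pairs


records = [
    {"id": 1, "family_name": "Smith", "given_name": "Jane", "sex": "F"},
    {"id": 2, "family_name": "Jones", "given_name": "Fred L", "sex": "M"},
    {"id": 3, "family_name": "Smith", "given_name": "Jayne", "sex": "F"},
    {"id": 4, "family_name": "Williams", "given_name": "Mary", "sex": "F"},
    {"id": 5, "family_name": "Jones", "given_name": "Freddy", "sex": "M"},
]

pairs = candidate_pairs(records)
print(pairs)  # far fewer than the 10 pairs needed without blocking
```

The trade-off is that any true match whose blocking fields disagree is never compared, which is why several blocking schemes are often run and their candidate pairs combined.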

45 of 59

Client Registry Governance

Compliance and mandate:

  • Patient privacy (PII): Establish a minimum data set
  • Data locality
  • Operational governance and management
  • User Roles and Responsibilities
  • System Data Access / Use
  • Data Provenance
  • Security: System and user authentication, authorization
  • Auditability and Traceability

46 of 59

Key Governance Considerations

  • Terminologies
  • Application Programming Interface (APIs)
  • Algorithms and decision rules
  • Governance of customization
  • Governance of deployment

47 of 59

Implementation decisions

  • Put your emphasis on high quality data rather than a complex algorithm.
  • Matching accuracy can be degraded by poor-quality identifiers.

“Do not let perfect be the enemy of good enough.”

48 of 59

Conditions for Implementation

What are the conditions required for a Client Registry to be implementable?

Asynchronous not synchronous

  • Requires limited connectivity, not real-time.
  • Electronic client data, not paper client records or an EMR
  • Occasional electricity

Incremental deployment

  • Regional or a network of EMRs before national

49 of 59

Where to Start

  1. How are patients identified today?
    1. Is there a unique ID?
    2. Is patient demographic data electronically captured?
  2. What use cases do you want to start with?
    • Do those require a patient registry?
  3. What are some examples of success?
    • See the OpenHIE
    • Getting started guide - reference here ?
  4. Governance
  5. Think about the privacy and security needs for your use case

50 of 59

Section 7 - Examples

51 of 59

Example CR Landing Page

52 of 59

Patient Record View

53 of 59

Standards

OpenHIE Standards and Profiles describes the data exchange standards.

PDQ - The Patient Demographics Query (PDQ) Integration Profile lets applications query a central patient information server and retrieve a patient’s demographic information.

PDQm - The Patient Demographics Query for Mobile (PDQm) Profile defines a lightweight RESTful interface to a patient demographics supplier leveraging technologies readily available to mobile applications and lightweight browser based applications. The functionality is identical to the PDQ Profile described in the ITI TF-1:8.

  • https://wiki.ohie.org/display/documents/OpenHIE+Standards+and+Profiles
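Because PDQm is a RESTful profile built on HL7 FHIR, a demographics query is in essence an HTTP GET on the Patient resource with search parameters. The sketch below only builds such a query URL and does not send it. The base URL is a placeholder, the helper name is hypothetical, and the parameter names follow the standard FHIR Patient search parameters; check the PDQm profile for the exact set a given server supports.

```python
from urllib.parse import urlencode


def pdqm_query_url(base_url, **criteria):
    """Build a FHIR Patient search URL from demographic criteria.

    Empty criteria are dropped so they do not over-constrain the search.
    """
    params = {name: value for name, value in criteria.items() if value}
    return f"{base_url.rstrip('/')}/Patient?{urlencode(params)}"


# Placeholder server address; replace with the client registry's FHIR endpoint.
url = pdqm_query_url(
    "https://cr.example.org/fhir",
    family="MAKUMBE",
    given="BLESSING",
    birthdate="1984-09-09",
)
print(url)
```

A point-of-service system would send this request with appropriate authentication and receive a FHIR Bundle of candidate Patient resources to choose from.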

54 of 59

Message Standards

PIX - The Patient Identifier Cross Referencing (PIX) Integration Profile supports the cross-referencing of patient identifiers from multiple Patient Identifier Domains by transmitting patient identity information from an identity source to the Client Registry.

PMIR - The Patient Master Identity Registry (PMIR) Profile supports the creating, updating and deprecating of patient master identity information about a subject of care, as well as subscribing to changes to the patient master identity, using the HL7 FHIR standard resources and RESTful transactions.

55 of 59

References

OpenHIE Architecture Specification

OpenHIE Identity Management Community - join the community of people engaging to provide input on tools and standards and address implementation issues.

OpenHIE Community Discussion Forum - Ask questions and engage with community members through on-line discussion

OpenHIE Standards and Profiles

OpenHIE Reference Software

56 of 59

Bibliography - Theory

  • Fellegi IP, Sunter SB. (1969). A Theory for Record Linkage. Journal of the American Statistical Association, 64(328), 1183-1210.
  • Dunn HL. (1946) Record Linkage. Am J Public Health. 36, 1412-1416.
  • Newcombe HB. (1988) Handbook of Record Linkage, Methods for Health and Statistical Studies, Administration, and Business. Oxford University Press.
  • Newcombe HB, Kennedy JM, Axford SJ, James AP. (1959) Automatic Linkage of Vital Records. Science, 130, 954-959.
  • Gill, L., Methods for Automatic Record Matching and Linking and their use in National Statistics. Her Majesty’s Stationery Office, Norwich, 2001.
  • Porter E, Winkler W. Approximate String Comparison and its Effect on an Advanced Record Linkage System. Record Linkage Techniques--1997: Proceedings of an International Workshop and Exposition. National Academy Press, Washington DC 1999.
  • Public Health Informatics Institute. The unique records portfolio. Decatur, GA: Public Health Informatics Institute, 2006.

57 of 59

Bibliography: Applications and Research (1)

  • Christen P. Febrl: A freely available record linkage system with a graphical user interface. Submitted to the Australasian Workshop on Health Data and Knowledge Management (HDKM), Wollongong, January 2008.
  • Potosky A, Riley G, Lubitz J, et al. Potential for Cancer Related Health Services Research Using a Linked Medicare-Tumor Registry Database. Medical Care 1993;31(8):732-748.
  • Whalen D, Pepitone A, Graver L, Busch JD. Linking Client Records from Substance Abuse, Mental Health and Medicaid State Agencies. SAMHSA Publication No. SMA-01-3500. Rockville, MD: Center for Substance Abuse Treatment and Center for Mental Health Services, Substance Abuse and Mental Health Services Administration, July 2000.
  • Liu S, Wen SW. Development of Record Linkage of Hospital Discharge Data for the Study of Neonatal Readmission. Chronic Diseases in Canada 1999; 20(2):77-81.
  • Pates R, Scully W, et al. Adding Value to Clinical Data by Linkage to a Public Death Registry. MedInfo 2001;10(Pt 2):1384-8

58 of 59

Bibliography: Applications and Research (2)

  • Lynch BT, Arends WL. Selection of a surname coding procedure for the SRS record linkage system. Washington, DC: US Department of Agriculture, Sample Survey Research Branch, Research Division, 1977.
  • Newman T, Brown A. Use of Commercial Record Linkage Software and Vital Statistics to Identify Patient Deaths. J Am Med Inform Assoc. 1997 May-June; 4 (3): 233-237.
  • Schadow G, McDonald CJ Maintaining Patient Privacy in a Large Scale Multi-Institutional Clinical Case Research Network. AMIA Proceedings (2002 Submission).
  • Sideli R, Friedman C. Validating Patient Names in an Integrated Clinical Information System. Symposium on Computer Applications in Medical Care, Washington, DC. November 1991:588-592.
  • OpenHIE Standards and Profiles - Documents. OpenHIE Wiki. (n.d.). https://wiki.ohie.org/display/documents/OpenHIE+Standards+and+Profiles.

59 of 59

Bibliography: Applications and Research (3)

  • Miller PL, Frawley SJ, Sayward FG. IMM/Scrub: a domain-specific tool for the deduplication of vaccination history records in childhood immunization registries. Computers and Biomedical Research 2000;33:126–143.
  • Salkowitz SM, Clyde S. De-duplication technology and practices for integrated child-health information systems. Decatur, GA: All Kids Count, Public Health Informatics Institute, 2003.
  • Van Den Brandt PA, Schouten LJ, Goldbohm RA, Dorant E, Hunan PMH. Development of a record linkage protocol for use in the Dutch Cancer Registry for epidemiological research. Int J Epidemiol 1990; 19:553-8.
  • Grannis SJ, Overhage JM, McDonald CJ. Analysis of Identifier Performance Using a Deterministic Linkage Algorithm. Proc AMIA Symp 2002:305-9.
  • Grannis SJ, Overhage JM, McDonald CJ. Analysis of a Probabilistic Record Linkage Technique without Human Review. In: Proceedings of American Medical Informatics Association Fall Symposium; 2003; Washington, D.C.; 2003.
  • Integrating the Healthcare Enterprise. (2006) Patient Identifier Cross-Reference (PIX) and Patient Demographic Query (PDQ) HL7 v3 Transaction Updates. Available at: http://www.ihe.net/Technical_Framework/upload/IHE_ITI_TF_Suppl_PIXPDQ_HL7v3_PC_2006_08_15.pdf
