Clinical Data Analysis Using OpenAI API
Korean Genome Organization | 27 Sep 2025
Seunggeun Lee
Clinical Data Analysis using OpenAI API
Contents
2
2
Clinical Data Analysis using OpenAI API
1. Understanding APIs
3
3
Clinical Data Analysis using OpenAI API
1. Understanding APIs: What is an API
4
4
Client
Server
Request
Output
Clinical Data Analysis using OpenAI API
https://www.oracle.com/kr/cloud/cloud-native/api-management/what-is-api/
1. Understanding APIs: Key Functions of an API
5
5
Clinical Data Analysis using OpenAI API
1. Understanding APIs: What is the OpenAI API
6
6
Clinical Data Analysis using OpenAI API
Provider | API Name | Main Models |
OpenAI | OpenAI API | GPT-5 |
Gemini API | Gemini 2.5 pro | |
Anthropic | Claude API | Claude Opus 4.1 / Claude Sonnet 4 |
xAI | Grok API | Grok‑4 |
Major Generative AI APIs
1. Understanding APIs: Key Concepts - with Practical Understanding(1)
7
7
Clinical Data Analysis using OpenAI API
→ OpenAI also provides embedding models.
2. Embeddings
https://platform.openai.com/docs/guides/text
https://platform.openai.com/docs/guides/embeddings
1. Understanding APIs: Key Concepts - with Practical Understanding(2)
8
8
Clinical Data Analysis using OpenAI API
3. Tokens
→ Misunderstanding tokens can lead to:
Unexpected charges, input length errors, or incomplete outputs.
https://platform.openai.com/tokenizer
1. Understanding APIs: OpenAI API Pricing
Graduate School of Data Science Master’s Thesis
9
9
Clinical Data Analysis using OpenAI API
1. Understanding APIs: How to use OpenAI API
Graduate School of Data Science Master’s Thesis
10
10
Clinical Data Analysis using OpenAI API
→ You won’t be able to view it again later
1. Understanding APIs: How to use OpenAI API
Graduate School of Data Science Master’s Thesis
11
11
Clinical Data Analysis using OpenAI API
https://platform.openai.com/chat/edit?models=gpt-4o
[Python Example with OpenAI API]
[OpenAI Platform Example with OpenAI API]
2. Advanced Topics
12
12
Clinical Data Analysis using OpenAI API
2. Advanced Topics: Retrieval-Augmented Generation (RAG)
Graduate School of Data Science Master’s Thesis
13
13
Clinical Data Analysis using OpenAI API
https://www.bentoml.com/blog/building-rag-with-open-source-and-custom-ai-models
2. Advanced Topics: Tool call
Graduate School of Data Science Master’s Thesis
14
14
Clinical Data Analysis using OpenAI API
https://python.langchain.com/docs/concepts/tool_calling/
2. Advanced Topics: What is an Agent?
15
Step 1: Doctor uploads Clinical notes
Step2: LangChain applies prompt template
Step 3: LLM processes unstructured data
Step 4: Output Summary
A: Clinical data
(EHR, labs, demographics)
B: Summarize Clinical Notes
(API call)
C. Extract Diagnoses
(API call)
If Low Risk:
Proceed to Discharge Report
Generate Discharge Report
(LLM API)
B. Extract Medications
(API call)
D. Extract Lab Finding
(API call)
E. Risk Stratification
If High Risk,
Generate Alert for Physician
If Medium Risk, Request Additional Test
Re-check Results
Final Discharge Summary
2. Advanced Topics: Local LLM
Graduate School of Data Science Master’s Thesis
16
16
Clinical Data Analysis using OpenAI API
2. Tutorial (1): Clinical Note Analysis
Graduate School of Data Science Master’s Thesis
17
17
Clinical Data Analysis using OpenAI API
2. Tutorial (1): Understanding Clinical Notes
Graduate School of Data Science Master’s Thesis
18
18
Clinical Data Analysis using OpenAI API
https://www.mindbowser.com/how-to-improve-efficiency-when-writing-clinical-notes-in-ehr/
2. Tutorial (1): Styles and Characteristics of Clinical Notes
Graduate School of Data Science Master’s Thesis
19
19
Clinical Data Analysis using OpenAI API
#. T2DM since 2018
PHx of CKD stage 3b (baseline Cr ~1.9)
C-peptide 2.2 (2022.1), GAD Ab (-)
Started metformin 2024.6 → lactic acidosis + hypoglycemia → 중단
AG metabolic acidosis on admission, resolved with IVF
#. HTN
BP stable on admission (90/54), later normalized with volume repletion
Amlodipine maintained
#. CKD
Cr 2.3 on admission (baseline ~1.9), improved to 1.7 by discharge
eGFR 32 → no dialysis needed
#. Depression
Sertraline 재시작, no suicidal ideation
Mood improved as delirium resolved
History of Present Illness:
___ female with history of T2DM, CKD stage 3, hypertension, and osteoarthritis who presented from skilled nursing facility with nausea, poor oral intake, and altered mental status per caregiver report.
Patient had reportedly been having progressive fatigue over the past week….
….
Past Medical History:
1. Type 2 Diabetes Mellitus
2. Chronic Kidney Disease Stage 3b (baseline Cr ~1.9)
3. Hypertension
4. Osteoarthritis (knees, spine)
5. Depression
….
Social History:
…
Family History:
Mother - diabetes, dementia
Father - unknown
Physical Exam on Admission:
Vitals: T 97.1 BP 90/54 HR 96 RR 20 SpO2 94% RA
…
Hospital Course:
# Metformin-Associated Lactic Acidosis:
Patient presented with lethargy, hypoglycemia, and high AG acidosis. Noted to have renal dysfunction and was recently started on metformin. Suspected metformin-associated lactic acidosis (MALA)....
# T2DM:
Initially hypoglycemic, requiring D50 and glucose monitoring. All antihyperglycemics held initially. Later transitioned to low-dose basal inslin.
# CKD:
Known baseline Cr ~1.9, presented with Cr 2.3. Likely prerenal component due to volume depletion. IVF improved renal function. Electrolytes monitored.
…
Discharge Medications:
1. Lantus 8 units SC QHS
2. Sertraline 50 mg PO QAM
3. Calcium carbonate 500 mg PO BID
…
Clinical Note Example 1
Clinical Note Example 2
2. Tutorial (1): The Importance of Clinical Note Analysis
Graduate School of Data Science Master’s Thesis
20
20
Clinical Data Analysis using OpenAI API
Examples of Tasks via OpenAI API
Drug: Prednisone, Symptom: Melena
“Melena after Pd 7.5mg qd” → Possible ADR
“After starting metformin, patient developed nausea” → Time-linked causality
Convert fragmented notes into structured summaries
Detect worsening/improving symptoms over time
“Likely due to statin” → Infer drug-outcome relationship
Tasks
Examples
2. Tutorial (2): Building Gene Embeddings
21
Clinical Data Analysis using OpenAI API
2. Tutorial (2): What is an embedding?
22
Clinical Data Analysis using OpenAI API
An embedding is a representation of information into numbers in a vector space that captures the original meaning.
Objects represented by similar vectors share more semantic meaning and are closer together - e.g. “liver” is closer to “lung” than it is to “bicycle.”
Embeddings are useful for tasks in which similarity or relationships must be measured, such as search and clustering.
2. Tutorial (2): What is a gene embedding?
23
Clinical Data Analysis using OpenAI API
Gene2vec: distributed representation of genes based on co-expression�(Du, J. et al, 2019)
2. Tutorial (2): Make gene-embedding using GPT (GenePT)
24
Clinical Data Analysis using OpenAI API
https://www.biorxiv.org/content/10.1101/2023.10.16.562533v2