Characterizing Information Seeking Events in Health-Related Social Discourse
Omar Sharif, Madhusudan Basak, Tanzia Parvin, Ava Scharfstein, Alphonso Bradham, Jacob T. Borodovsky, Sarah E. Lord, Sarah M. Preum
Motivation
I just had knee surgery about a week ago, and my pain meds (Gabapentin) are not cutting it! I am having sleep troubles and getting pretty anxious about recovery. I am considering Kratom. Any suggestions about how much I can take per day and how often?
Challenges
1. Difficult to identify relevant events for OUD treatment.
2. Hard to annotate domain-specific, complex data.
3. Lack of prior work in computational health.
Contributions
Resource
Social Impact
Benchmarking
TREAT-ISE: Treatment Information-Seeking Event Dataset
Information-Seeking Events
Event Type | Definition |
Accessing MOUD (AM) | Events related to accessing MOUD (e.g., insurance, pharmacy). |
Taking MOUD (TM) | Events related to the timing, dosage, and frequency of taking MOUD. |
Experiencing Psychophysical Effects (EP) | Events related to concerns about potential physical and/or psychological effects during recovery. |
Relapse (RL) | Events about relapsing during recovery. |
Tapering MOUD (TP) | Events about reducing or quitting MOUD. |
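A single post can carry several event types at once (the sample excerpt in this deck is labeled with both TM and TP), so the labels can be represented as a multi-hot vector. The following is a minimal sketch, not the dataset's actual format: the class ordering, the "OTHER" entry (taken from the overprediction table later in the deck), and the helper names `encode_events` and `decode_events` are all assumptions made for illustration.

```python
# Event-type abbreviations from the table above.
# The ordering and the "OTHER" entry are assumptions for this sketch.
EVENT_TYPES = ["AM", "TM", "EP", "RL", "TP", "OTHER"]


def encode_events(events):
    """Return a multi-hot vector with a 1 at the position of each event present."""
    unknown = set(events) - set(EVENT_TYPES)
    if unknown:
        raise ValueError(f"unknown event types: {sorted(unknown)}")
    return [1 if event in events else 0 for event in EVENT_TYPES]


def decode_events(vector):
    """Map a multi-hot vector back to the list of event abbreviations."""
    return [event for event, flag in zip(EVENT_TYPES, vector) if flag]


# A post asking about both dosage and tapering gets two active labels.
print(encode_events(["TM", "TP"]))  # [0, 1, 0, 0, 1, 0]
```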
TREAT-ISE Development Steps
Figure: Information-seeking event dataset development steps.
TREAT-ISE Statistics
Table: Sample data excerpt with titles, posts, and labels (shortened and paraphrased as per IRB guidelines).
Table: Summary of different classes in TREAT-ISE.
Title | Post | Events |
Looking for suboxone guidance? | I take 1-2 mg subs per day which is a decrease from the original dose of 8mg. Just looking for a plan of action in which to stick with to eventually get off completely. | Taking MOUD (TM), Tapering (TP) |
Benchmarking: Methods & Results
Methods
Experimental and Evaluation Setup
Results
Table: Performance comparison of non-transformer and transformer models on TREAT-ISE.
Method | Classifier | Precision | Recall | F1-Score |
Non-transformer Baselines | LR | 0.653 | 0.597 | 0.593 |
Non-transformer Baselines | NBSVM | 0.592 | 0.662 | 0.602 |
Non-transformer Baselines | FastText | 0.715 | 0.690 | 0.624 |
Non-transformer Baselines | BiGRU | 0.628 | 0.693 | 0.702 |
Transformer Baselines | BERT | 0.809 | 0.679 | 0.733 |
Transformer Baselines | RoBERTa | 0.755 | 0.768 | 0.757 |
Transformer Baselines | ELECTRA | 0.779 | 0.731 | 0.748 |
Transformer Baselines | XLNet | 0.775 | 0.780 | 0.774 |
Transformer Baselines | MPNet | 0.768 | 0.740 | 0.751 |
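The slides do not state how precision, recall, and F1 are averaged across event types. As a hedged illustration only, the sketch below computes per-class scores from multi-hot labels and then macro-averages them; macro averaging is an assumption here, and the function names and toy data are hypothetical. Under macro averaging, F1 is the mean of the per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall.

```python
def per_class_prf(y_true, y_pred, k):
    """Precision, recall, and F1 for class index k over multi-hot label rows."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] and p[k])
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t[k] and p[k])
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] and not p[k])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_prf(y_true, y_pred):
    """Unweighted mean of per-class precision, recall, and F1 (assumed scheme)."""
    n_classes = len(y_true[0])
    scores = [per_class_prf(y_true, y_pred, k) for k in range(n_classes)]
    return tuple(sum(s[i] for s in scores) / n_classes for i in range(3))


# Toy two-class example, not data from TREAT-ISE.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(macro_prf(y_true, y_pred))  # (0.75, 0.75, 0.75)
```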
Results
Table: Performance of ChatGPT (GPT-3.5) on TREAT-ISE. Abbreviations: ZS-S/ZS-L = zero-shot (short/long), FS-S/FS-L = few-shot (short/long), CoT = chain-of-thought prompting.
Method | Classifier | Precision | Recall | F1-Score |
Best model | XLNet | 0.775 | 0.780 | 0.774 |
ChatGPT Baselines | ChatGPT (ZS-S) | 0.668 | 0.407 | 0.433 |
ChatGPT Baselines | ChatGPT (ZS-L) | 0.687 | 0.550 | 0.581 |
ChatGPT Baselines | ChatGPT (FS-S) | 0.497 | 0.824 | 0.609 |
ChatGPT Baselines | ChatGPT (FS-L) | 0.511 | 0.818 | 0.620 |
ChatGPT Baselines | ChatGPT (CoT) | 0.559 | 0.764 | 0.631 |
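The actual prompts are not shown in these slides. To make the difference between the three prompting strategies concrete, here is a purely hypothetical sketch of how zero-shot, few-shot, and chain-of-thought prompts could be assembled from the event definitions. The wording, the `build_prompt` helper, and its parameters are assumptions, and the short/long variants from the table are not modeled.

```python
# Event definitions paraphrased from the event-type table in this deck.
EVENT_DEFINITIONS = {
    "AM": "accessing MOUD (insurance, pharmacy)",
    "TM": "timing, dosage, or frequency of taking MOUD",
    "EP": "concern about physical or psychological effects during recovery",
    "RL": "relapsing during recovery",
    "TP": "reducing or quitting MOUD",
}

MODES = ("zero-shot", "few-shot", "cot")


def build_prompt(post, mode="zero-shot", examples=()):
    """Assemble a hypothetical prompt; the wording is illustrative, not the authors'."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    lines = ["Identify every information-seeking event type in the post.",
             "Event types:"]
    lines += [f"- {code}: {desc}" for code, desc in EVENT_DEFINITIONS.items()]
    if mode == "few-shot":
        # Few-shot: prepend labeled demonstrations before the target post.
        for example_post, example_labels in examples:
            lines += [f"Post: {example_post}",
                      f"Events: {', '.join(example_labels)}"]
    if mode == "cot":
        # Chain-of-thought: ask for intermediate reasoning before the answer.
        lines.append("Reason step by step before giving the final list of event codes.")
    lines += [f"Post: {post}", "Events:"]
    return "\n".join(lines)


demo = [("How do I get off subs completely?", ["TP"])]
print(build_prompt("Does my insurance cover suboxone?", "few-shot", demo))
```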
Results
Table: Classwise performance for treatment information-seeking event detection.
Takeaways from Results
Ablation Studies with ChatGPT and XLNet
Ablation Studies
1. ChatGPT struggles more on long samples.
2. ChatGPT confuses event types with one another more often.
3. ChatGPT overpredicts more often.
ChatGPT Overpredicts
Figure: Classwise overprediction ratio (#false positives / #predicted positives) for ChatGPT with CoT prompts and the XLNet model.
Method | Accessing MOUD (AM) | Taking MOUD (TM) | Tapering (TP) | Experiencing Psychophysical Effects (EP) | Relapse (RL) | Other |
ChatGPT (GPT-3.5) | 36/96 (0.375) | 166/323 (0.513) | 103/227 (0.453) | 135/267 (0.505) | 32/122 (0.262) | 44/74 (0.594) |
XLNet | 12/75 (0.160) | 35/165 (0.212) | 21/139 (0.151) | 92/227 (0.405) | 19/155 (0.122) | 9/33 (0.27) |
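The overprediction ratio defined in the caption, #false positives / #predicted positives, is the complement of per-class precision. A minimal sketch that recomputes it from the counts in the table above; the dictionary layout and the `overprediction_ratio` helper are illustrative choices, and the printed values can differ from the slide in the third decimal place because the slide values appear to be truncated rather than rounded.

```python
# (false positives, predicted positives) per class, copied from the table above.
COUNTS = {
    "ChatGPT (CoT)": {"AM": (36, 96), "TM": (166, 323), "TP": (103, 227),
                      "EP": (135, 267), "RL": (32, 122), "Other": (44, 74)},
    "XLNet": {"AM": (12, 75), "TM": (35, 165), "TP": (21, 139),
              "EP": (92, 227), "RL": (19, 155), "Other": (9, 33)},
}


def overprediction_ratio(false_positives, predicted_positives):
    """Share of a model's positive predictions for a class that are wrong."""
    if predicted_positives == 0:
        return 0.0
    return false_positives / predicted_positives


RATIOS = {
    model: {cls: overprediction_ratio(fp, n) for cls, (fp, n) in per_class.items()}
    for model, per_class in COUNTS.items()
}

for model, per_class in RATIOS.items():
    print(model, {cls: round(ratio, 3) for cls, ratio in per_class.items()})
```

With these counts, ChatGPT's ratio is higher than XLNet's in every class, which matches the "ChatGPT Overpredicts" slide title.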
ChatGPT Struggles on Long Samples
Figure: Correlation between sample length and the frequency of correct and wrong predictions.
ChatGPT Confuses between Events
Table: Confusion mapping of ChatGPT (CG) with CoT approach and XLNet (XL) model.
Key Takeaways and Future Work
Acknowledgements
This research is partially supported by a P30 Center of Excellence grant from the National Institute on Drug Abuse (NIDA), P30DA029926.
We thank our Center for Technology and Behavioral Health (CTBH) colleagues for their guidance and insightful suggestions.
Thank You!