BACKGROUND
RESULTS
Efficacy of ChatGPT vs. Cochrane Summaries on Hepatitis B: A Readability Study
Andre Ho¹, Angelo Cadiente¹, Jamie Chen¹, Amber W. Chan¹, Andrew S. Boxer²
¹ Hackensack Meridian School of Medicine, 123 Metro Blvd, Nutley, New Jersey 07110, USA
² Hackensack University Medical Center, 799 Bloomfield Avenue, Suite 111, Verona, NJ 07044
LIMITATIONS
REFERENCES
Search | Cochrane Library. (n.d.). Gastroenterology & Hepatology in Cochrane Topic. Retrieved October 1, 2023, from https://www.cochranelibrary.com/
Each summary was also evaluated by two blinded, independent graders on a 5-point scale for accuracy and adherence to the abstract, with their combined grades compared between datasets.
METHODS
| Metrics & Grades | Cochrane Plain Text Summaries | ChatGPT-3.5 Generated Summaries | P-Value |
| --- | --- | --- | --- |
| Flesch-Kincaid Reading Ease | 23.81 (11.71) | 23.16 (9.93) | 0.816 |
| Flesch-Kincaid Grade Level | 14.74 (1.70) | 14.79 (1.89) | 0.910 |
| Gunning Fog Score | 17.53 (1.97) | 18.00 (2.16) | 0.373 |
| SMOG Index | 12.69 (1.40) | 13.13 (1.52) | 0.249 |
| Coleman-Liau Index | 17.02 (1.99) | 17.03 (2.00) | 0.985 |
| Automated Readability Index | 14.55 (1.72) | 14.41 (2.62) | 0.797 |
| Summative Grade | 3.79 (0.87) | 4.34 (0.61) | 0.00575 |
Table 1: Mean (standard deviation) of readability metrics and summative grades for Cochrane Plain Text Summaries and ChatGPT-3.5 generated summaries.
Cochrane Library: 31 abstracts tagged “Hepatitis B”
ChatGPT-3.5 summaries compared with corresponding Cochrane Plain Text Summary
Readability
Summarized with ChatGPT-3.5 (September 25 Version)
Assessed with 6 metrics, each estimating the amount of formal education required to understand a given response
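As an illustration of how two of the six metrics are derived, the sketch below computes the Flesch Reading Ease score and the Flesch-Kincaid Grade Level from word, sentence, and syllable counts using their standard published formulas. The poster does not state which tool produced its scores, so the helper names (`count_syllables`, `text_counts`) and the vowel-group syllable heuristic are assumptions for illustration only; dedicated readability tools count syllables differently and will give somewhat different values.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count vowel groups (heuristic, an assumption)."""
    w = word.lower()
    n = len(re.findall(r"[aeiouy]+", w))
    # Treat a trailing silent "e" as non-syllabic, except in "-le" endings.
    if w.endswith("e") and not w.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_counts(text):
    """Return (words, sentences, syllables) for a plain-text passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words, sentences, syllables):
    """Higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Approximate US school grade level needed to understand the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

For a passage stored in a string `summary`, `flesch_kincaid_grade(*text_counts(summary))` gives an estimated grade level. Longer sentences and more syllables per word raise the grade level and lower the reading ease score.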
Statistical Analysis
Two-tailed t-test to compare ChatGPT-3.5 generated summaries & Cochrane Plain Text Summaries
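The comparison can be sketched from the summary statistics in Table 1. The poster does not say which t-test variant or software was used, so this minimal example makes two assumptions: each group contains 31 summaries (one per abstract), and the unequal-variance (Welch) form is applied. The function names are hypothetical, and the p-value is obtained by numerically integrating the t density with the standard library rather than by calling a statistics package.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's t-test computed from summary statistics."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=10_000):
    """Two-tailed p-value: 1 minus the area under the density on [-|t|, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = 2 * a / steps  # Simpson's rule needs an even number of steps
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(-a + i * h, df)
    return 1 - total * h / 3


# Summative Grade row of Table 1; n = 31 per group is an assumption.
t, df = welch_t(3.79, 0.87, 31, 4.34, 0.61, 31)
p = two_tailed_p(t, df)
print(f"t = {t:.3f}, df = {df:.1f}, p = {p:.4f}")
```

Because the table reports rounded means and standard deviations, a p-value recomputed this way can differ slightly from the reported 0.00575.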
CONCLUSION
DISCUSSION