AI Model for Speech Annotation
Beijie Liu, Rodrigo Eguiluz Ortiz Duran
Mentor: Prof. Emily Mower Provost
How does the AI model help?
Example annotation: "So you can (revision) you can change them if you want"
Workflow of the project
Collaborate with clinicians
Identify key features
Implement the feature
Get feedback
Some audio features
Audio features are measurable properties of sound that capture various aspects of an audio signal.
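To make "measurable properties of sound" concrete, here is a minimal sketch of two common low-level audio features: root-mean-square (RMS) energy, a loudness proxy, and zero-crossing rate, a rough noisiness and pitch proxy. These two features are an illustrative choice and are not taken from the project or from the cited tables. The function names and the synthetic sine-wave input are assumptions made for the example.

```python
import math


def rms_energy(samples):
    """Root-mean-square amplitude of a frame of samples (floats in [-1, 1])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def sine_wave(freq_hz, sample_rate, n_samples, amplitude=1.0):
    """Hypothetical helper: synthesize a pure tone to exercise the features."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    low = sine_wave(100, 8000, 8000)    # 1 second of a 100 Hz tone
    high = sine_wave(1000, 8000, 8000)  # 1 second of a 1000 Hz tone
    print("RMS (100 Hz):", rms_energy(low))
    print("ZCR (100 Hz):", zero_crossing_rate(low))
    print("ZCR (1000 Hz):", zero_crossing_rate(high))
```

In practice such features are computed per short frame (for example 20 to 30 ms) rather than over a whole recording, so that they can be tracked over time.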
Table 1: Tetzloff, K. A., Utianski, R. L., Duffy, J. R., Clark, H. M., Strand, E. A., Josephs, K. A., & Whitwell, J. L. (2018). Quantitative analysis of agrammatism in agrammatic primary progressive aphasia and dominant apraxia of speech. Journal of Speech, Language, and Hearing Research, 61(9), 2337-2346.
Table 2: Catricalà, E., Boschi, V., Cuoco, S., Galiano, F., Picillo, M., Gobbi, E., ... & Cappa, S. F. (2019). The language profile of progressive supranuclear palsy. Cortex, 115, 294-308.
Web Application
Three AI Models
Audio Analysis
Speech to Text
Text Analysis
Technical Side
Speech to Text
Text Analysis
Example: "So you can you can change them if you want"
RV: "you can"
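The example above flags "you can" as a revision (RV) because the phrase is immediately repeated. A minimal sketch of one way to catch that specific pattern follows. It is a simple heuristic written for illustration, not the project's actual text-analysis model: it only finds exact, adjacent repetitions of up to `max_len` words, splits on whitespace, and does no punctuation handling. The function name and the `max_len` parameter are assumptions.

```python
def find_immediate_repetitions(sentence, max_len=4):
    """Return (word_index, phrase) for each word span that is immediately repeated.

    Heuristic only: exact, adjacent repeats of 1..max_len words.
    """
    words = sentence.lower().split()
    hits = []
    i = 0
    while i < len(words):
        matched = False
        # Try the longest candidate span first so "you can you can"
        # is reported as one 2-word repeat rather than two partial ones.
        longest = min(max_len, (len(words) - i) // 2)
        for n in range(longest, 0, -1):
            if words[i:i + n] == words[i + n:i + 2 * n]:
                hits.append((i, " ".join(words[i:i + n])))
                i += n  # skip the first copy and continue from the second
                matched = True
                break
        if not matched:
            i += 1
    return hits


if __name__ == "__main__":
    text = "So you can you can change them if you want"
    for index, phrase in find_immediate_repetitions(text):
        print(f"RV at word {index}: {phrase}")
```

Revisions in which the speaker changes the wording rather than repeating it would need a richer approach, such as comparing part-of-speech patterns or using a trained tagger.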
Some Sentence-Based Features:
Visualization of one feature: Revision (RV)
Future Work
Reference
Acknowledgement
Thanks to Dr. Emily Mower Provost for her invaluable guidance and support.
Thanks to everyone in the CHAI Lab for their collaboration and encouragement throughout this project.
Technical Side
Whisper
NLTK
spaCy
webrtcvad: segments the audio into sentences
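As a rough sketch of how voice activity detection can drive segmentation: webrtcvad classifies short fixed-length frames of 16-bit mono PCM as speech or non-speech, and consecutive speech frames can then be grouped into segments that end after a long enough pause. The grouping rule, the `min_silence_frames` threshold, and all function names below are assumptions made for illustration and do not describe the project's actual segmentation logic. Only `webrtcvad.Vad(...)` and `Vad.is_speech(frame, sample_rate)` come from the library itself.

```python
def frames_from_pcm(pcm, sample_rate, frame_ms=30):
    """Split 16-bit mono PCM bytes into fixed-length frames, dropping a partial tail."""
    frame_bytes = int(sample_rate * frame_ms / 1000) * 2  # 2 bytes per sample
    last_start = len(pcm) - frame_bytes
    return [pcm[i:i + frame_bytes] for i in range(0, last_start + 1, frame_bytes)]


def speech_flags(pcm, sample_rate=16000, frame_ms=30, aggressiveness=2):
    """Label each frame as speech (True) or not, using webrtcvad.

    webrtcvad accepts 10, 20, or 30 ms frames at 8, 16, 32, or 48 kHz.
    The import is local so the grouping code below works without the package.
    """
    import webrtcvad

    vad = webrtcvad.Vad(aggressiveness)  # 0 = least aggressive, 3 = most
    return [vad.is_speech(f, sample_rate)
            for f in frames_from_pcm(pcm, sample_rate, frame_ms)]


def segments_from_flags(flags, frame_ms=30, min_silence_frames=10):
    """Group per-frame flags into (start_s, end_s) segments.

    A segment closes once min_silence_frames consecutive non-speech
    frames are seen; its end time is the end of the last speech frame.
    """
    segments = []
    start = None
    last_speech = None
    silence = 0
    for i, is_speech in enumerate(flags):
        if is_speech:
            if start is None:
                start = i
            last_speech = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence_frames:
                segments.append((start * frame_ms / 1000,
                                 (last_speech + 1) * frame_ms / 1000))
                start = None
                silence = 0
    if start is not None:  # audio ended while still inside a segment
        segments.append((start * frame_ms / 1000,
                         (last_speech + 1) * frame_ms / 1000))
    return segments
```

The resulting time ranges could then be used to cut the audio into clips for the speech-to-text step. Pauses do not always line up with sentence boundaries, so the silence threshold would likely need tuning on real recordings.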
Bird, S., Loper, E., & Klein, E. (2009). Natural Language Processing with Python. O'Reilly Media Inc. https://www.nltk.org/
Van Rossum, A. (2023). Natural Language Processing With spaCy in Python. Real Python. Retrieved from https://realpython.com/natural-language-processing-spacy-python/