1 of 40

Trustworthy Generative AI for Mental Health

Manas Gaur (manas@umbc.edu)

2 of 40

Knowledge-infused Learning

Manas Gaur

Amit P. Sheth

3 of 40

Focus

  • Limitations in Generative AI:
  • Knowledge Gaps in AI
    • Reasoning
    • Rationality
    • Procedural Knowledge
    • Attribution
  • Neurosymbolic AI in filling these gaps and achieving trustworthiness
  • Case Study of Mental Health

4 of 40

Recent Case of Character.ai

https://apnews.com/article/chatbot-ai-lawsuit-suicide-teen-artificial-intelligence-9d48adc572100822fdbc3c90d1456bd0

5 of 40

How does Generative AI work?

Yang, K., Zhang, T., Kuang, Z., Xie, Q., Huang, J., & Ananiadou, S. MentaLLaMA: interpretable mental health analysis on social media with large language models. WWW 2024

6 of 40

Where does the problem lie?

Special Report: Are You Ready for Generative AI in Psychiatric Practice?

7 of 40

AI language systems are showing potential to handle fundamental elements of psychological treatment, encompassing everything from diagnostic assessments to therapeutic education and intervention strategies.

Stade et al. Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation, NPJ Mental Health Research, 2024

8 of 40

Still…

  • Yet to date, clinical LLM products and prototypes have not demonstrated anywhere near the level of sophistication required to replace psychotherapy.
  • For example, while an LLM can generate an alternative belief in the style of CBT, it remains to be seen whether it can engage in the type of turn-based, Socratic questioning that would be expected to produce cognitive change.
  • This more generally highlights the gap that likely exists between simulating therapy skills and implementing them effectively to alleviate patient suffering.

9 of 40

Ethical Concerns in LLMs

10 of 40

Can we use the model itself to remove harmful outputs?

  • Constitutional AI: https://arxiv.org/pdf/2212.08073.pdf
  • Demo: https://delphi.allenai.org/

11 of 40

Can we use the model itself to remove harmful outputs?

23 RULES determined by the researchers to define SAFETY

Making threatening statements, making gender-specific comments, offering financial advice

16 RULES on helpfulness, honesty, and harmlessness

12 of 40

Can we use the model itself to remove harmful outputs?

Which of these assistant responses is less harmful?

Choose the assistant response that demonstrates more ethical and moral awareness without sounding excessively condescending, reactive, annoying or condemnatory.

Please consider these assistant responses and choose the one that is less harmful, more age-appropriate, and more family-friendly.

Please attribute your responses for provenance.

Feedback based on an assessment of 16 Rules
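The critique-and-revise loop behind Constitutional AI can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `ask_model` is a hypothetical stand-in for any LLM call, and the two rules are abbreviated examples in the spirit of the prompts above.

```python
RULES = [
    "Choose the response that is less harmful.",
    "Choose the response that is more family-friendly.",
]

def constitutional_revision(ask_model, draft, rules, max_rounds=2):
    """Repeatedly critique a draft against each rule and ask for a revision.

    `ask_model` is any callable str -> str (an LLM in practice; a stub here).
    Stops early once a full pass over the rules produces no further edits.
    """
    response = draft
    for _ in range(max_rounds):
        revised = response
        for rule in rules:
            # Ask the model to critique its own output against one rule...
            critique = ask_model(
                f"Critique this response against the rule: {rule}\n"
                f"Response: {revised}"
            )
            # ...then to rewrite the output in light of that critique.
            revised = ask_model(
                f"Rewrite the response to address the critique.\n"
                f"Critique: {critique}\nResponse: {revised}"
            )
        if revised == response:  # converged: no rule triggered a change
            break
        response = revised
    return response
```

The key design point is that the same model supplies both the critique and the revision; the human-written constitution (the rules) is the only external supervision.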

13 of 40

Model Rules are different than Human Rules

Semantic Consistency: the ability to make consistent decisions in semantically equivalent contexts, i.e., semantically equivalent questions should yield semantically equivalent answers.

Claim: LLMs are not semantically consistent and can give contradictory answers to paraphrased questions.
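The claim can be tested mechanically: pose groups of paraphrased questions and measure how often all answers in a group agree. A minimal sketch, assuming `ask_model` is any question-answering callable and `equivalent` is a caller-supplied answer comparator (exact match, entailment, etc.):

```python
def consistency_rate(ask_model, paraphrase_groups, equivalent):
    """Fraction of paraphrase groups whose answers all agree.

    `ask_model`: str -> str (an LLM in practice; a stub here).
    `paraphrase_groups`: list of lists of semantically equivalent questions.
    `equivalent`: (str, str) -> bool deciding whether two answers agree.
    """
    consistent = 0
    for group in paraphrase_groups:
        answers = [ask_model(q) for q in group]
        # A group counts as consistent only if every paraphrase's answer
        # agrees with the first one.
        if all(equivalent(answers[0], a) for a in answers[1:]):
            consistent += 1
    return consistent / len(paraphrase_groups)
```

A semantically consistent model would score close to 1.0; the slide's claim is that real LLMs fall well short of that.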


15 of 40

Knowledge Gaps in LLMs

Bajaj, Goonmeet, Bortik Bandyopadhyay, Daniel Schmidt, Pranav Maneriker, Christopher Myers, and Srinivasan Parthasarathy. "Understanding knowledge gaps in visual question answering: Implications for gap identification and testing." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 386-387. 2020.

16 of 40

Reasoning Gap

17 of 40

Reasoning Gap

Data transformations affect the learnability of large language models. Question answering, long considered the go-to task for training models like ChatGPT, turns out to be the weakest training format; analogy and odd-one-out are the most effective.
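The three data transformations above can be made concrete by rendering the same underlying facts into different training formats. This is an illustrative sketch (the fact triples and exact templates are assumptions, not the slide's dataset):

```python
def to_qa(fact):
    """QA format: the weakest transformation per the slide."""
    s, r, o = fact  # (subject, relation, object) triple
    return f"Q: What is the {r} of {s}? A: {o}"

def to_analogy(fact1, fact2):
    """Analogy format: two facts sharing a relation become one example."""
    (s1, _, o1), (s2, _, o2) = fact1, fact2
    return f"{s1} is to {o1} as {s2} is to {o2}"

def to_odd_one_out(facts, odd_item):
    """Odd-one-out format: in-relation objects plus one distractor."""
    items = [o for _, _, o in facts] + [odd_item]
    return "Odd one out: " + ", ".join(items)
```

The point of the comparison is that the same knowledge, reshaped into analogy or odd-one-out examples, forces the model to learn the relation rather than memorize a single question-answer pair.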

18 of 40

Rationality Gap

19 of 40

Procedural Gap

20 of 40

Procedural Gap

21 of 40

Attribution Gap

22 of 40

Different LLMs give Different Responses to Same Query

Given a Document, annotate with appropriate references

23 of 40

Different LLMs give Different Responses to Same Query

Given a Document, annotate with appropriate references

24 of 40

Different LLMs give Different Responses to Same Query

Given a Document, annotate with appropriate references

25 of 40

Different LLMs give Different Responses to Same Query

An ensemble of LLMs would yield better outcomes
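Since different LLMs return different references for the same query, a simple ensemble can keep only the answer most models agree on. A minimal majority-vote sketch, where each element of `models` is a hypothetical callable standing in for one LLM:

```python
from collections import Counter

def ensemble_annotate(models, document):
    """Query several LLMs for a reference and keep the majority answer.

    `models`: list of callables str -> str (each a hypothetical LLM).
    Ties are broken in favor of the model queried first, since
    Counter.most_common preserves insertion order among equal counts.
    """
    votes = [model(document) for model in models]
    return Counter(votes).most_common(1)[0][0]
```

In practice the votes would be normalized first (e.g., matching references by DOI rather than raw strings), but the principle is the same: disagreement across models becomes a signal rather than a silent failure.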

26 of 40

Explanatory Gap

LLM explanations cannot focus on what the user wants

Opaque LLMs are Unexplainable

27 of 40

Opaque LLMs are Unexplainable

We desire User-level Explainability:

(a) a step-by-step process

(b) focus on important and domain-specific concepts
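Both requirements can be illustrated together: instead of surfacing attention weights, build a stepwise explanation anchored to domain-specific concepts found in the input. The lexicon and the three-step template below are illustrative assumptions, not an actual clinical vocabulary:

```python
# Illustrative mini-lexicon of mental-health concepts (an assumption,
# not a real clinical ontology).
DOMAIN_CONCEPTS = {"insomnia", "hopelessness", "isolation"}

def user_level_explanation(text, prediction, concepts=DOMAIN_CONCEPTS):
    """Build a step-by-step explanation grounded in domain concepts.

    Returns a list of human-readable steps rather than raw model internals,
    matching requirements (a) and (b) above.
    """
    found = sorted(w for w in concepts if w in text.lower())
    return [
        f"Step 1: identified domain concepts: {', '.join(found) or 'none'}",
        f"Step 2: concepts mapped to prediction '{prediction}'",
        "Step 3: cited concepts shown to the user as evidence",
    ]
```

A real system would draw `concepts` from a clinical knowledge base (e.g., a diagnostic lexicon) so the explanation speaks the user's and clinician's language, not the model's.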

28 of 40

Explainability for people, not just algorithm designers and developers

29 of 40

Bottlenecks for Trustworthy AI

Challenges:

  • Independent LLMs lack reliability
  • Lack of consistency
  • LLMs lack user-level explainability
  • LLMs are unsafe

Empower LLMs through proactive inquiry; address “CRES” for trustworthiness.

30 of 40

NeuroSymbolic AI

31 of 40

NeuroSymbolic AI

32 of 40

[Figure: Designs 1 and 2, each a pipeline flowing Input → Attention Matrix → Explanation → Prediction; definitions are supplied through in-context learning.]

33 of 40

[Figure: Designs 3 and 4, the same Input → Attention Matrix → Explanation → Prediction pipeline. Design 3 uses chain-of-thought prompting with definitions; Design 4 uses questionnaire-driven, workflow-based in-context learning.]

34 of 40

NeuroSymbolic Architecture

This Knowledge-Infused Learning based Neurosymbolic architecture consists of three components: B1 (Semantic Gap Management) gathers and filters data, B2 (Metadata Scoring) generates classification labels via semantic mapping, and B3 (Adaptive Classifier Training) uses metadata-enhanced data for accurate labeling. Drawing on 12 billion tweets, 2.5 million Reddit posts, 700,000 news articles, and knowledge bases such as the Drug Abuse Ontology (DAO) and SNOMED CT, this setup supports real-time mental health sentiment analysis.
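The B1 → B2 → B3 flow can be sketched as a tiny pipeline. This is a deliberately simplified illustration (substring matching against a hypothetical lexicon stands in for the semantic mapping against DAO/SNOMED CT):

```python
def b1_semantic_gap_filter(posts, lexicon):
    """B1 (Semantic Gap Management): keep posts mentioning an in-domain term."""
    return [p for p in posts if any(term in p.lower() for term in lexicon)]

def b2_metadata_score(post, lexicon):
    """B2 (Metadata Scoring): label via semantic mapping.

    Here the 'mapping' is just the first matching lexicon term; the real
    system maps to knowledge-base concepts.
    """
    hits = [term for term in lexicon if term in post.lower()]
    return {"post": post, "label": hits[0] if hits else None}

def b3_train_records(posts, lexicon):
    """B3 (Adaptive Classifier Training): metadata-enhanced training records."""
    filtered = b1_semantic_gap_filter(posts, lexicon)
    return [b2_metadata_score(p, lexicon) for p in filtered]
```

The essential design choice is that labels come from knowledge-base grounding (B2) rather than manual annotation, which is what lets the pipeline scale to billions of posts.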

35 of 40

Interpretability with Semi-Deep Infusion

Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk because I can’t cope with the obsessive, intrusive thoughts, and need to get out of my head.

Don’t want to live anymore. Sexually assault, ignorant family members and my never ending loneliness brights up my path to death.

I do have a potential to live a decent life but not with people who abandon me. Hopelessness and feelings of betrayal have turned my nights to days. I am developing insomnia because of my restlessness. I just can’t take it anymore. Been abandoned yet again by someone I cared about. I've been diagnosed with borderline for a while, and I’m just going to isolate myself and sleep forever.

[Figure: the post above, re-highlighted at four knowledge-infusion levels: δ = 1.0 (no knowledge), δ = 0.84 (16% knowledge), δ = 0.71 (29% knowledge), δ = 0.66 (34% knowledge).]

Expert Evaluation Agreement: 84%
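One way to read the δ values on this slide is as an interpolation weight between the language model's own evidence and the infused knowledge (δ = 0.84 meaning 16% knowledge). The linear mix below is an illustrative assumption, not the exact formulation used in the semi-deep infusion work:

```python
def infused_score(model_score, knowledge_score, delta):
    """Blend a model's score with a knowledge-base relevance score.

    delta = 1.0 reproduces the pure model (no knowledge); lowering delta
    mixes in proportionally more knowledge, matching the slide's reading
    of delta = 0.84 as 16% knowledge. Illustrative sketch only.
    """
    return delta * model_score + (1 - delta) * knowledge_score
```

Under this reading, the highlighted spans change across the four panels because lowering δ shifts weight toward concepts the knowledge base marks as clinically relevant.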

36 of 40

Results

The tables compare model performance for mental health classification across Precision, Recall, and F1-Score. The left table shows traditional models’ results with and without the Neurosymbolic approach, while the right table contrasts the Neurosymbolic model with state-of-the-art LLMs like Llama, Phi, and Mistral.

The Neurosymbolic model consistently outperforms both traditional models and state-of-the-art LLMs, achieving higher performance metrics and adaptability in mental health sentiment classification.

37 of 40

NeuroSymbolic LLMs: First Attempt

  • Grounding with knowledge graph and document
  • Instructability with evaluator
  • Explainability with proactive inquiry

[Figure: a Generator (T5-Large) produces candidate responses; an Evaluator (T5-base) returns a reward; grounding comes from a retrieved subgraph and document retrieval; proactive inquiry is initiated by the LLM.]
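The generator-evaluator loop on this slide can be sketched as sample, score, select. The candidate count, reward threshold, and fallback question are illustrative assumptions; `generate` and `evaluate` are hypothetical stand-ins for the T5-Large generator and T5-base evaluator:

```python
def evaluate_and_select(generate, evaluate, prompt, n_candidates=4):
    """Generator/evaluator sketch with a reward signal.

    `generate`: (prompt, sample_index) -> candidate response.
    `evaluate`: candidate -> reward in [0, 1].
    If even the best candidate scores poorly, fall back to proactive
    inquiry: ask the user a clarifying question instead of answering.
    """
    candidates = [generate(prompt, i) for i in range(n_candidates)]
    rewards = [evaluate(c) for c in candidates]
    best = max(range(n_candidates), key=rewards.__getitem__)
    if rewards[best] < 0.5:  # assumed threshold for "not grounded enough"
        return "Could you tell me more about your situation?"
    return candidates[best]
```

The evaluator's reward is what makes the generator instructable: low reward does not just reject an answer, it triggers the LLM-initiated inquiry shown in the figure.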

38 of 40

Open Questions: Hallucinations

An overview of psychological phenomena and cognitive biases in humans and their parallel in LLMs

Berberette, E., Hutchins, J., & Sadovnik, A. (2024). Redefining "Hallucination" in LLMs: Towards a psychology-informed framework for mitigating misinformation. arXiv preprint arXiv:2402.01769.

39 of 40

Open Questions

  • How do we enforce instruction following in Generative AI?
  • Do we need large language models for high-risk domains like mental health?
  • What are the best data-design strategies to enforce better learning in AI?
  • Should we worry about bias and ethics in Generative AI?
  • How can we make AI safe?

40 of 40

Thank You for Your Attention