1 of 19

Bhagesh Gaur, Karan Gupta, Aseem Srivastava, Manish Gupta, Md Shad Akhtar

Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health Communities

Indraprastha Institute of Information Technology Delhi (IIIT Delhi), India

Microsoft, India

2 of 19

“Over 40% of help-seeking posts on Reddit mental health forums get no response.” (Sharma et al., 2020; Kim et al., 2023)

Even in supportive spaces, silence can deepen isolation.

Why do so many cries for help online go unanswered?

We aim to understand and bridge this communication gap.

3 of 19

  • Online forums give safe, peer-based spaces for mental health support.
  • Yet, many posts lack clarity about what happened, how it felt, and what support is needed.
  • In therapy, expressing these elements is essential to being understood.
  • We model these as Support Attributes (Event, Effect, Requirement), which signal how clearly a post asks for help.

Support-seeking posts often miss key ingredients of help

Clear expression is the bridge between asking for help and receiving it.

4 of 19

Posts without clear ‘support attributes’ fail to elicit engagement

  • Online help-seeking posts often omit key support cues — what happened, how it felt, and what’s needed.

  • This lack of “support attributes” leads to lower empathy and response rates.

  • Prior NLP work focuses on empathy detection or response generation, but not on assessing and improving post clarity.

5 of 19

We shift focus from ‘how to respond’ → to ‘how to help users express better.’

6 of 19

Including Event, Effect, and Requirement in a post increases the number of comments

7 of 19

Can a language model identify missing support attributes in a post and prompt the user to express them?

To study this question and address these gaps, we make two main contributions:

  1. A novel dataset, REDDME, along with a taxonomy, CueTaxo, for studying how posting behavior affects engagement in support seeking.
  2. MH-Copilot, an assistive framework that prompts users about support attributes missing from their posts, so they can seek support more effectively in peer communities.

8 of 19

We propose REDDME, a manually annotated corpus of Reddit posts.

The following attributes are annotated with spans (rationales), their intensity levels, and guided questions as per the taxonomy:

  • Event
  • Effect
  • Requirement
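
One way to picture a single annotated record is sketched below. The class names, field names, and character-offset span format are illustrative assumptions, not the released REDDME schema; the intensity labels (absent / moderate / present) follow the Assess step described later in the deck.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AttributeAnnotation:
    """Hypothetical container for one support attribute of one post."""
    attribute: str                      # "event", "effect", or "requirement"
    spans: List[Tuple[int, int]]        # assumed format: character offsets of rationale spans
    intensity: str                      # "absent", "moderate", or "present"
    guided_question: Optional[str] = None


@dataclass
class AnnotatedPost:
    text: str
    annotations: List[AttributeAnnotation] = field(default_factory=list)

    def needs_prompting(self) -> List[str]:
        """Attributes that are missing or only weakly expressed."""
        return [a.attribute for a in self.annotations if a.intensity != "present"]


def find_span(text: str, phrase: str) -> Tuple[int, int]:
    """Return (start, end) character offsets of phrase inside text."""
    start = text.index(phrase)
    return (start, start + len(phrase))


# Invented example post, for illustration only.
text = "I failed my exam yesterday and I can't stop crying."
post = AnnotatedPost(
    text=text,
    annotations=[
        AttributeAnnotation("event", [find_span(text, "I failed my exam yesterday")], "present"),
        AttributeAnnotation("effect", [find_span(text, "I can't stop crying")], "present"),
        AttributeAnnotation("requirement", [], "absent",
                            guided_question="What kind of support would help you right now?"),
    ],
)
print(post.needs_prompting())
```

Here only the Requirement attribute is flagged, which is the attribute a guided question would target.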

Stats:

Total posts: 4760

Average Post Length: 179.62

Total Guided Questions: 7909

Dataset: REDDME

9 of 19

Taxonomy: CueTaxo

10 of 19

MH-COPILOT empowers support-seekers to tell their stories better.

Can we help users express what they need before they give up asking?

11 of 19

  • Assess the post: extract Event, Effect, Requirement spans (CSpan), then rate each attribute’s intensity (absent / moderate / present).

  • Prompt the user: a generator produces guided questions targeted to missing/weak attributes, using a hierarchical taxonomy (CUETAXO).

  • Learn with RL: a verifier scores each question along multiple dimensions; scores feed a preference-based objective (DPO) to improve the policy.
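
For reference, the standard DPO objective for one preference pair can be sketched as below. This is the generic textbook form, not the authors' training code; the function name, argument names, and the value of `beta` are assumptions chosen for illustration.

```python
import math


def dpo_pair_loss(logp_chosen: float, logp_rejected: float,
                  ref_logp_chosen: float, ref_logp_rejected: float,
                  beta: float = 0.1) -> float:
    """Standard DPO loss for a single (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the policy and under the
    frozen reference model. beta=0.1 is an assumed default.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the loss is log(2).
neutral = dpo_pair_loss(-1.0, -1.0, -1.0, -1.0)
# When the policy favors the chosen question more than the reference does, the loss drops.
improved = dpo_pair_loss(-1.0, -3.0, -2.0, -2.0)
print(neutral, improved)
```

In this framing, the verifier decides which generated question counts as chosen and which as rejected; the loss then pushes the generator toward the chosen one.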

MH-COPILOT: Assess → Prompt → Learn (RL)

12 of 19

[Figure: MH-COPILOT architecture. Components shown: Post; Contextual Attribute Span Classifier (CSpan), producing the post with attribute spans; Support Attribute Intensity Detection (Intensity Classifier), producing attribute-level intensity; Taxonomy (Levels 1 to 5) and the taxonomy-based question prompt; Suggestive Question Generator (language model, LM layers Dn, Dn-1, Dn-2); Verifier Module (Attribute Intensity, Context Evaluator, Empathy Assessor, Structural Assessor); reward computation; reference model's response ranking; DPO. Template questions ("What made you feel <X>?", "Can you elaborate more on <X>?", "What can help you overcome <X>?") are shown alongside instantiated examples ("What made you feel anxious?", "Can you elaborate more on how you feel?", "What can help you overcome your anxiety?").]

  • CSpan: identifies and highlights the support-attribute spans in the post, framed as an NER task.
  • Intensity Classifier: rates how strongly each attribute is present, to identify potential areas for improvement.
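
Framing span extraction as NER typically means tagging each token and then grouping tagged tokens into spans. The sketch below assumes a BIO tagging scheme and the label names EVENT / EFFECT / REQUIREMENT; the deck only states that CSpan is an NER-style task, so both are illustrative assumptions.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    label, buffer = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; close any span that is still open.
            if label is not None:
                spans.append((label, " ".join(buffer)))
            label, buffer = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            buffer.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag: close the open span.
            if label is not None:
                spans.append((label, " ".join(buffer)))
            label, buffer = None, []
    if label is not None:
        spans.append((label, " ".join(buffer)))
    return spans


# Invented example, for illustration only.
tokens = ["I", "lost", "my", "job", "and", "feel", "hopeless"]
tags = ["B-EVENT", "I-EVENT", "I-EVENT", "I-EVENT", "O", "B-EFFECT", "I-EFFECT"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

No REQUIREMENT span is found in this example, which is the kind of gap the Intensity Classifier would then rate as absent.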

MH-COPILOT: Assess → Prompt → Learn (RL)

13 of 19

  • CUETAXO levels encode how complete each attribute is; we include these levels in the LM prompt so generation is attribute-aware.
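
A minimal sketch of attribute-aware prompting follows. The prompt wording, the function name `build_prompt`, and the use of integer levels from 1 to 5 as a completeness scale are assumptions for illustration; the actual prompt template is not shown in this deck.

```python
from typing import Dict

ATTRIBUTES = ("event", "effect", "requirement")


def build_prompt(post: str, levels: Dict[str, int]) -> str:
    """Assemble a hypothetical generator prompt that exposes per-attribute levels."""
    lines = ["Post:", post.strip(), "", "Taxonomy levels per attribute (assumed scale 1 to 5):"]
    for attr in ATTRIBUTES:
        lines.append(f"- {attr}: level {levels[attr]}")
    lines += ["", "Ask a guided question for each attribute that is not yet fully expressed."]
    return "\n".join(lines)


# Invented example, for illustration only.
prompt = build_prompt(
    "I lost my job and feel hopeless.",
    {"event": 4, "effect": 2, "requirement": 1},
)
print(prompt)
```

Placing the levels in the prompt text, rather than in a separate model input, keeps the approach portable across different language models.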

MH-COPILOT: Assess → Prompt → Learn (RL)

14 of 19

MH-COPILOT: Assess → Prompt → Learn (RL)

  • The generator targets only absent/moderate attributes and keeps wording aligned to the taxonomy (e.g., “Can you describe more about the event…?”).
  • Output format is constrained (JSON schema with event_question, effect_question, requirement_question) to keep structure consistent.
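
The constrained output can be checked with a small validator like the sketch below. The three key names come from the slide; the validation rules (unknown keys rejected, questions required only for absent or moderate attributes, other values ignored) are assumptions about how such a check might work.

```python
import json
from typing import Dict

QUESTION_KEYS = {
    "event": "event_question",
    "effect": "effect_question",
    "requirement": "requirement_question",
}


def parse_questions(raw: str, intensities: Dict[str, str]) -> Dict[str, str]:
    """Parse generator output and keep questions for targeted attributes only."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    unknown = set(data) - set(QUESTION_KEYS.values())
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    questions = {}
    for attr, key in QUESTION_KEYS.items():
        if intensities.get(attr) in ("absent", "moderate"):
            value = data.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"missing question for targeted attribute: {attr}")
            questions[attr] = value.strip()
    return questions


# Invented example output, for illustration only.
raw = json.dumps({
    "event_question": None,
    "effect_question": "How has this been affecting you day to day?",
    "requirement_question": "What kind of support would help you most?",
})
intensities = {"event": "present", "effect": "moderate", "requirement": "absent"}
questions = parse_questions(raw, intensities)
print(questions)
```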

15 of 19

MH-COPILOT: Assess → Prompt → Learn (RL)

  • Attribute Intensity: category correctness score
  • Context Evaluator: contextual grounding score
  • Empathy Assessor: empathy score
  • Structural Assessor: structure (taxonomy) score
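
The four verifier scores have to be combined into a single reward before candidate questions can be ranked into preference pairs. The sketch below assumes an equal-weight average over scores in [0, 1] and picks the best and worst candidates as the (chosen, rejected) pair; the deck does not specify the actual aggregation or pairing rule, and the dictionary keys are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

DIMENSIONS = ("category", "context", "empathy", "structure")


def reward(scores: Dict[str, float], weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of verifier scores (equal weights are an assumption)."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    total = sum(weights[d] for d in DIMENSIONS)
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / total


def preference_pair(candidates: List[dict]) -> Tuple[str, str]:
    """Return (chosen, rejected) questions by highest and lowest reward."""
    ranked = sorted(candidates, key=lambda c: reward(c["scores"]), reverse=True)
    return ranked[0]["question"], ranked[-1]["question"]


# Invented candidates and scores, for illustration only.
candidates = [
    {"question": "What made you feel anxious?",
     "scores": {"category": 1.0, "context": 0.8, "empathy": 0.6, "structure": 1.0}},
    {"question": "Why are you like this?",
     "scores": {"category": 0.5, "context": 0.2, "empathy": 0.4, "structure": 0.5}},
]
chosen, rejected = preference_pair(candidates)
print(chosen, "|", rejected)
```

The resulting pair is the kind of input a preference-based objective such as DPO consumes.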

16 of 19

Results

17 of 19

Annotators reported MH-COPILOT’s outputs “occasionally surpass gold standard”

Human Eval

Metric            w/o Verifier    w/ Verifier
Empathy (D1)      3.27            3.43
Relevance (D2)    1.82            2.27
Context (D3)      2.19            3.31
Fluency (L3)      3.82            4.02
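
The per-metric gain from adding the verifier can be read off the human-evaluation scores with a few lines of arithmetic. The numbers are those reported on this slide; the variable and function names are illustrative.

```python
# (without verifier, with verifier) human-evaluation scores from the slide.
human_eval = {
    "Empathy (D1)": (3.27, 3.43),
    "Relevance (D2)": (1.82, 2.27),
    "Context (D3)": (2.19, 3.31),
    "Fluency (L3)": (3.82, 4.02),
}


def gains(table):
    """Absolute improvement per metric, rounded to two decimals."""
    return {metric: round(with_v - without_v, 2)
            for metric, (without_v, with_v) in table.items()}


deltas = gains(human_eval)
largest = max(deltas, key=deltas.get)
print(deltas, largest)
```

Every metric improves with the verifier, and the largest absolute gain is on Context (D3).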

Verifier + Taxonomy → Quality Improvement Beyond Numbers

18 of 19

  • Reinforcement via preference learning (Verifier + DPO) produces qualitatively superior outputs.
  • Combining CUETAXO taxonomy + reward model yields large gains in alignment and clarity.
  • MH-COPILOT generalizes across LLMs (Gemma-2, Mistral, Phi-3, Llama-3).
  • Human evaluators confirmed the framework helps posts become clearer and more actionable for peers.

MH-COPILOT transforms generative RL from text optimization → social interaction enhancement.

Generative RL can teach models to ask better questions

19 of 19

Assess and Prompt: A Generative RL Framework for Improving Engagement in Online Mental Health Communities

Bhagesh Gaur, Karan Gupta, Aseem Srivastava, Manish Gupta, Md Shad Akhtar

Scan the QR code to access:

  • Paper
  • Code

Thank You