Code
References
Generation of entity-centric information-seeking questions from videos. Our work addresses three key challenges: identifying question-worthy information, linking it to entities, and effectively utilizing multimodal signals.
Problem Statement
VideoQuestions Dataset Annotation and Sample Dialogue
Results and Analysis
Conclusions and Future Direction
Proposed System (ECIS-VQG)
Motivation
Contribution
VideoQuestions Dataset
Models | BLEU-1 | CIDEr | METEOR | BERT-Score | ROUGE-L |
Llama3-8B | 19.8 | 2.33 | 45.5 | 68 | 38.1 |
Qwen-VL | 2.7 | 0.26 | 31.8 | 56.8 | 17.2 |
GPT-4o | 7.1 | 0.87 | 41.6 | 64.7 | 25.6 |
Ushio et al. | 6.4 | 0.779 | 28.1 | 59.3 | 21.8 |
Proposed Model | 71.3 | 7.311 | 81.9 | 90.0 | 78.6 |
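As a rough illustration of what the BLEU-1 column measures, the sketch below computes clipped unigram precision with a brevity penalty against a single reference. It is a minimal sketch, not the evaluation script behind the table: the whitespace tokenization, the single-reference setup, and the example strings are all assumptions, and the function returns a value in [0, 1], whereas the table appears to report scores on a 0 to 100 scale.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Single-reference BLEU-1: clipped unigram precision times brevity penalty."""
    cand = candidate.lower().split()  # assumption: simple whitespace tokenization
    ref = reference.lower().split()
    if not cand:
        return 0.0
    cand_counts = Counter(cand)
    ref_counts = Counter(ref)
    # Clip each candidate token count by its count in the reference.
    clipped = sum(min(count, ref_counts[tok]) for tok, count in cand_counts.items())
    precision = clipped / len(cand)
    # The brevity penalty only applies when the candidate is not longer than the reference.
    if len(cand) > len(ref):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(ref) / len(cand))
    return precision * penalty


# Hypothetical strings, for illustration only.
reference = "where is the food cheap in bangkok"
generic = "where is the food cheap"
print(round(bleu1(generic, reference), 3))
```

A question that drops the entity is penalized here only through the brevity penalty, which is one reason to read BLEU-1 alongside the other columns.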
Observations:
Contact
VideoQuestions Dataset Statistics
Category | Number of Videos | Average Video Length | Number of Empty Video Transcripts |
Education | 121 | 6.34 | 9 |
Entertainment | 32 | 6.07 | 2 |
How to & Style | 90 | 5.76 | 2 |
News & Politics | 8 | 5.76 | 2 |
People & Blogs | 75 | 6.18 | 6 |
Science & Technology | 65 | 5.38 | 1 |
Travel & Events | 20 | 6.92 | 0 |
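The per-category rows above can be aggregated into dataset-level totals. The sketch below copies the numbers from the table and computes the total number of videos, the total number of empty transcripts, and a video-count-weighted average length. The unit of the length column is not stated in the table, so the result is reported in the same unspecified unit.

```python
# (category, number of videos, average video length, empty transcripts),
# copied from the statistics table above.
rows = [
    ("Education", 121, 6.34, 9),
    ("Entertainment", 32, 6.07, 2),
    ("How to & Style", 90, 5.76, 2),
    ("News & Politics", 8, 5.76, 2),
    ("People & Blogs", 75, 6.18, 6),
    ("Science & Technology", 65, 5.38, 1),
    ("Travel & Events", 20, 6.92, 0),
]

total_videos = sum(n for _, n, _, _ in rows)
total_empty = sum(empty for _, _, _, empty in rows)
# Weight each category's average length by its number of videos.
weighted_avg_length = sum(n * avg for _, n, avg, _ in rows) / total_videos

print(total_videos, total_empty, round(weighted_avg_length, 2))
```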
There are very few studies on video-based question generation (QG)
Video-understanding-based questions
ECIS-VQG: Generation of Entity-centric Information-seeking Questions from Videos
Architecture of the proposed method, showing its components: input representations, the chapter titles classifier, and the Transformer encoder-decoder model. Inputs are shown in orange, outputs in green, models in blue, and loss functions in pink. Loss computation happens at training time only. The prompt is used for Alpaca only. The cross-attention Transformer layer and the video embedding are not used for Alpaca.
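The caption mentions a cross-attention Transformer layer over video embeddings but does not give its details. The sketch below shows generic single-head scaled dot-product cross-attention, in which text states act as queries over video embeddings and the attended context is added back through a residual connection. It is an illustration under assumptions, not the poster's implementation: there are no learned projections, and the names `text_states` and `video_embeddings` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_states, video_embeddings):
    """Each text vector attends over all video vectors (same dimension assumed)."""
    dim = len(video_embeddings[0])
    scale = math.sqrt(dim)
    fused = []
    for query in text_states:
        # Scaled dot product between the text query and every video embedding.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in video_embeddings
        ]
        weights = softmax(scores)
        # Weighted sum of video embeddings gives the attended context vector.
        context = [
            sum(w * vec[j] for w, vec in zip(weights, video_embeddings))
            for j in range(dim)
        ]
        # Residual connection: keep the text information and add video context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy inputs: two text token states and three video frame embeddings.
text_states = [[1.0, 0.0], [0.0, 1.0]]
video_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused_states = cross_attention(text_states, video_embeddings)
```

The output has one fused vector per text token, so it can stand in wherever the text-only states were used.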
Note: This work is accepted at EMNLP 2024
Two examples of the ECIS QG task. In example 1, although the existing QG model (Romero, 2021) generates a grammatically sound question, the question lacks key context such as the place (Where is the food cheap?) or the subject (Which food item?). In example 2, without the name of the particular chair, the question generated by the existing QG model is too broad.
Human Evaluation
Observations:
Arpan Phukan (1), Manish Gupta (2), Asif Ekbal (1, *)
(1) AI-NLP-ML Lab, Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India
(2) Microsoft
(*) Indian Institute of Technology Jodhpur (on lien from IIT Patna)