On the Calibration of Deep Learning Models to Improve Trustworthy AI
Computer Science
Cornelia Caragea
Human Confidence and Calibration
What movie won the Best Picture at Oscars 2023?
IRg
Information Retrieval Group
UIC Computer Science
Human Confidence and Calibration
Who is Prime Minister in UK?
Machines…
Do they know what they don’t know?
Or in other words… are they calibrated?
Deep Neural Networks
DNNs Confidence and Calibration
Accuracy vs. confidence on CIFAR-100 at different training epochs for a VGG-16 neural network.
Plot credit: Thulasidasan et al. [2019].
Calibration in Pre-trained Language Models
Over-confidence
Figure values: 1.00, 0.00, 0.00
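As a rough illustration of over-confidence, the sketch below applies a softmax to a set of logits in which one class dominates. The logit values are hypothetical and chosen only so that the rounded probabilities match the 1.00 / 0.00 / 0.00 pattern on this slide; they do not come from any model in the talk.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits: one class has a much larger score than the rest.
probs = softmax([12.0, 1.0, 0.5])
confidence = max(probs)  # the model's confidence in its predicted class
rounded = [f"{p:.2f}" for p in probs]
print(rounded)  # ['1.00', '0.00', '0.00']
```

A model that outputs near-1.00 confidence like this on examples it frequently gets wrong is over-confident: its confidence overstates its accuracy.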
Calibration Techniques
MixUp
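For readers unfamiliar with the technique, here is a minimal sketch of standard MixUp, which builds a training example as a convex combination of two inputs and their labels with a mixing weight drawn from a Beta distribution. This is the generic formulation, not the AUM- and saliency-guided variant proposed later in the talk; the function name, the `alpha` default, and the toy vectors are assumptions for illustration. For text models the mixing is typically applied to embeddings or hidden representations rather than raw tokens.

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Mix two (feature vector, one-hot label) pairs.

    alpha is the Beta(alpha, alpha) concentration (hypothetical default).
    Returns the mixed features, the mixed soft label, and the weight lam.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Toy example: two 2-dimensional inputs from different classes.
rng = random.Random(0)
x_mix, y_mix, lam = mixup([1.0, 0.0], [1.0, 0.0],
                          [0.0, 1.0], [0.0, 1.0], rng=rng)
print(lam, x_mix, y_mix)
```

Because the mixed label is soft rather than one-hot, training on such examples discourages the model from assigning probability 1.00 to a single class, which is one reason MixUp is studied as a calibration technique.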
On the Calibration of Pre-trained Language Models using MixUp Guided by Area Under the Margin and Saliency
[Park and Caragea, ACL 2022; NAACL 2022]
[Hosseini and Caragea, ACL Findings 2023; EMNLP 2022]
Proposed MixUp for Model Calibration
MixUp using Saliency Signals
Datasets
In-domain Data Results on BERT
Our proposed MixUp yields the best expected calibration error (ECE) on all in-domain (ID) tasks
(similar results are observed on RoBERTa).
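Since the results are reported in ECE, a small sketch of how that metric is commonly computed may help: predictions are grouped into equal-width confidence bins, and ECE is the sample-weighted average gap between accuracy and mean confidence in each bin. The function name, the 10-bin default, and the toy inputs are assumptions for illustration, and the exact binning used in the reported experiments is not stated on the slide.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE over equal-width confidence bins.

    confidences: predicted-class probabilities in [0, 1]
    correct:     booleans, True where the prediction was right
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece

# Toy example: 90% confidence but only 75% accuracy gives a gap of 0.15.
ece_toy = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False])
print(round(ece_toy, 2))  # 0.15
```

Lower is better: a perfectly calibrated model has an ECE of 0, while a model that is always fully confident but right only half the time has an ECE of 0.5.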
Out-of-domain Data Results on BERT
Our proposed MixUp yields the best expected calibration error (ECE) on all out-of-domain (OOD) tasks
(similar results are observed on RoBERTa).
LLMs Confidence and Calibration
LLMs Confidence and Calibration
[1] Sadat and Caragea, 2022: SciNLI: A Corpus for Natural Language Inference on Scientific Text.
[2] Sadat and Caragea, 2024: MSciNLI: A Diverse Benchmark for Scientific Natural Language Inference.
LLMs Confidence and Calibration
LLMs Confidence and Calibration
Confidence Elicitation
LLM Results
Conclusion
Thank you!
DPI
Seo Yeon Park
Mobashir Sadat
Tiberiu Sosea
Mahshid Hosseini
Anas Jawad