
Reproducibility Summary

The loss function combines a standard classification loss with a localization loss based on the attributions within the bounding box of the target object.
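A minimal sketch of such a combined loss, in plain Python so that it stays dependency-free. The names `energy_in_box` and `guided_loss`, the weighting factor `lam`, the weighted-sum form, and the choice of "one minus the in-box fraction of positive attribution" as the localization term are all assumptions made for illustration; they are not taken from the original implementation.

```python
def energy_in_box(attribution, box):
    """Fraction of positive attribution inside box = (x0, y0, x1, y1), end-exclusive.

    `attribution` is a 2D list of floats indexed as attribution[y][x].
    """
    x0, y0, x1, y1 = box
    total = 0.0
    inside = 0.0
    for y, row in enumerate(attribution):
        for x, value in enumerate(row):
            positive = max(value, 0.0)  # only positive attributions count
            total += positive
            if x0 <= x < x1 and y0 <= y < y1:
                inside += positive
    # Assumed convention: a map without positive attribution scores 0.
    return inside / total if total > 0 else 0.0


def guided_loss(classification_loss, attribution, box, lam=1.0):
    """Classification loss plus a weighted localization term (assumed form)."""
    localization_loss = 1.0 - energy_in_box(attribution, box)
    return classification_loss + lam * localization_loss


# A uniform 4x4 attribution map with a 2x2 box holds a quarter of the energy.
uniform = [[1.0] * 4 for _ in range(4)]
print(guided_loss(0.5, uniform, (0, 0, 2, 2), lam=2.0))
```

With a uniform map only a quarter of the attribution lies inside the box, so the localization term is large; it shrinks as attribution concentrates inside the box.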

Investigated Factors

"Studying How to Efficiently and Effectively Guide Models with Explanations" - A Reproducibility Study

Simple models such as linear regression are inherently interpretable, whereas DNNs are difficult to interpret
DNNs may exhibit unexpected learning behavior (e.g. learning spurious correlations of features rather than the features themselves)
This could result in poor generalization
Instead, use guidance to make the model "right for the right reasons"

References

Goran Oreski (2023). “YOLO*C - Adding context improves YOLO performance.” In: Neurocomputing, vol. 555, article 126655. doi: 10.1016/j.neucom.2023.126655.

Santosh K. Divvala et al. (2009). “An empirical study of context in object detection.” In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 1271-1278.

Sukrut Rao et al. (2023). “Studying How to Efficiently and Effectively Guide Models with Explanations.” In: IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 1922-1933.

Is this a real or a toy car?

What about now?

Adrian Sauter, Milan Miletić, Ryan Ott, Rohith Saai Pemmasani Prabakaran

[Figure: EPG Score and EPG Score / √(Bounding Box Size), plotted against Total Bounding Box Size in Pixels]

[Figure: Percentage of Correct Identifications for Cropped vs. Original Images. Axes: Question, Percentage (%). Series: Cropped Image, Original Image]

[Figure: Decision Changes after Observing Context. Axes: Type of Change (Correct Change, Incorrect Change), Count]

Results: Evaluation on 100 COCO images reveals that EPG and SegEPG produce overly optimistic results, whereas X-SegEPG provides more realistic assessments.

Rao et al. (2023): SegEPG improves EPG by focusing on attributions within the segmentation mask instead of the entire bounding box

Model attends to commonly co-occurring features (in this case: person holding the racket)

Similar trends but different magnitude

Energy loss inherently focuses attention on the object

Best results on EPG and F1 score observed with B-cos model and attribution method trained on Energy loss.

Motivation

Model Guidance

Reproducibility

Rao et al. (2023): Model should focus only on target object

Previous research: Context can enhance object classification and detection tasks (Oreski (2023), Divvala et al. (2009))

Our Hypothesis: In some scenarios, context is necessary

Extensions

Results: In certain scenarios, context plays a crucial role.

Our proposition: X-SegEPG improves SegEPG by considering all attributions, not only the ones that fall within the bounding box
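The difference between the two scores can be illustrated with a small sketch. It assumes that SegEPG is the share of in-box positive attribution that lies on the segmentation mask, and that X-SegEPG is the share of all positive attribution in the image that lies on the mask. These formulas and the helper names `positive_sum`, `seg_epg`, and `x_seg_epg` are assumptions for illustration, not the exact definitions used in the study.

```python
def positive_sum(attribution, keep):
    """Sum positive attribution over the pixels (x, y) for which keep(x, y) is true."""
    return sum(
        max(value, 0.0)
        for y, row in enumerate(attribution)
        for x, value in enumerate(row)
        if keep(x, y)
    )


def seg_epg(attribution, mask, box):
    """Assumed SegEPG: on-mask attribution relative to attribution inside the box."""
    x0, y0, x1, y1 = box  # end-exclusive pixel coordinates

    def in_box(x, y):
        return x0 <= x < x1 and y0 <= y < y1

    on_object = positive_sum(attribution, lambda x, y: in_box(x, y) and mask[y][x])
    in_box_total = positive_sum(attribution, in_box)
    return on_object / in_box_total if in_box_total > 0 else 0.0


def x_seg_epg(attribution, mask):
    """Assumed X-SegEPG: on-mask attribution relative to attribution in the whole image."""
    on_object = positive_sum(attribution, lambda x, y: mask[y][x])
    image_total = positive_sum(attribution, lambda x, y: True)
    return on_object / image_total if image_total > 0 else 0.0


# Toy 4x4 example: half of the attribution sits on the object,
# the other half on a single pixel far outside the bounding box.
attribution = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
]
mask = [
    [True, True, False, False],
    [False, False, False, False],
    [False, False, False, False],
    [False, False, False, False],
]
box = (0, 0, 2, 2)

print(seg_epg(attribution, mask, box))  # 1.0: attribution outside the box is ignored
print(x_seg_epg(attribution, mask))     # 0.5: off-object attribution is penalized
```

In this toy case the in-box score is perfect even though half of the attribution lies off the object, which is the kind of optimism a whole-image denominator is meant to expose.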

Qualitative Comparison

Quantitative Results

X-SegEPG

Survey: Impact of Context

Observation: EPG favors large bounding boxes

Our Proposition: Normalizing by bounding box size mitigates this issue
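A minimal sketch of this normalization, assuming EPG is the fraction of positive attribution inside the bounding box and that "bounding box size" means the box area in pixels, so that the normalized score is EPG / √(area). The function names `epg` and `normalized_epg` are hypothetical.

```python
import math


def epg(attribution, box):
    """Fraction of positive attribution inside box = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    positive = [[max(value, 0.0) for value in row] for row in attribution]
    total = sum(sum(row) for row in positive)
    inside = sum(sum(row[x0:x1]) for row in positive[y0:y1])
    return inside / total if total > 0 else 0.0


def normalized_epg(attribution, box):
    """EPG divided by the square root of the box area in pixels (assumed definition)."""
    x0, y0, x1, y1 = box
    area = (x1 - x0) * (y1 - y0)
    return epg(attribution, box) / math.sqrt(area)


# A uniform map carries no localization information, yet plain EPG grows
# with the box: 0.25 for a 2x2 box versus 1.0 for a box covering the image.
uniform = [[1.0] * 4 for _ in range(4)]
for box in [(0, 0, 2, 2), (0, 0, 4, 4)]:
    print(box, epg(uniform, box), normalized_epg(uniform, box))
```

For the uniform map the raw score rises fourfold between the two boxes while the normalized score only doubles, so the dependence on box size is reduced rather than removed.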

EPG vs Bounding Box Size

Attribution Methods

Model

Localization Loss

Classification Loss