
Reproducibility Summary

The loss function combines a standard classification loss with a localization loss based on the attributions within the bounding box of the target object.
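The combination described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weighting factor `lam`, the function names, and the choice of `1 - EPG` as an energy-style localization term are assumptions made for this example.

```python
import numpy as np

def epg(attribution, box_mask):
    """Fraction of positive attribution that falls inside the bounding box."""
    positive = np.clip(attribution, 0.0, None)  # negative attributions are ignored
    total = positive.sum()
    if total == 0:
        return 0.0
    return float((positive * box_mask).sum() / total)

def guided_loss(classification_loss, attribution, box_mask, lam=1.0):
    """Classification loss plus a weighted localization loss.

    Assumption: the localization term is 1 - EPG, so it shrinks as more
    positive attribution lands inside the target object's bounding box.
    """
    localization_loss = 1.0 - epg(attribution, box_mask)
    return classification_loss + lam * localization_loss

# Toy example: uniform attributions on a 4x4 map, box covering 4 of 16 pixels.
attribution = np.ones((4, 4))
box_mask = np.zeros((4, 4))
box_mask[1:3, 1:3] = 1.0
loss = guided_loss(classification_loss=0.5, attribution=attribution, box_mask=box_mask)
```

In an actual training loop both terms would be differentiable tensors from a deep learning framework; NumPy is used here only to keep the arithmetic visible.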

Investigated Factors

"Studying How to Efficiently and Effectively Guide Models with Explanations" - A Reproducibility Study

  • Simpler algorithms like linear regression models are inherently interpretable due to their simple architecture, while DNNs are difficult to interpret
  • This opacity may hide unexpected learning behavior (e.g. learning spurious correlations rather than the relevant features themselves)
  • This could result in poor generalization
  • Instead, implement guidance to make the model "right for the right reasons"

References

Goran Oreski (2023). “YOLO*C - Adding context improves YOLO performance.” In: Neurocomputing, vol. 555, p. 126655. doi: 10.1016/j.neucom.2023.126655.

Santosh K. Divvala et al. (2009). “An empirical study of context in object detection.” In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 1271-1278.

Sukrut Rao et al. (2023). “Studying How to Efficiently and Effectively Guide Models with Explanations.” In: IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 1922-1933.

Is this a real or a toy car?

What about now?

Adrian Sauter, Milan Miletić, Ryan Ott, Rohith Saai Pemmasani Prabakaran

[Figure: EPG Score and EPG Score / √(Bounding Box Size), each plotted against Total Bounding Box Size in Pixels]

[Figure: Percentage of Correct Identifications for Cropped vs. Original Images. Labels: Question, Percentage (%), Cropped Image, Original Image]

[Figure: Decision Changes after Observing Context. Labels: Type of Change (Correct Change, Incorrect Change), Count]

Results: Evaluation on 100 COCO images reveals that EPG and SegEPG produce overly optimistic results, whereas X-SegEPG provides more realistic assessments.

Rao et al. (2023): SegEPG improves on EPG by focusing on attributions within the segmentation mask instead of the entire bounding box

Model attends to commonly co-occurring features (in this case: person holding the racket)

Similar trends but different magnitude

Energy loss inherently focuses attributions on the object

Best results on EPG and F1 score observed with B-cos model and attribution method trained on Energy loss.

Motivation

Model Guidance

Reproducibility

  • Rao et al. (2023): Model should focus only on target object

  • Previous research: Context can enhance object classification and detection tasks (Oreski (2023), Divvala et al. (2009))

  • Our Hypothesis: In some scenarios, context is necessary

Extensions

Results: In certain scenarios, context plays a crucial role.

Our proposition: X-SegEPG improves SegEPG: considers all attributions, not only the ones that fall in bounding box
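To make the difference between the three scores concrete, here is a small sketch. The exact formulas are assumptions based on the descriptions on this poster: EPG is taken as the share of positive attribution inside the bounding box, SegEPG as the share of in-box positive attribution that lies on the segmentation mask, and X-SegEPG as the share of all positive attribution in the image that lies on the segmentation mask. Function and variable names are hypothetical.

```python
import numpy as np

def _positive(attribution):
    # Only positive attributions count as evidence for the class.
    return np.clip(attribution, 0.0, None)

def _ratio(numerator, denominator):
    return float(numerator / denominator) if denominator > 0 else 0.0

def epg(attribution, box_mask):
    """Assumed EPG: positive attribution in the box / all positive attribution."""
    pos = _positive(attribution)
    return _ratio((pos * box_mask).sum(), pos.sum())

def seg_epg(attribution, seg_mask, box_mask):
    """Assumed SegEPG: on-object attribution / attribution inside the box."""
    pos = _positive(attribution)
    return _ratio((pos * seg_mask * box_mask).sum(), (pos * box_mask).sum())

def x_seg_epg(attribution, seg_mask):
    """Assumed X-SegEPG: on-object attribution / all positive attribution."""
    pos = _positive(attribution)
    return _ratio((pos * seg_mask).sum(), pos.sum())

# Toy 4x4 attribution map.
attribution = np.zeros((4, 4))
attribution[1, 1] = 2.0   # on the object
attribution[1, 2] = 1.0   # inside the box, but off the object
attribution[3, 3] = 1.0   # outside the box (context)
attribution[0, 0] = -5.0  # negative attribution, ignored

box_mask = np.zeros((4, 4))
box_mask[1:3, 1:3] = 1.0
seg_mask = np.zeros((4, 4))
seg_mask[1, 1] = 1.0

scores = (
    epg(attribution, box_mask),
    seg_epg(attribution, seg_mask, box_mask),
    x_seg_epg(attribution, seg_mask),
)
```

In this toy case the attribution placed on context outside the box lowers only the X-SegEPG value, which matches the motivation for counting all attributions rather than only those inside the box.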

Qualitative Comparison

Quantitative Results

X-SegEPG

Survey: Impact of Context

Observation: EPG favors large bounding boxes

Our Proposition: Normalizing by bounding box size mitigates this issue
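The normalization can be sketched as below, assuming the form shown in the accompanying plot label, EPG Score / √(Bounding Box Size), with the box size measured in pixels. The function name and the handling of empty boxes are assumptions for this example.

```python
import numpy as np

def normalized_epg(attribution, box_mask):
    """EPG divided by the square root of the bounding box area in pixels."""
    positive = np.clip(attribution, 0.0, None)
    total = positive.sum()
    box_pixels = box_mask.sum()
    if total == 0 or box_pixels == 0:
        return 0.0  # assumption: undefined cases are scored as zero
    epg = (positive * box_mask).sum() / total
    return float(epg / np.sqrt(box_pixels))

# Uniform attributions on an 8x8 map with a 4x4 box (16 pixels).
attribution = np.ones((8, 8))
box_mask = np.zeros((8, 8))
box_mask[2:6, 2:6] = 1.0
score = normalized_epg(attribution, box_mask)
```

With uniform attributions, plain EPG grows linearly with box area, while the normalized score grows only with its square root, so the advantage of large boxes is reduced rather than removed.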

EPG vs Bounding Box Size

Attribution Methods

Model

Localization Loss

Classification Loss