Understanding How Deaf and Hard of Hearing Viewers Visually Explore Captioned Live TV News
Akhter Al Amin, Saad Hassan, Sooyeon Lee, Matt Huenerfauth
W4A’23: 20th International Web for All Conference, April 2023, Austin, Texas
Image Credit: Proxima Studio
Image Credit: Sylvain Pedneault and Mike Liao
Caption Occlusion
Prior Work on Caption Placement and Occlusion
SIGDOC ’13
UAHCI ’21
W4A ’21
CHI ’22
Eye-tracking Study with 19 DHH Participants
16 D/deaf and 3 hard of hearing
Watch captioned TV news ~3 hrs/week
Do you identify as Deaf or Hard of Hearing? AND
Do you use captioning when viewing videos or television?
Mean age: 27.33 years (SD = 6.46)
Stimuli Preparation and Annotation
Stimuli Video Preparation
Area of Interest Annotation
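A minimal sketch of what an area-of-interest (AOI) annotation could look like in code, assuming each AOI is an axis-aligned rectangle that is visible for a time span of the video. The class name, field names, pixel coordinates, and region names below are illustrative assumptions, not the annotation format used in the study.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AreaOfInterest:
    """Hypothetical AOI: a rectangle (pixels) visible during [start_s, end_s)."""
    name: str
    x: float
    y: float
    width: float
    height: float
    start_s: float
    end_s: float

    def contains(self, gx: float, gy: float, t_s: float) -> bool:
        # A gaze point hits the AOI only while the AOI is on screen.
        on_screen = self.start_s <= t_s < self.end_s
        inside_x = self.x <= gx < self.x + self.width
        inside_y = self.y <= gy < self.y + self.height
        return on_screen and inside_x and inside_y


def label_gaze_point(aois: Sequence[AreaOfInterest],
                     gx: float, gy: float, t_s: float) -> Optional[str]:
    """Return the name of the first AOI containing the gaze point, else None."""
    for aoi in aois:
        if aoi.contains(gx, gy, t_s):
            return aoi.name
    return None


# Illustrative annotations for a 1920x1080 frame (made-up coordinates).
AOIS = [
    AreaOfInterest("speaker_face", 800, 150, 300, 300, 0.0, 30.0),
    AreaOfInterest("scrolling_news", 0, 1000, 1920, 80, 0.0, 30.0),
]
```

Returning the first match keeps the sketch simple; overlapping AOIs would need an explicit ordering or priority rule.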
Tobii Pro Nano remote eye tracker
65 cm
Note Taking
Secondary screen with live gaze tracking
RQ1: Gaze Behavior vs. Subjective Numeric Ratings
Mean Proportional Fixation Time
Subjective Numeric Ratings
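One way to read "mean proportional fixation time" is sketched below, assuming it means the share of a participant's total fixation duration that falls in each region, averaged across participants. The slides do not give the formula, so this definition, the function names, and the region names are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# A fixation is (region name or None if outside every region, duration in ms).
Fixation = Tuple[Optional[str], float]


def proportional_fixation_time(fixations: Iterable[Fixation]) -> Dict[str, float]:
    """Share of one participant's total fixation time spent in each region."""
    per_region: Dict[str, float] = defaultdict(float)
    total = 0.0
    for region, duration_ms in fixations:
        total += duration_ms
        if region is not None:
            per_region[region] += duration_ms
    if total == 0:
        return {}
    return {region: t / total for region, t in per_region.items()}


def mean_across_participants(per_participant: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each region's share; a region never fixated counts as 0."""
    regions = set().union(*(p.keys() for p in per_participant))
    n = len(per_participant)
    return {r: sum(p.get(r, 0.0) for p in per_participant) / n for r in regions}
```

Treating unfixated regions as 0 rather than missing is a design choice: it keeps every participant in the denominator for every region.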
RQ2: Gaze Behavior Over Time For Different Regions
Group 1
Group 2
Group 3
Group 4
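Attention curves like the ones these groups describe could be derived by splitting the video into time bins and computing each region's share of fixation time per bin. The sketch below assumes that approach; the bin size, the input format, and the function name are hypothetical and are not taken from the study.

```python
import math
from typing import Dict, Iterable, List, Optional, Tuple

# A timed fixation is (region name or None, start time in s, duration in s).
TimedFixation = Tuple[Optional[str], float, float]


def fixation_share_over_time(fixations: Iterable[TimedFixation],
                             bin_s: float,
                             video_length_s: float) -> Dict[str, List[float]]:
    """Per region, the share of fixation time in each consecutive time bin."""
    n_bins = math.ceil(video_length_s / bin_s)
    bin_totals = [0.0] * n_bins
    per_region: Dict[str, List[float]] = {}
    for region, start, duration in fixations:
        end = min(start + duration, video_length_s)
        b = int(start // bin_s)
        # Split a fixation that spans a bin boundary across the bins it touches.
        while b < n_bins and b * bin_s < end:
            overlap = min(end, (b + 1) * bin_s) - max(start, b * bin_s)
            if overlap > 0:
                bin_totals[b] += overlap
                if region is not None:
                    per_region.setdefault(region, [0.0] * n_bins)[b] += overlap
            b += 1
    return {
        region: [v / t if t > 0 else 0.0 for v, t in zip(values, bin_totals)]
        for region, values in per_region.items()
    }
```

Under this reading, a region whose share is high in the first bin and then tapers off would match the "peak followed by slowly decreasing sustained attention" pattern, while a flat, high curve would match "sustained attention".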
Gaze Behavior Over Time For Different Information Regions
Group 1: Peak Followed by Slowly Decreasing Sustained Attention
Factors Explaining Variation in their Attention Over Time
✓ High Attention Priority
✓ Initial Visual Scan
✓ Provided Context
“The information on the bottom, the discussion topic, and the running headlines should be visible at any time. I want to be able to read those things and have those things not be blocked. It is fine if some of the information is blocked for a few seconds.” - P12
What do we recommend?
During the first few seconds of a news video story, it is especially important not to block over-the-shoulder text, the discussion topic, and scrolling news. Later, it is still better to avoid blocking these high-priority information regions, but not at the expense of blocking any dynamic information regions.
Gaze Behavior Over Time For Different Information Regions
Group 2: Sustained Attention
Factors Explaining Variation in their Attention Over Time
✓ Human Faces
✓ Dynamic Information
✓ Identification of Speaker
✓ Provide Context
“The person’s mouth, facial expression, and sometimes body language [are important]. You can really get a lot of information from body language and facial expressions about the context of the video.” - P15
What do we recommend?
Speaker’s face, Listener’s face, and Over-the-shoulder text should not be blocked during a news video because they receive continuous attention. We did not find additional priority for these regions during the first few seconds of the news story.
Gaze Behavior Over Time For Different Information Regions
Group 3: Low Attention with Some Peaks
Factors Explaining Variation in their Attention Over Time
✓ Understanding Source
✓ Static Text
What do we recommend?
It could be OK to block Speaker’s Information and Program Title, as long as there are short gaps between caption blocks during which a viewer can briefly see them.
Gaze Behavior Over Time For Different Information Regions
Group 4: Very Low Attention
Factors Explaining Variation in their Attention Over Time
✓ Unrelated to News Story
✓ Brief Attention Required
“for the most part, there is some information that is more important than others. Like the weather… temperature isn’t as important as long as the other discussion topics and news are still able to be seen.” - P13
What do we recommend?
Not blocking Logo, Time, and Temperature is always best, but if necessary, it should not be problematic to block these regions. Brief periods between caption blocks when these regions are visible may be enough for DHH viewers to read them.
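The four sets of recommendations could be encoded as a simple lookup that a caption-placement tool consults before covering a region. The sketch below is one hypothetical encoding: the severity scale, the 5-second opening window (the talk only says "the first few seconds"), the region identifiers, and the ranking rule are all assumptions made for illustration.

```python
from enum import IntEnum
from typing import Dict, List, Set


class Severity(IntEnum):
    """How strongly blocking a region is discouraged (higher is worse)."""
    OK_IF_NECESSARY = 0    # Group 4: logo, time, temperature
    OK_WITH_GAPS = 1       # Group 3: needs short gaps between caption blocks
    AVOID_IF_POSSIBLE = 2  # Group 1 after the opening seconds
    AVOID = 3              # Group 2 always; Group 1 during the opening seconds


OPENING_WINDOW_S = 5.0  # assumed value for "the first few seconds"

GROUP_1 = {"over_the_shoulder_text", "discussion_topic", "scrolling_news"}
GROUP_2 = {"speaker_face", "listener_face", "over_the_shoulder_text"}
GROUP_3 = {"speaker_information", "program_title"}
GROUP_4 = {"logo", "time", "temperature"}


def blocking_guidance(region: str, elapsed_s: float) -> Severity:
    """Map a region and the time since the story began to a severity level."""
    if region in GROUP_2:
        return Severity.AVOID
    if region in GROUP_1:
        if elapsed_s < OPENING_WINDOW_S:
            return Severity.AVOID
        return Severity.AVOID_IF_POSSIBLE
    if region in GROUP_3:
        return Severity.OK_WITH_GAPS
    if region in GROUP_4:
        return Severity.OK_IF_NECESSARY
    raise KeyError(f"unknown region: {region}")


def choose_position(candidates: Dict[str, Set[str]], elapsed_s: float) -> str:
    """Pick the caption position whose blocked regions are least severe."""
    def cost(position: str) -> List[Severity]:
        # Compare worst blocked region first, then the next worst, and so on.
        return sorted((blocking_guidance(r, elapsed_s) for r in candidates[position]),
                      reverse=True)
    return min(candidates, key=cost)
```

For example, given candidate positions that would cover the scrolling news, the logo and program title, or the speaker's face, this rule would place the caption over the logo and program title. The lookup does not enforce the gaps between caption blocks that the Group 3 and Group 4 recommendations rely on; that timing would have to be handled separately.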
How do our findings inform captioning regulatory agencies?
Dr. Matt Huenerfauth
Dr. Akhter Al Amin
Dr. Sooyeon Lee
Max Shengelia
Saad Hassan
Acknowledgements
And there is more…
Velvet Howland
Recruiting 1-2 PhD Students at Tulane University
Contact Information: saadh.info
Design of robust and flexible human-AI systems to provide access to audio and visual information
Socio-technical challenges related to algorithmic discrimination and transparency in AI systems
Community experiences and perceptions of AI systems used in healthcare (co-advisees)