1 of 46

Sports Data Analysis and Visualization

IEEE Vis 2022 Tutorial

Romain Vuillemot

2 of 46

Outline

02:00-02:15 Welcome, goals of this workshop

02:15-02:35 Motivation, examples

02:35-02:55 Tools, sports datasets

02:55-03:15 Code demo: event-based visualization I

~~~half-time~~~

03:45-04:00 Code demo: event-based visualization II

04:00-04:30 Code demo: tracking data visualization

04:30-05:00 Wrap-up and Q&A

3 of 46

Welcome 👋

My name is Romain Vuillemot. I am an Assistant Professor in Computer Science at Ecole Centrale de Lyon, France, with more than 10 years of experience in sports data visualization. I teach a general infovis class and run visualization workshops with coaches and sports federations.

2012-2015 (Opta events data)
Charles Perin, Romain Vuillemot, Jean-Daniel Fekete. “SoccerStories: A Kick-off for Soccer Visual Analysis.” IEEE InfoVis 2013.

2016-2018 (SportsVU, Metrica data)
Gabin Rolland, Nathan Riviere, Romain Vuillemot, Wouter Bos. “A Study of Space and Time-Dependence of 3-Point Shot Efficiency in Basketball.” Accepted at the MIT Sloan Sports Analytics Conference.

2019-2024 (Data using Computer Vision)

4 of 46

Goal of this tutorial 🎯

The general goal of this tutorial is to help you get into sports data visualization and do simple things simply.

Simple yet recent examples of sports visualization
Sports are covered in general, with a focus on team sports and a spatial analysis approach (i.e., events with locations).

A bit of design space and the challenges usually encountered when working with sports data. However, we will not cover the full design space of sports visualization.

Understand all the technical steps, such as collecting data, what is needed for the analysis, and the metadata needed to communicate the visualization efficiently.

Build your first sports visualization
We will provide data and a step-by-step process, with design examples. It is a starting point that you can adapt to a different sport or design.

Not covered: design process, sports-specific statistics, historical data/prediction visualizations.

5 of 46

For research in sports data visualizations

“Sport Vis session” Thursday, October 20, 2022 (10:45 AM-12:00 PM)

Charles Perin, Romain Vuillemot, Charles D Stolper, John T Stasko, Jo Wood, et al. State of the Art of Sports Data Visualization. Computer Graphics Forum, Wiley, 2018, 37 (3), pp.1-24. https://hal.archives-ouvertes.fr/hal-01806107

IEEE VIS Anthology Vis by Hendrik Strobelt and Benjamin Hoover https://visav.vizhub.ai/

“Sports” papers

7 of 46

What is sports, visualization and analysis?

“Sports is an activity involving physical skills in which an individual or team competes against another or others for entertainment.”

Sports data should reflect such activity along with:

  • Spatial constraints (pitch, route, etc.)
  • Rules that regulate the plays, use of ball and accessories
  • Schedules (games, tournaments)

Events

Actions that occur during games (e.g., pass, goal)

Tracking data

Continuous positions of players and balls

Metadata

Description of the context of the game

Charles Perin, Romain Vuillemot, Charles D Stolper, John T Stasko, Jo Wood, et al. State of the Art of Sports Data Visualization. Computer Graphics Forum, Wiley, 2018, 37 (3), pp.1-24. https://hal.archives-ouvertes.fr/hal-01806107

8 of 46

Why get into sports visualization?

Work on an enjoyable topic
Sport is usually an enjoyable experience: you have a favorite player or team, you may practice the sport, or you may have recorded personal analytics.

Start discussions in a more objective way
Clubs and coaches need better analytics on games, for tactical or recruitment purposes.

Address difficult visualization challenges
This is why there are so many research papers at VIS and other CS conferences. Many interesting recent contributions involve Computer Vision.

Visibility of the Vis work to non-experts
Most people are familiar with practicing or watching sports, so they may relate to the work more easily.

Career in the sports analytics industry
There is a growing number of openings in the industry (and startups) and in the sports analytics departments of clubs.

9 of 46

Why is it (still) challenging?

Diversity of sports
There are more than 200 sports, each with its own rules, balls and accessories. They are also played in different regions and at different levels of practice.

Strong spatial domain
Space is needed to provide the full context of the game. Landmarks are also crucial for the visual representation. Video recordings may also provide full context (if available).

Sharing is difficult
Licences and copyright restrict sharing (especially for videos). Sport is also a competitive activity, so teams have little incentive to share data, tools and insights with opponents.

Data
Sports data are particularly messy and inconsistent. They are difficult to collect at high quality and in a uniform way (i.e., using a similar data model).

10 of 46

Examples of data domains

Charles Perin, Romain Vuillemot, Charles D Stolper, John T Stasko, Jo Wood, et al. State of the Art of Sports Data Visualization. Computer Graphics Forum, Wiley, 2018, 37 (3), pp.1-24. https://hal.archives-ouvertes.fr/hal-01806107

Wu, Yingcai, et al. "iTTVis: Interactive Visualization of Table Tennis Data." IEEE Transactions on Visualization and Computer Graphics 24.1 (2017): 709-718.

11 of 46

Examples of spatial domains

⚠️ normalization is needed
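
As a rough illustration of the normalization warning above, the sketch below rescales a point from one pitch coordinate system to another by going through the unit square. The function names and both pitch extents (a 120 x 80 provider grid and a 105 m x 68 m pitch) are assumptions chosen for the example, not values taken from a specific dataset.

```python
def normalize(x, y, width, height):
    """Map a point from a provider's pitch coordinates to the unit square."""
    return x / width, y / height


def rescale(x, y, src_size, dst_size):
    """Convert a point between two pitch systems that share origin and axes."""
    nx, ny = normalize(x, y, *src_size)
    return nx * dst_size[0], ny * dst_size[1]


# Hypothetical extents, for illustration only.
PROVIDER_SIZE = (120.0, 80.0)   # abstract provider grid
METRIC_SIZE = (105.0, 68.0)     # pitch size in meters

centre_spot = rescale(60.0, 40.0, PROVIDER_SIZE, METRIC_SIZE)
print(centre_spot)  # (52.5, 34.0)
```

This only handles scale. Real data may also differ in origin, axis direction (a flipped y axis is common), or attacking direction per half, and each of those needs its own correction.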

13 of 46

Examples of sports visualizations: shot maps

Kirk Goldsberry, “CourtVision: New Visual and Spatial Analytics for the NBA” MIT Sloan Conference 2012

14 of 46

Examples of sports visualizations: pitch control maps

Javier Fernandez, Luke Bornn. Wide Open Spaces: A statistical technique for measuring space creation in professional soccer. MIT Sloan Conference 2018

15 of 46

Examples of sports visualizations: density maps

Federer’s frequency of shots passing through a given point on the court
https://gamesetmap.com/?m=201302

16 of 46

Examples of sports visualizations: tactical maps

Analysis and takeaways from the recent Liverpool vs Manchester City game.
https://rattibha.com/thread/1446509537989562374

17 of 46

Sports visualization design space

Organized primarily from events to movement and then progressively to specific encodings. The focus will be on space-related visualizations.

  • Events: (type_event)
  • Positions: (x, y, t)
  • Positions + Events: (x, y, t, type_event)
  • (x_video, y_video)

20 of 46

Sports visualization design space

Organized primarily from events to movement and then progressively to specific encodings. The focus will be on space-related visualizations.

  • All, Binning, Derive, Connect, Animate
  • Events, Tracking + Events, Tracking data

22 of 46

Tools

  • Python modules
  • JavaScript libraries
  • Tableau Software
  • R
  • Commercial products

23 of 46

How to get sports data?

Paper box score cards
Anyone can print them and record statistics from games. They usually capture only coarse region information, and they take time to fill in and digitize.

24 of 46

How to get sports data?

<annotation>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>2006</width>
    <height>1504</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>AAA</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>185</xmin>
      <ymin>600</ymin>
      <xmax>695</xmax>
      <ymax>978</ymax>
    </bndbox>
  </object>
</annotation>

Annotation tools
They let you generate your own dataset from observations. Tools such as LabelImg provide primitives to annotate in 2D or 3D and assign labels, and they export XML or JSON.

LabelImg
https://github.com/heartexlabs/labelImg
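
To show how an annotation like the one on this slide can be turned into usable positions, here is a minimal sketch that reads the bounding boxes with Python's standard `xml.etree.ElementTree` module. The `read_boxes` helper and the fields of its output are assumptions made for the example. The XML is a shortened copy of the slide's annotation.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the annotation shown on the slide.
ANNOTATION = """
<annotation>
  <size><width>2006</width><height>1504</height><depth>3</depth></size>
  <object>
    <name>AAA</name>
    <bndbox><xmin>185</xmin><ymin>600</ymin><xmax>695</xmax><ymax>978</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return one dict per annotated object with its label, size and center."""
    root = ET.fromstring(xml_text.strip())
    img_w = int(root.findtext("size/width"))
    img_h = int(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
        boxes.append({
            "label": obj.findtext("name"),
            "size_px": (xmax - xmin, ymax - ymin),
            "center_px": (cx, cy),
            # Center relative to the image size, in [0, 1].
            "center_norm": (cx / img_w, cy / img_h),
        })
    return boxes


boxes = read_boxes(ANNOTATION)
print(boxes[0]["label"], boxes[0]["center_px"])  # AAA (440.0, 789.0)
```

The centers are still in image pixels. Mapping them onto the field requires a separate image-to-pitch transformation.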

25 of 46

How to get sports data?

Automated tracking system
Systems that include a pipeline of transformations from videos to player positions, using Computer Vision and/or Deep Learning. They are very well suited for tracking data (if the position on the field has been calculated along the way).

26 of 46

SportsVu Basketball dataset

Players and ball trajectories for 631 games from the 2015-2016 NBA season
https://github.com/sealneaward/nba-movement-data

27 of 46

LastRow data sample

28 of 46

StatsBomb data samples

30 of 46

Code demo

A step-by-step process using Observable notebooks

https://observablehq.com/d/2d8e4e58280353bf

Part 1: Events data (on the ball): everything that happens on the ball, but not the positions of all the players

Shots and goals during the 2019 FIFA Women's World Cup third-place play-off between England and Sweden

Part 2: Position data (of players): show trajectories, animation and a (simple) space occupation model

Second goal by Mané in Liverpool vs Newcastle on 14 September 2019

31 of 46

Code demo: Events chart

All shots, or all events of a particular type, across a season or during a particular time interval.

Case study: 2019 FIFA Women's World Cup third-place play-off between England and Sweden

Questions:

  • What is the overall distribution of events?
  • Who scored and from where?

https://www.youtube.com/watch?v=-Kkd7x-8VuI

32 of 46

Code demo: Events chart

All shots, or all events of a particular type, across a season or during a particular time interval.

Step 1: look at the data

Step 2: define the spatial reference

Step 3: plot and filter by event/time

All sorts of events (kickoff, ...)
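
The steps above can be sketched in a few lines of Python (the tutorial itself uses Observable notebooks). The records below are hand-written placeholders, not real match data, and the flat field names (`type`, `minute`, `team`, `x`, `y`) are assumptions. Real event feeds are usually nested and carry many more attributes.

```python
from collections import Counter

# Hypothetical, hand-written records for illustration only.
events = [
    {"type": "Kick Off", "minute": 0, "team": "Team A", "x": 60.0, "y": 40.0},
    {"type": "Pass", "minute": 3, "team": "Team B", "x": 70.5, "y": 22.0},
    {"type": "Shot", "minute": 11, "team": "Team B", "x": 108.0, "y": 36.0},
    {"type": "Shot", "minute": 31, "team": "Team A", "x": 104.0, "y": 45.0},
    {"type": "Pass", "minute": 50, "team": "Team A", "x": 35.0, "y": 60.0},
]


def overview(events):
    """Step 1: count events per type to get a first look at the data."""
    return Counter(e["type"] for e in events)


def select(events, event_type=None, start=0, end=90):
    """Step 3: keep events of one type within a time interval (in minutes)."""
    return [
        e for e in events
        if (event_type is None or e["type"] == event_type)
        and start <= e["minute"] <= end
    ]


print(overview(events))
first_half_shots = select(events, "Shot", 0, 45)
print([(e["team"], e["x"], e["y"]) for e in first_half_shots])
```

The selected `(x, y)` pairs are what gets drawn on the pitch once the spatial reference from Step 2 is defined.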

33 of 46

Code demo: Events chart (with binning)

Binning aggregates data within a particular region to reveal trends (and also helps when the data are not highly accurate).

Step 4: define a binning function

Step 5: filter by events
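
One possible binning function is a regular grid over the pitch, sketched below. The 120 x 80 pitch extent, the 6 x 4 grid and the sample points are assumptions for illustration. The notebook may use a different grid or hexagonal bins.

```python
from collections import Counter


def bin_index(x, y, pitch=(120.0, 80.0), bins=(6, 4)):
    """Return the (column, row) of the grid cell that contains (x, y)."""
    col = int(x / pitch[0] * bins[0])
    row = int(y / pitch[1] * bins[1])
    # Clamp so that points on the far edges fall in the last cell.
    col = max(0, min(col, bins[0] - 1))
    row = max(0, min(row, bins[1] - 1))
    return col, row


def bin_counts(points, pitch=(120.0, 80.0), bins=(6, 4)):
    """Count how many points fall in each grid cell."""
    return Counter(bin_index(x, y, pitch, bins) for x, y in points)


# Hypothetical shot locations, for illustration only.
shots = [(108.0, 36.0), (104.0, 45.0), (119.0, 41.0), (60.0, 40.0), (120.0, 80.0)]
counts = bin_counts(shots)
print(counts[(5, 2)])  # 2
```

Each count can then drive the color or size of its cell. Filtering by event type before binning gives one map per type.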

35 of 46

Code demo: Events chart (small multiples)

Another approach to separate data (players, time, teams) is to divide the display into small multiples.

Step 6: small multiples according to an attribute

⚠️ You need to preserve the pitch aspect ratio
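
A minimal sketch of a small-multiples layout that respects this constraint: each pitch is scaled uniformly to fit its cell and then centered, so it is never stretched. The helper names, the 120 x 80 pitch extent and the pixel sizes are assumptions for the example.

```python
def fit_pitch(cell_width, cell_height, pitch=(120.0, 80.0)):
    """Largest undistorted pitch that fits in a cell.

    Returns (scale, draw_width, draw_height, offset_x, offset_y) in pixels.
    """
    # One uniform scale for both axes keeps the aspect ratio.
    scale = min(cell_width / pitch[0], cell_height / pitch[1])
    draw_w, draw_h = pitch[0] * scale, pitch[1] * scale
    return scale, draw_w, draw_h, (cell_width - draw_w) / 2, (cell_height - draw_h) / 2


def small_multiples(keys, total_width, columns, cell_height, pitch=(120.0, 80.0)):
    """Place one pitch per key on a grid, row by row."""
    cell_w = total_width / columns
    layout = {}
    for i, key in enumerate(keys):
        col, row = i % columns, i // columns
        scale, w, h, ox, oy = fit_pitch(cell_w, cell_height, pitch)
        layout[key] = {
            "x": col * cell_w + ox,
            "y": row * cell_height + oy,
            "width": w,
            "height": h,
            "scale": scale,
        }
    return layout


# Hypothetical attribute values (for example, one pitch per player).
layout = small_multiples(["A", "B", "C"], total_width=600, columns=2, cell_height=300)
print(layout["C"])
```

Every entry has the same width-to-height ratio as the pitch itself, whatever the cell size.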

37 of 46

Code demo: positional charts

Why did Firmino pass the ball to Mané and not to Salah?

(Second goal by Mané in Liverpool vs Newcastle on 14 September 2019)

39 of 46

Code demo: positional charts

From tracking data: player positions on the field at a given time (or frame rate).

Step 1: look at the data

Step 2: define the spatial reference

Step 3: animate

Step 4: include simple occupation model
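
For Step 4, one simple occupation model assigns every point of the pitch to the closest player, which is a discrete Voronoi partition. The sketch below is an assumption about what "simple" could mean here and may differ from the notebook's model. The player tuples, pitch size and grid resolution are hypothetical.

```python
import math


def nearest_team(x, y, players):
    """Team of the player closest to (x, y); players are (team, x, y) tuples."""
    return min(players, key=lambda p: math.hypot(p[1] - x, p[2] - y))[0]


def occupation(players, pitch=(105.0, 68.0), cells=(42, 27)):
    """Share of grid cells whose center is closest to each team, for one frame."""
    cell_w, cell_h = pitch[0] / cells[0], pitch[1] / cells[1]
    counts = {}
    for i in range(cells[0]):
        for j in range(cells[1]):
            # Evaluate the model at the center of each cell.
            cx, cy = (i + 0.5) * cell_w, (j + 0.5) * cell_h
            team = nearest_team(cx, cy, players)
            counts[team] = counts.get(team, 0) + 1
    total = cells[0] * cells[1]
    return {team: n / total for team, n in counts.items()}


# Hypothetical positions for a single tracking frame, in meters.
frame = [
    ("home", 30.0, 20.0), ("home", 45.0, 50.0), ("home", 70.0, 34.0),
    ("away", 60.0, 15.0), ("away", 80.0, 40.0), ("away", 95.0, 30.0),
]
print(occupation(frame))
```

This model ignores speed, direction and ball position, which is what the more advanced models address.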

40 of 46

Go further. Advanced occupation models

Javier Fernandez, Luke Bornn. Wide Open Spaces: A statistical technique for measuring space creation in professional soccer. MIT Sloan Conference 2018

41 of 46

Go further. Include perspective to situate the visualizations in a picture/video
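
Drawing on top of a picture usually means mapping pitch coordinates to image pixels with a planar homography. The sketch below only applies such a matrix. The values of `H` are made up for the example. In practice the matrix would be estimated from at least four pitch landmarks matched in the image, for instance with a computer vision library.

```python
def project(point, H):
    """Apply a 3x3 homography H (nested lists, row-major) to a 2D pitch point."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Dividing by w produces the perspective effect.
    return u / w, v / w


# Hypothetical matrix, for illustration only.
H = [
    [8.0, 0.0, 100.0],
    [0.0, 4.0, 50.0],
    [0.0, 0.5, 1.0],
]

print(project((0.0, 0.0), H))   # (100.0, 50.0)
print(project((10.0, 2.0), H))  # (90.0, 29.0)
```

Projecting the pitch lines, player positions or a binned grid through the same matrix keeps the overlays aligned with the video frame.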

42 of 46

Outline

02:00-02:15 Welcome and goal of this workshop

02:15-02:35 Motivation, examples

02:35-02:55 Tools, sports datasets

02:55-03:15 Code demo: event-based visualization

~~~half-time~~~

03:45-04:15 Code demo: chain-based and tracking data visualization

04:15-04:30 Code demo: draw with image perspective

04:30-05:00 Wrap-up and Q&A

43 of 46

Future data display opportunities.

44 of 46

Future data acquisition opportunities.

45 of 46

Sports venues and journals.

workshops/conferences

computer vision/machine learning

journals

journal of quantitative analytics

46 of 46

Thank you!