Applied Data Analysis (CS401)
Maria Brbić
Lecture 3
Visualizing data
24 Sep 2025
Announcements
Feedback
Give us feedback on this lecture here: https://go.epfl.ch/ada2025-lec3-feedback
Uses for data visualization
Analysis: Support reasoning about information
Communication: Inform and persuade others
Decision making: Make it easier to evaluate potential courses of action
Data visualizations and plots
Basic stats
An unconventional example
“Garden of Eden”: 8 lettuces, each enclosed in its own airtight Plexiglas box and representing a major city. The ozone concentration in each box is controlled in real time to reflect the current pollution level in that city.
Static viz
Great for data exploration, developed throughout the last few centuries…
Interactive viz
More and more common when delivering the results (and also during exploration). New frameworks are the key enabler.
Want to learn more?
Dedicated course:
Today’s lecture
Part 1
Navigating the chart landscape
Chart selection
One variable: histograms
Histograms can tell you a lot about a single variable, discrete or continuous
Easy to recognize skewed distributions!
Smoothed histogram (a.k.a. kernel density estimate)
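Both views of a single variable can be computed in a few lines. The following is a minimal sketch using only the standard library; the function names, the bin count, the bandwidth, and the exponential toy sample are all illustrative assumptions, not taken from the slides:

```python
import math
import random

def histogram(values, num_bins):
    """Count values in equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value goes into the last bin instead of an extra one.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts

def gaussian_kde(values, x, bandwidth):
    """Kernel density estimate at x: average of Gaussian bumps centered on the data."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

random.seed(0)
sample = [random.expovariate(1.0) for _ in range(1000)]  # right-skewed toy data

counts = histogram(sample, num_bins=10)
density_near_zero = gaussian_kde(sample, 0.5, bandwidth=0.3)
density_in_tail = gaussian_kde(sample, 4.0, bandwidth=0.3)
```

For this right-skewed sample, the first bins hold most of the mass and the estimated density is much higher near zero than in the tail. The bandwidth plays the same role for the smoothed histogram as the bin width does for the plain one.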
One variable: box plots
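A box plot is drawn from a handful of summary statistics. The sketch below assumes the common convention of whiskers reaching to the most extreme points within 1.5 times the interquartile range; the slides do not state which convention they use. The function name and the toy data are illustrative:

```python
import statistics

def box_plot_summary(values):
    """Quartiles, whisker ends, and outliers under the 1.5 * IQR convention."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
```

Here the value 50 lies beyond the upper fence, so it would be drawn as an individual outlier point while the upper whisker stops at 9.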
Two variables: scatter plots
Scatter plots quickly expose the relationships between two variables
2D histograms
a.k.a. heatmap
Two variables: line plots
If the relationship is functional, i.e., there is one y-value per x-value (for instance, after binning and aggregating)
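Binning and aggregating turns a scatter of (x, y) points into one y-value per x-bin, which can then be drawn as a line. A minimal sketch, assuming equal-width bins and the mean as the aggregate (both are illustrative choices, as are the function name and the toy points):

```python
from collections import defaultdict
from statistics import mean

def bin_and_aggregate(points, bin_width):
    """Group (x, y) pairs into equal-width x-bins and average y in each bin."""
    groups = defaultdict(list)
    for x, y in points:
        groups[int(x // bin_width)].append(y)
    # One (bin center, mean y) pair per non-empty bin, sorted by x.
    return [((b + 0.5) * bin_width, mean(ys)) for b, ys in sorted(groups.items())]

points = [(0.2, 1.0), (0.8, 3.0), (1.1, 4.0), (1.9, 6.0), (2.5, 10.0)]
line = bin_and_aggregate(points, bin_width=1.0)
```

Other aggregates, such as the median, fit the same pattern and are less sensitive to outliers within a bin.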
> 2 variables: scatter plot matrix
> 2 variables: stacked plots
Here: 3 variables: stack index, height, color
Stack and color variables categorical, height variable continuous:
Color variable categorical, stack and height variables continuous:
Dimensionality reduction
One dataset, visualized 25 ways
http://flowingdata.com/2017/01/24/one-dataset-visualized-25-ways
“You must help the data focus and get to the point. Otherwise, it just ends up rambling about what it had for breakfast this morning and how the coffee wasn’t hot enough.”
Part 2
Principles and best practices
Instructive coffee table books by Edward Tufte
Perception of magnitudes
Which is brighter?
(134, 134, 134)
(144, 144, 144)
Just noticeable difference (JND)
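One way to quantify how different the two grays are is the relative change in luminance, the quantity that Weber's law relates to the just noticeable difference. The sketch below assumes the two triples are sRGB-encoded colors, which the slide does not state, and uses the standard sRGB transfer function and luminance weights; the function names are illustrative:

```python
def srgb_to_linear(channel):
    """Convert an 8-bit sRGB channel value to linear light in [0, 1]."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """Relative luminance of an sRGB color (weighted sum of linear channels)."""
    r, g, b = (srgb_to_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

dark = relative_luminance((134, 134, 134))
light = relative_luminance((144, 144, 144))

# Relative difference in luminance between the two grays.
weber_fraction = (light - dark) / dark
```

Whether a given relative difference is actually noticeable depends on viewing conditions and on whether the two patches are adjacent, so the number is a rough guide rather than a threshold.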
Compare area of circles
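When circles encode a quantity, the usual advice is to make the area, not the radius, proportional to the value, so the radius must grow with the square root. A minimal sketch with illustrative names and values:

```python
import math

def radius_for_value(value, area_per_unit=1.0):
    """Radius such that the circle's area, not its radius, is proportional to value."""
    return math.sqrt(value * area_per_unit / math.pi)

r_small = radius_for_value(10)
r_large = radius_for_value(40)

area_ratio = (math.pi * r_large ** 2) / (math.pi * r_small ** 2)
radius_ratio = r_large / r_small
```

A value four times as large gets four times the area but only twice the radius. Scaling the radius linearly instead would make the larger circle sixteen times the area and visually exaggerate the difference.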
Perception of magnitudes
Most accurate: Position
Length
Slope
Angle
Area
Volume
Least accurate: Color hue-saturation-density
Cleveland & McGill (1984): “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods”
Choose your axes wisely!
Time series of pageviews of the Wikipedia article about “Coronavirus”
(linear y-axis)
(logarithmic y-axis)
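The effect of the axis choice can be seen by computing where each value lands along the axis. The sketch below uses made-up pageview counts (not the actual Wikipedia data) and a hypothetical helper, `axis_position`:

```python
import math

def axis_position(value, vmin, vmax, scale="linear"):
    """Fraction of the axis length (0 = bottom, 1 = top) at which value is drawn."""
    if scale == "log":
        value, vmin, vmax = math.log10(value), math.log10(vmin), math.log10(vmax)
    return (value - vmin) / (vmax - vmin)

pageviews = [100, 1_000, 10_000, 1_000_000]  # made-up daily counts with one spike

linear = [axis_position(v, 100, 1_000_000) for v in pageviews]
logarithmic = [axis_position(v, 100, 1_000_000, scale="log") for v in pageviews]
```

On the linear axis, every value except the spike is squeezed into the bottom 1% of the plot. On the logarithmic axis, each factor of ten takes up the same vertical distance, so the pre-spike dynamics remain visible.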
[link]
Choose your axes wisely:
Visualizing heavy-tailed distributions
Linear x-axis
Logarithmic x-axis
Heavy-tailed data: power laws
Commercial break
Heavy-tailed data: power laws
(with binned x-axis)
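One common way to bin the x-axis for heavy-tailed data is to use logarithmically spaced bins and divide each count by the bin width. The sketch below assumes that approach (the slides do not say which binning they use) and draws toy data from a Pareto distribution; the function name and parameters are illustrative:

```python
import math
import random

def log_binned_density(values, bins_per_decade=2):
    """Histogram with logarithmically spaced bin edges, normalized to a density."""
    lo = math.floor(math.log10(min(values)))
    hi = math.ceil(math.log10(max(values)))
    num_bins = (hi - lo) * bins_per_decade
    edges = [10 ** (lo + i / bins_per_decade) for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for v in values:
        idx = min(int((math.log10(v) - lo) * bins_per_decade), num_bins - 1)
        counts[idx] += 1
    # Divide by bin width and sample size so that wide bins are not overweighted.
    # Each point is (geometric bin center, estimated density); empty bins are skipped.
    return [
        (math.sqrt(edges[i] * edges[i + 1]),
         counts[i] / (len(values) * (edges[i + 1] - edges[i])))
        for i in range(num_bins)
        if counts[i] > 0
    ]

random.seed(0)
sample = [random.paretovariate(1.5) for _ in range(10_000)]  # toy power-law data, x >= 1
density = log_binned_density(sample)
```

Plotted on log-log axes, these points fall roughly on a straight line for power-law data, and the wide bins in the tail average out the noise that equal-width bins would show there.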
POLLING TIME
P(x) := Pr{X >= x}
Heavy-tailed data: power laws
CCDF
(with binned x-axis)
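The CCDF, P(x) = Pr{X >= x}, can also be estimated directly from the sorted sample without any binning. A minimal sketch with an illustrative function name and toy data:

```python
def empirical_ccdf(values):
    """Return (x, P(x)) pairs with P(x) = Pr{X >= x} estimated from the sample."""
    n = len(values)
    ordered = sorted(values)
    ccdf = []
    for i, x in enumerate(ordered):
        # Keep only the first occurrence of each value: n - i samples are >= x.
        if i == 0 or x != ordered[i - 1]:
            ccdf.append((x, (n - i) / n))
    return ccdf

ccdf = empirical_ccdf([1, 1, 2, 3, 3, 3, 10])
```

The CCDF starts at 1 for the smallest value and decreases monotonically, which makes it less noisy in the tail than a histogram of the same data.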
Answer fast: which time series has a higher mean value?
Use consistent axes!
Label your axes!
THINK FOR A MINUTE:
How could we show details of both time series without using different y-axes?
(Feel free to discuss with your neighbor.)
Answer: use a logarithmic y-axis!
Which beer is more popular, Guinness or Paulaner?
Show data uncertainty!
“error bars”
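Error bars need an interval around each plotted estimate. One generic way to obtain it is a percentile bootstrap confidence interval for the mean, sketched below. The method, the function name, and the made-up ratings are assumptions for illustration; the slides do not say how their error bars were computed:

```python
import random
from statistics import mean

def bootstrap_ci(values, num_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(num_resamples)
    )
    tail = (1 - confidence) / 2
    lower = means[round(tail * num_resamples)]
    upper = means[round((1 - tail) * num_resamples) - 1]
    return lower, upper

ratings = [3.5, 4.0, 4.5, 3.0, 5.0, 4.0, 3.5, 4.5, 4.0, 2.5]  # made-up scores
low, high = bootstrap_ci(ratings)
```

The error bar for this group would then span from `low` to `high` around the sample mean. If the intervals of two groups overlap heavily, the plot alone does not support the claim that one is more popular than the other.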
Consider using small multiples!
Use colors consistently!
[link]
Media attention
Use colors wisely!
Choose colors based on the information you want to convey
Use colorblind-safe palettes!
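A simple way to get colors that are both colorblind-safe and consistent across plots is to fix one category-to-color mapping up front and reuse it everywhere. The sketch below uses the Okabe-Ito palette as one widely used colorblind-safe choice; the slides do not name a specific palette, and the helper `assign_colors` and the category names are illustrative:

```python
# Okabe-Ito palette: orange, sky blue, bluish green, yellow,
# blue, vermillion, reddish purple, black.
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
             "#0072B2", "#D55E00", "#CC79A7", "#000000"]

def assign_colors(categories, palette=OKABE_ITO):
    """Map each category to a fixed color; sorting makes the mapping reproducible."""
    unique = sorted(set(categories))
    if len(unique) > len(palette):
        raise ValueError("more categories than distinguishable colors")
    return dict(zip(unique, palette))

colors = assign_colors(["Paulaner", "Guinness", "Paulaner"])
```

Passing the same dictionary to every plot keeps each category the same color in all figures.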
Use data ink wisely! Avoid chart junk!
Use visual contrast
The good, the bad and the ugly
Which principles and best practices do these graphics violate?
Courtesy of viz.wtf
Part 3
A (small) selection of use cases
for data visualization
Use case:
Presenting scientific results
Multimodal data
Use case:
Data wrangling
Multimodal data
Explore further by using, e.g., color and a histogram of multiple populations
Use case:
Data wrangling
Weird data
Use case:
Data wrangling
[link]
NY Times interactive visualizations (recession/recovery 2014):
http://www.nytimes.com/interactive/2014/06/05/upshot/how-the-recession-reshaped-the-economy-in-255-charts.html
And 2014 “the year in interactive storytelling”:
http://www.nytimes.com/interactive/2014/12/29/us/year-in-interactive-storytelling.html?_r=0
NY Times graphics are a great source of best practices in viz (except for when they’re not…)
Use case:
Journalism
Use case:
Educating the public
Charles Joseph Minard, 1869: Napoleon’s march
According to Tufte: “It may well be the best statistical graphic ever drawn.”
5 variables: army size, location, dates, direction, temperature during retreat
Use case:
Give new perspectives
Tools
(remaining slides for your personal perusal)
Interactive toolkits: D3
D3 is one of the most widely used interactive visualization frameworks.
Note from the authors: D3 is intentionally a low-level system. During the early design of D3, we even referred to it as a "visualization kernel" rather than a "toolkit" or "framework".
Interactive toolkits: Vega
Vega is a “visualization grammar” developed on top of D3.js
It specifies graphics in JSON format.
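To give a feel for what such a JSON specification looks like, here is a sketch of a small bar chart built as a Python dictionary and serialized to JSON. It assumes the Vega version 5 schema, and the data values are made up for illustration:

```python
import json

vega_spec = {
    "$schema": "https://vega.github.io/schema/vega/v5.json",
    "width": 300,
    "height": 150,
    # Inline toy data; a real spec could load data from a URL instead.
    "data": [{
        "name": "table",
        "values": [
            {"category": "A", "amount": 28},
            {"category": "B", "amount": 55},
            {"category": "C", "amount": 43},
        ],
    }],
    # Scales map data values to pixel positions.
    "scales": [
        {"name": "x", "type": "band", "range": "width", "padding": 0.1,
         "domain": {"data": "table", "field": "category"}},
        {"name": "y", "type": "linear", "range": "height", "nice": True,
         "domain": {"data": "table", "field": "amount"}},
    ],
    "axes": [
        {"orient": "bottom", "scale": "x"},
        {"orient": "left", "scale": "y"},
    ],
    # One rectangle per data row, from the baseline (0) up to the amount.
    "marks": [{
        "type": "rect",
        "from": {"data": "table"},
        "encode": {"enter": {
            "x": {"scale": "x", "field": "category"},
            "width": {"scale": "x", "band": 1},
            "y": {"scale": "y", "field": "amount"},
            "y2": {"scale": "y", "value": 0},
        }},
    }],
}

spec_json = json.dumps(vega_spec, indent=2)
```

The resulting string can be pasted into a Vega renderer. Note how the spec declares what to draw (data, scales, axes, marks) rather than how to draw it, which is what makes it a grammar.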
Interactive toolkits: Vincent
Vincent is a Python-to-Vega translator.
Trivia question: Why is it called Vincent? Hint: Vincent + Vega = ?
Bokeh: another interactive viz library
Bokeh is an independent viz library focused more heavily on big-data visualization. It has both Python and Scala bindings.
Visualizing maps: Folium
More in tomorrow’s lab session!
> 2 variables: parallel-coordinates plots
Color, x, y
Color variable is categorical, others arbitrary
> 2 variables: radar charts
Heavy-tailed data: power laws
CCDF
Interactive chart design: simplifying
Use structure!
Gestalt psychology principles (1912)
A case for ugly visualizations
People instinctively gravitate to attractive visualizations, and they have a better chance of getting on the cover of a journal.
But does this conflict with the goals of visualization?
Guide your audience!
17th March (St. Patrick’s Day)