Week 5: Visual Encoding
Introduction to Data Visualization
W4995.003 Spring 2025
00 Quiz
01 Sparks
02 Design/Redesign Exercise Reprise
03 Readings
04 Marks & Channels
05 Expressiveness & Effectiveness
Quiz
5 min
Closed book
02
Redesigns from Two Weeks Ago
Analyze and Re-design #1: California Wildfires
BuzzFeed, Peter Aldhous
Analyze and Re-design #2: Basketball
FlowingData, Nathan Yau
Analyze and Re-design #3: Global Middle Class
Washington Post
Analyze and Re-design #4: U.S. Total Tax Rate
NYTimes Opinion
Analyze and Re-design #5: American Job Incomes
Nathan Yau
03
Readings
Visible change: +783%
Raw: +9.5 mpg
Proportional: +52%
Size of effect in data
Size of effect in graphic
18
27.5
Visible change: +52%
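The numbers on this slide can be checked with a few lines of arithmetic. This is a minimal sketch in Python, assuming Tufte's definition of the lie factor (size of effect in graphic divided by size of effect in data) and taking the 18 and 27.5 mpg endpoints and the +783% visible change from the slide. The function names are illustrative.

```python
def proportional_change(first: float, second: float) -> float:
    """Relative change from first to second, as a fraction (0.5 means +50%)."""
    return (second - first) / first


def lie_factor(effect_in_graphic: float, effect_in_data: float) -> float:
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_in_graphic / effect_in_data


# Values from the slide: the data go from 18 to 27.5 mpg (raw change +9.5),
# while the drawn element changes by +783%.
data_effect = proportional_change(18, 27.5)
graphic_effect = 7.83

factor = lie_factor(graphic_effect, data_effect)
print(f"data effect: {data_effect:+.1%}")
print(f"lie factor: {factor:.1f}")
```

A lie factor of 1 means the graphic changes in proportion to the data, which is the case when the visible change matches the proportional change.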
Visual language: Nouns
Visual language: Adjectives
The Creative Process™
Color: A brief history
4535 Time Magazine Covers, 1923-2009
The Top Grossing Film of All Time, 1 × 1
by Jason Salavon (2000)
04
Marks & Channels
Data Types
Visual Encoding
Perceptual Properties
Marks & Channels
Today!
Lecture 7
Marks
Basic graphical elements in image
(a.k.a. forms & geometry)
Point (0D)
Line (1D)
Area (2D)
Graphics from Munzner. Visualization Analysis and Design (2015)
Channels
Ways to vary the appearance of marks
Graphics from Munzner. Visualization Analysis and Design (2015)
Tilt
Color
Shape
Position
Size
Example
Munzner. Visualization Analysis and Design (2015)
Axes: Q, C
Mark: line
Axes: Q, Q
Mark: point
Axes: Q, Q
Mark: point
Channels:
Color: C
Axes: Q, Q
Mark: point
Channels:
Color: C
Size: Q
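The four example charts can be written down as small mark-plus-channels records. The sketch below is one hypothetical way to do that in Python. The `Encoding` class is illustrative and is not Munzner's notation, and which axis carries which field is an assumption read off the slide labels (Q = quantitative, C = categorical).

```python
from dataclasses import dataclass, field


@dataclass
class Encoding:
    """A chart described as one mark type plus a channel -> data type mapping."""
    mark: str
    channels: dict[str, str] = field(default_factory=dict)

    def describe(self) -> str:
        parts = [f"{channel}: {dtype}" for channel, dtype in self.channels.items()]
        return f"Mark: {self.mark}; Channels: " + ", ".join(parts)


# Assumption: the two axis labels on each chart are its two position channels.
examples = [
    Encoding("line", {"position_vertical": "Q", "position_horizontal": "C"}),
    Encoding("point", {"position_vertical": "Q", "position_horizontal": "Q"}),
    Encoding("point", {"position_vertical": "Q", "position_horizontal": "Q",
                       "color": "C"}),
    Encoding("point", {"position_vertical": "Q", "position_horizontal": "Q",
                       "color": "C", "size": "Q"}),
]

for example in examples:
    print(example.describe())
```

Each added channel lets the same point mark carry one more data field, which is the progression the example slides walk through.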
Note: Area as Mark vs. Area as Channel
# House Representatives
# House Representatives
State GDP
Area marks should not be size- or shape-encoded
“States” as marks (by shape) already have an implied and accepted size, so it’s difficult to tell if they’re being enlarged or shrunk (cf. dots).
Be wary of scaled pictograms for similar reasons.
Lie factor:
Are we comparing height or area here?
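The height-versus-area ambiguity can be made concrete with a little arithmetic. This is a minimal sketch, assuming a pictogram is scaled uniformly (width and height by the same factor), so its area grows with the square of the height ratio. The function names and the sample ratio are illustrative.

```python
import math


def area_ratio_for_uniform_scale(value_ratio: float) -> float:
    """Area ratio the viewer sees when height is scaled by value_ratio
    and width is scaled along with it."""
    return value_ratio ** 2


def scale_for_true_area(value_ratio: float) -> float:
    """Height (and width) scale factor that makes area proportional to the data."""
    return math.sqrt(value_ratio)


ratio = 3.0  # hypothetical: one value is three times the other
print(area_ratio_for_uniform_scale(ratio))  # 9.0
print(round(scale_for_true_area(ratio), 3))  # 1.732
```

A reader who compares heights sees a 3x difference, while a reader who compares areas sees 9x, so the same pictogram supports two different readings.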
Area marks should not be size- or shape-encoded
Tufte: Don’t use two visual dimensions to represent a single data dimension
Treemap: Area already used as a channel (mkt. cap), so size/shape cannot be additionally encoded
Exception: value-by-area maps
“Country” as a mark has an accepted size, especially in a map context.
Encoding size to another value highlights the difference from your expectation.
Via NYT 2013
Mark: Area
Channels:
Country ~ Shape
GDP (Q) ~ Size
GDP Growth (O) ~ Color
Center Longitude ~ Pos. X
Center Latitude ~ Pos. Y
Via NYT 2013
Deconstruct: Ebb and Flow of... Box Office Receipts
Via NYT 2008
Ebb and Flow of... Box Office Receipts
Mark: Area
Channels:
Time (Week) ~ Position X
Weekly Revenue ~ Height/Length along Y
Gross Box Office ~ Color
Gross Box Office ~ Area
Visual Encoding = Mapping data to visual variables
Assign data fields (Q, O, C) to visual channels (x, y, color, size, etc.) for a graphical mark (point, bar, line, etc.)
But the combinatorial space is so large, how do you choose?
…
To maximize expressiveness and effectiveness.
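To get a feel for how large the design space is, one can simply count candidate encodings. This is a rough sketch, assuming the three marks and five channels named earlier in these slides, each field assigned to a distinct channel, and no filtering for effectiveness. The three field names are hypothetical.

```python
from itertools import permutations

MARKS = ["point", "line", "area"]
CHANNELS = ["position", "size", "color", "shape", "tilt"]


def candidate_encodings(fields):
    """Yield (mark, {field: channel}) for every assignment of distinct channels."""
    for mark in MARKS:
        for chosen in permutations(CHANNELS, len(fields)):
            yield mark, dict(zip(fields, chosen))


fields = ["price", "rating", "region"]  # hypothetical data fields
candidates = list(candidate_encodings(fields))

# 3 marks x (5 * 4 * 3) ordered channel choices = 180 candidates
print(len(candidates))
```

Even this toy count ignores splitting position into x and y, color into hue and saturation, and so on, which is why a principled way of ranking candidates is needed.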
05
Expressiveness & Effectiveness
Expressiveness (Mackinlay 1986)
A set of facts is expressible in a visual language if the sentences (i.e. the visualizations) in the language express all the facts in the set of data, and only the facts in the data.
Example: Iris database
Fails to express all the facts
Mackinlay 1986, Automating the Design of Graphical Presentations of Relational Information
Expresses facts not in the data
Mackinlay 1986, Automating the Design of Graphical Presentations of Relational Information
Effectiveness (Mackinlay 1986)
One visualization is more effective than another if the information conveyed is more readily perceived than the information in the other visualization.
In other words:
Expressiveness
Tell the truth, the whole truth, and nothing but the truth (i.e., don’t lie, and don’t lie by omission)
Effectiveness
Use encodings that people can decode more quickly, accurately, and easily
Via Jeffrey Heer
Remember this from Week 1?
Compare length of bars
Via Jeffrey Heer
We perceive length much more precisely than area
Via Jeffrey Heer
Which is larger, A or C?
Cleveland & McGill: Ranking Accuracy, 1984
Cleveland and McGill, 1984 Journal of the American Statistical Association
Cleveland & McGill: Ranking Accuracy, 1984
Graphics from Munzner, redrawn from Cleveland and McGill, 1984.
Unframed, unaligned
Framed, unaligned
Framed, aligned
History: Bertin “Retinal Variables”, 1967...
Redrawn by Mike Bostock https://medium.com/@mbostock/introducing-d3-scale-61980c51545f
...Cleveland & McGill: Accuracy for Quantitative Perception, 1984
Cleveland and McGill, 1984 Journal of the American Statistical Association
Bostock & Heer’s replication in 2010
Healy, Data Visualization: A practical introduction
Today: via Munzner
Munzner. Visualization Analysis and Design (2015)
But: Channels × Data Type combos...
Graphics from Munzner. Visualization Analysis and Design (2015)
Channels: Tilt, Color, Shape, Position, Size
×
Data types: Ordinal, Categorical, Quantitative
Exercise: Color × Quantitative?
Exercise: Color × Categorical?
Exercise: Tilt × Categorical?
Not all Channels work (equally) for all Data Types
Moritz Stefaner. Project Ukko
Mackinlay Effectiveness Rankings, 1986 (most effective at top)
Quantitative       Ordinal            Categorical
Position           Position           Position
Length             Density (Value)    Color Hue
Angle              Color Sat          Texture
Slope              Color Hue          Connection
Area (Size)        Texture            Containment
Volume             Connection         Density (Value)
Density (Value)    Containment        Color Sat
Color Sat          Length             Shape
Color Hue          Angle              Length
Texture            Slope              Angle
Connection         Area (Size)        Slope
Containment        Volume             Area (Size)
Shape              Shape              Volume
Mackinlay 1986
Munzner. Visualization Analysis and Design (2015)
Prioritize: What’s the most important thing you want to say? Map that to the most effective encoding.
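The prioritization rule can be sketched as a greedy assignment over the effectiveness rankings listed above: take the fields in order of importance and give each one the highest-ranked channel for its data type that is still free. This is a simplified illustration and not Mackinlay's actual algorithm. The function name, the example fields, and the choice to let Position be used twice (x and y) are assumptions.

```python
RANKINGS = {
    "Q": ["Position", "Length", "Angle", "Slope", "Area (Size)", "Volume",
          "Density (Value)", "Color Sat", "Color Hue", "Texture",
          "Connection", "Containment", "Shape"],
    "O": ["Position", "Density (Value)", "Color Sat", "Color Hue", "Texture",
          "Connection", "Containment", "Length", "Angle", "Slope",
          "Area (Size)", "Volume", "Shape"],
    "C": ["Position", "Color Hue", "Texture", "Connection", "Containment",
          "Density (Value)", "Color Sat", "Shape", "Length", "Angle",
          "Slope", "Area (Size)", "Volume"],
}

# Assumption: Position is available twice (x and y); every other channel once.
CAPACITY = {"Position": 2}


def assign_channels(fields):
    """fields: list of (name, data_type) pairs, most important first.
    Returns {name: channel} using the best channel still available."""
    used = {}
    assignment = {}
    for name, dtype in fields:
        for channel in RANKINGS[dtype]:
            if used.get(channel, 0) < CAPACITY.get(channel, 1):
                used[channel] = used.get(channel, 0) + 1
                assignment[name] = channel
                break
    return assignment


# Hypothetical fields, ordered by how much they matter to the story.
fields = [("temperature", "Q"), ("month", "O"),
          ("station", "C"), ("rainfall", "Q")]
print(assign_channels(fields))
```

With this ordering the two most important fields take the two position channels, the categorical field falls back to color hue, and the last quantitative field gets length. Reordering the fields changes which one gets the strongest encoding.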
Summary
Choose the mark and channels that maximize the effectiveness of your data story.
Munzner. Visualization Analysis and Design (2015)
03.1
Deconstruct: Napoleon’s March
Via Roger Peng, Johns Hopkins Biostatistics
Minard, 1869.
What are the data components?
Longitude (Q) ~ Position X
Latitude (Q) ~ Position Y
Army size (Q) ~ Width of Line
Direction (C) ~ Color
Landmarks (C) ~ Labels (cities, rivers)
What are the data components?
Longitude (Q) / Time Reversed (O) ~ Position X
Temperature (Q) ~ Position Y
Dataset: Temperature
lon    temp   date
37.6     0    18 Oct 1812
36.0     0    24 Oct 1812
33.2    -9    09 Nov 1812
32.0   -21    14 Nov 1812
29.2   -11    24 Nov 1812
28.5   -20    28 Nov 1812
27.2   -24    01 Dec 1812
26.7   -30    06 Dec 1812
25.3   -26    07 Dec 1812
Dataset: Army
lon     lat     size     dir   grp
24.0    54.9    340000    1    1
24.5    55.0    340000    1    1
25.5    54.6    340000    1    1
26.0    54.7    320000    1    1
…
37.65   55.65   100000   -1    1
37.45   55.62    98000   -1    1
37.0    55.0     97000   -1    1
36.8    55.0     96000   -1    1
…
24.0    55.1     60000    1    2
24.5    55.2     60000    1    2
25.5    54.7     60000    1    2
…
Dataset: Cities
lon    lat    city
24.0   55.0   Kowno
25.3   54.7   Wilna
26.4   54.4   Smorgoni
26.8   54.3   Molodexno
27.7   55.2   Gloubokoe
27.6   53.9   Minsk
28.5   54.3   Studienska
28.7   55.5   Polotzk
29.2   54.4   Bobr
30.2   55.3   Witebsk
30.4   54.5   Orscha
30.4   53.9   Mohilow
32.0   54.8   Smolensk
…
Single-axis composition along longitude
Longitude (Q)
Latitude (Q)
Army size (Q)
Direction (C)
Landmarks (C)
Temperature (Q)
Longitude (Q) / Time (O)
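The army encoding (longitude and latitude on position, army size on line width, direction on color) can be sketched as a plain mapping function. The rows below are a few taken from the Army dataset slide and are not contiguous. The maximum stroke width, the color names, and the reading of dir = 1 as advance and dir = -1 as retreat are assumptions for illustration.

```python
# A few rows from the Army dataset slide: (lon, lat, size, dir, grp)
ARMY = [
    (24.0, 54.9, 340000, 1, 1),
    (24.5, 55.0, 340000, 1, 1),
    (25.5, 54.6, 340000, 1, 1),
    (26.0, 54.7, 320000, 1, 1),
    (37.65, 55.65, 100000, -1, 1),
    (37.45, 55.62, 98000, -1, 1),
]

MAX_SIZE = 340000     # largest army size among the rows above
MAX_WIDTH_PX = 40.0   # hypothetical widest stroke, in pixels


def encode(row):
    """Map one data row to visual channel values."""
    lon, lat, size, direction, _grp = row
    return {
        "x": lon,                                  # Longitude (Q) ~ Position X
        "y": lat,                                  # Latitude (Q) ~ Position Y
        "width": MAX_WIDTH_PX * size / MAX_SIZE,   # Army size (Q) ~ Width of line
        "color": "tan" if direction == 1 else "black",  # Direction (C) ~ Color
    }


for row in ARMY:
    print(encode(row))
```

The width scale starts at zero so that stroke width stays proportional to army size, which keeps the size of the effect in the graphic equal to the size of the effect in the data.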
2018 Midterm Election Forecast
district   state   D      R      C       "lean"   incumb   "flip?"
1st        NY      11.2   88.8   0.001   (calc)   R        (calc)
…
Mark: point
Channels:
lean (O) ~ color
flip? (C) ~ texture (i.e. stripes)
geoXY ~ positionXY (prioritizes adjacency)
Questions?
Next Week…
Topics Next Week
Checklist For Next Week
Tips for Teamwork in Observable
Tips for EDA in Observable
💡 you can generate Plot code from your chart cells
A4: Explore > Sketch > Build in Observable
AirBnB Quantified by Kelli Anderson, Via Steve Heller, Infographic Sketchbooks.