1 of 90

Join at slido.com #1196901


2 of 90

Visualization II

KDEs, Transformations, and Visualization Theory


LECTURE 8

Data 100/Data 200, Spring 2024 @ UC Berkeley

Narges Norouzi and Joseph E. Gonzalez


3 of 90

Goals for this Lecture

Lecture 8, Data 100 Spring 2024

  • Learning more visualization functions
  • Visualizing relationships
  • Transforming data to convey a clearer message in a visualization
  • Visualization theory and information channels


4 of 90

Where Are We?

The data science lifecycle: Question & Problem Formulation → Data Acquisition → Exploratory Data Analysis → Prediction and Inference → Reports, Decisions, and Solutions.

Part I: Processing Data

  • Data Wrangling
  • Intro to EDA
  • Working with Text Data
  • Regular Expressions

Part II: Visualizing and Reporting Data (today)

  • Plots and variables
  • Seaborn
  • Viz principles
  • KDE/Transformations

5 of 90

Agenda

  • Kernel Density Estimation
  • Plotting Distributions - Revisited
  • Relationships between Quantitative Variables
    • Transformations
  • Visualization Theory
    • Information Channels
    • Harnessing X/Y
    • Harnessing Color
    • Harnessing Markings
    • Harnessing Conditioning
    • Harnessing Context


6 of 90

Kernel Density Estimation


7 of 90

Kernel Density Estimation: Intuition

Often, we want to identify general trends across a distribution, rather than focus on detail. Smoothing a distribution helps generalize the structure of the data and eliminate noise.


A KDE curve

Idea: approximate the probability distribution that generated the data.

  • Assign an “error range” to each data point in the dataset – if we were to sample the data again, we might get a different value.
  • Sum up the error ranges of all data points.
  • Scale the resulting distribution to integrate to 1.


8 of 90

Kernel Density Estimation: Process


Idea: Approximate the probability distribution that generated the data.

  • Place a kernel at each data point.
  • Normalize kernels so that total area = 1.
  • Sum all kernels together.

A kernel is a function that tries to capture the randomness of our sampled data.

A datapoint in our dataset

The kernel models the probability of us sampling that datapoint.

Area below integrates to 1


9 of 90

Step 1️⃣ – Place a Kernel at Each Data Point

Consider a fake dataset with just five collected datapoints.

  • Place a Gaussian kernel with bandwidth of alpha = 1.
  • We will precisely define both the Gaussian kernel and bandwidth in a few slides.

Each line represents a datapoint in the dataset (e.g. one country’s HIV rate).

Place a kernel on top of each datapoint.

sns.rugplot(points, height=0.5)


10 of 90

Step 2️⃣ – Normalize Kernels

In Step 3, we will sum these kernels to produce a probability distribution.

  • We want the result to be a valid probability distribution that has area 1.
  • We have 5 different kernels, each with an area 1.
  • So, we normalize by multiplying each kernel by ⅕.

Each kernel has area 1.

Each normalized kernel has area ⅕.

11 of 90

Step 3️⃣ – Sum the Normalized Kernels

At each point in the distribution, add up the values of all kernels. This gives us a smooth curve with area 1 – an approximation of a probability distribution!


Sum these five normalized curves together.

The final KDE curve.
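
The three steps can be sketched in NumPy as follows. This is a minimal illustration, not the lecture's own code: the five data values, the evaluation grid, and the helper name `gaussian_kernel` are assumptions chosen for the example, and the areas are approximated with a simple Riemann sum.

```python
import numpy as np

def gaussian_kernel(x, x_i, alpha):
    """Gaussian density centered at x_i with standard deviation alpha."""
    return np.exp(-((x - x_i) ** 2) / (2 * alpha ** 2)) / np.sqrt(2 * np.pi * alpha ** 2)

# Hypothetical five-point dataset (illustrative values, not from the lecture).
points = np.array([2.0, 3.0, 4.0, 6.0, 6.5])
alpha = 1.0
x = np.linspace(-5.0, 15.0, 2001)

# Step 1: place a kernel at each data point (one row per data point).
kernels = np.array([gaussian_kernel(x, x_i, alpha) for x_i in points])

# Step 2: normalize by multiplying each kernel by 1/n.
normalized = kernels / len(points)

# Step 3: sum the normalized kernels to get the KDE curve.
kde = normalized.sum(axis=0)

# Riemann-sum approximations of the areas under each kernel and under the KDE.
dx = x[1] - x[0]
kernel_areas = kernels.sum(axis=1) * dx
kde_area = kde.sum() * dx
print(kernel_areas.round(3), round(kde_area, 3))
```

Each individual kernel and the final curve should both have area close to 1, which is why the 1/n scaling in Step 2 is needed.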


12 of 90

Result

  • A summary of the distribution using KDE.

Each line represents a datapoint in the dataset (e.g. one country’s HIV rate).

The density at each point corresponds to the KDE calculated from the kernels placed on all data points.

13 of 90

Summary of KDE

The general KDE formula is:

$$\hat{f}_\alpha(x) = \frac{1}{n} \sum_{i=1}^{n} K_\alpha(x, x_i)$$

  • K𝝰(x, xi) is the kernel function centered on observation i.
    • Each kernel individually has area 1.
    • K represents our kernel function of choice. We’ll talk about the math of these functions soon.

[Figure: the formula annotated with steps 1️⃣, 2️⃣, and 3️⃣, alongside example kernels K1(x, 2) and K1(x, 6).]

14 of 90

Summary of KDE

The general KDE formula is $\hat{f}_\alpha(x) = \frac{1}{n} \sum_{i=1}^{n} K_\alpha(x, x_i)$, where:

  • K𝝰(x, xi) is the kernel centered on the observation i.
    • Each kernel individually has area 1.
    • x represents any number on the number line. It is the input to our function.
  • n is the number of observed data points that we have.
    • We multiply by 1/n to normalize the kernels so that the total area of the KDE is still 1.
  • Each xi (x1, x2, …, xn) represents an observed data point. We sum the kernels for each datapoint to create the final KDE curve.

𝝰 is the bandwidth or smoothing parameter.


15 of 90

Kernels

A kernel (for our purposes) is a valid density function, meaning:

  • It must be non-negative for all inputs.
  • It must integrate to 1 (area under curve = 1).

Memorizing this formula is less important than knowing the shape and how the bandwidth parameter 𝝰 smooths the KDE.

The most common kernel is the Gaussian kernel.

  • Gaussian = Normal distribution = bell curve.
  • Here, x represents any input, and xi represents the ith observed value (datapoint).
  • Each kernel is centered on our observed values (and so its distribution mean is xi).
  • 𝝰 is the bandwidth parameter. It controls the smoothness of our KDE. Here, it is also the standard deviation of the Gaussian.
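
The formula referred to on this slide is not reproduced in this text. For reference, the standard normal density with mean xi and standard deviation 𝝰, which matches the description above, can be written as:

```latex
K_\alpha(x, x_i) = \frac{1}{\sqrt{2\pi\alpha^2}} \exp\left(-\frac{(x - x_i)^2}{2\alpha^2}\right)
```

It is non-negative everywhere and integrates to 1, so it satisfies both requirements for a kernel.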


16 of 90

Effect of Bandwidth on KDEs

Bandwidth is analogous to the width of each bin in a histogram.

  • As 𝝰 increases, the KDE becomes more smooth.
  • Large 𝝰 KDE is simpler to understand, but gets rid of potentially important distributional information (e.g. multimodality).
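
One way to see the loss of multimodality is to count the peaks of a hand-computed Gaussian KDE at two bandwidths. The sketch below is only an illustration: the two-cluster data values, the grid, and the helper names `kde` and `count_modes` are assumptions made for this example.

```python
import numpy as np

def kde(x, points, alpha):
    """Average of Gaussian kernels (std. dev. alpha) centered on each point."""
    diffs = x[:, None] - points[None, :]
    kernels = np.exp(-diffs ** 2 / (2 * alpha ** 2)) / np.sqrt(2 * np.pi * alpha ** 2)
    return kernels.mean(axis=1)

def count_modes(y):
    """Count strict interior local maxima of a sampled curve."""
    return int(np.sum((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])))

# Hypothetical data with two clusters (illustrative values only).
points = np.array([1.0, 1.5, 2.0, 7.0, 7.5, 8.0])
x = np.linspace(-10.0, 20.0, 3001)

modes_small = count_modes(kde(x, points, alpha=0.5))
modes_large = count_modes(kde(x, points, alpha=5.0))
print(modes_small, modes_large)
```

With these values, the narrow bandwidth keeps the two clusters as separate peaks, while the wide bandwidth merges them into a single bump.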


17 of 90

Other Kernels: Boxcar

As an example of another kernel, consider the boxcar kernel.

  • It assigns uniform density to points within a “window” of the observation, and 0 elsewhere.
  • Resembles a histogram… sort of.

  • Not of any practical use in Data 100! Presented as a simple theoretical alternative.


A boxcar kernel centered on xi = 4 with 𝝰 = 2.
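
A boxcar kernel can be sketched as below. The exact definition is an assumption here (the text does not give the formula): 𝝰 is taken to be the full width of the window, so the density is 1/𝝰 within 𝝰/2 of the observation. The function name `boxcar_kernel` is likewise hypothetical.

```python
import numpy as np

def boxcar_kernel(x, x_i, alpha):
    """Uniform density 1/alpha within alpha/2 of x_i, and 0 elsewhere (assumed definition)."""
    return np.where(np.abs(x - x_i) <= alpha / 2, 1.0 / alpha, 0.0)

# The example from the slide: centered on x_i = 4 with alpha = 2.
x = np.linspace(0.0, 8.0, 8001)
density = boxcar_kernel(x, x_i=4.0, alpha=2.0)

# Riemann-sum approximation of the area under the kernel.
area = density.sum() * (x[1] - x[0])
print(density.max(), round(area, 2))
```

Under this definition the kernel is a flat box of height 1/2 between 3 and 5, so its area is 1 as required.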


18 of 90

Which of the following are valid kernel density plots?


19 of 90

Plotting Distributions - Revisited


20 of 90

displot

displot is a wrapper for histplot, kdeplot, and ecdfplot to plot distributions.

sns.displot(data=wb, x="gni", kind="hist", stat="density")

sns.displot(data=wb, x="gni", kind="kde")

sns.displot(data=wb, x="gni", kind="ecdf")

ECDF: Empirical Cumulative Distribution Function
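
For intuition, the ECDF at a value x is the fraction of observations less than or equal to x. A minimal sketch, using a made-up sample and a hypothetical helper named `ecdf` (this is not how seaborn computes it internally):

```python
import numpy as np

def ecdf(values, x):
    """Fraction of observations in `values` that are <= x."""
    values = np.asarray(values)
    return np.mean(values <= x)

sample = [1, 2, 2, 3, 5]  # hypothetical observations
print(ecdf(sample, 2), ecdf(sample, 4), ecdf(sample, 5))
```

The curve drawn by `kind="ecdf"` is this function evaluated across the range of the data, so it rises from 0 to 1.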


21 of 90

Relationships between Quantitative Variables


22 of 90

From Distributions to Relationships

Up until now, we focused exclusively on visualizing variable distributions.

Now we will visualize relationships between variables. In other words, how do sets of two (or more) variables vary in relation to one another?


23 of 90

Scatter Plots

Scatter plots are used to reveal relationships between pairs of numerical variables.

  • Visual assessment may help us decide how to model these relationships.

  • Example: Linear model
    • Linear Regression (Data 8)
    • Good for the left two, not so much for the right two.
  • Reminder: "Correlation does not imply causation." A linear relationship is a mathematical one.

[Figure: four example scatter plots labeled simple linear; simple nonlinear; linear, spreading (relationship appears linear, but with increasing spread as x gets larger); and v-shaped.]


24 of 90

Scatter Plots

Scatter plots are used to reveal relationships between two quantitative variables.

  • Plot one quantitative continuous variable on the x-axis and a second quantitative continuous variable on the y-axis.
  • Each scatter point represents one datapoint in the dataset.

plt.scatter(x_values, y_values)

sns.scatterplot(data=df, x="x_column", y="y_column", hue="hue_column")


25 of 90

Overplotting

The plot on the previous slide suffered from overplotting – scatter points all stacked on top of one another are difficult to see.

Jittering: adding a small amount of random noise to all x and y values to slightly move each scatter point. Main trends are still present, but individual datapoints are easier to distinguish.

x_noise = np.random.uniform(-1, 1, len(wb))
y_noise = np.random.uniform(-5, 5, len(wb))
plt.scatter(wb['% growth'] + x_noise, wb['Literacy rate: Female'] + y_noise, s=15);

Decreasing point size also helps. s specifies the marker size in Matplotlib.


26 of 90

Scatter Plot Alternatives

Seaborn includes several built-in functions for making more complex scatter plots.

sns.lmplot(data=df, x="x_column", y="y_column")

sns.jointplot(data=df, x="x_column", y="y_column")


27 of 90

Hex Plots

Rather than plot individual datapoints, plot the density of their joint distribution.

Can be thought of as a two dimensional histogram.

  • The xy plane is binned into hexagons.
  • More heavily shaded hexagons typically indicate greater density/frequency: more datapoints lie in that spot.

sns.jointplot(data=df, x="x_column", y="y_column", kind="hex")


28 of 90

Contour Plots

2-dimensional version of a KDE plot.

Similar to a topographic map – contour lines represent an area that has the same density of datapoints throughout. Darker colors indicate more datapoints in the region.


sns.kdeplot(data=df, x="x_column", y="y_column", fill=True)

Dark color → many datapoints


29 of 90

Summary

  • Visualization requires a lot of thought!
  • Many tools for visualizing distributions.
    • Distribution of a single variable: rug plot, histogram, density plot, box, violin.
    • Joint distribution of two quantitative variables: scatter plot, hex plot, contour plot.
  • This class primarily uses seaborn and matplotlib.
    • Pandas also has basic built-in plotting methods.
    • Many other visualization libraries exist. plotly is one of them.
      • It very easily creates interactive plots.
      • plotly will occasionally appear in lecture code, labs, and assignments!

Next, we’ll go deeper into the theory behind visualization.


30 of 90

Transformations


31 of 90

Visualization Theory

Remember our goals of visualization:

  1. To help your own understanding of your data/results.
  2. To communicate results/conclusions to others.

These are influenced by our choice of visualization and our choices in how to prepare data for visualization.


What problems are there here?

  • Data is "smushed" – hard to interpret, even if we jittered.
  • Difficult to generalize a clear relationship between the variables.

We often transform a dataset to help prepare it for being visualized.


32 of 90

Linearization

When applying transformations, we often want to linearize the data – rescale the data so the x and y variables share a linear relationship.


Why?

  • Linear relationships are simple to interpret – we know how to work with slopes and intercepts to understand how two variables are related.
  • Next week, we will start building linear models, which are more effective with linearized data.

33 of 90

Applying Transformations

What makes this plot non-linear?

  1. A few large outlying x values are distorting the horizontal axis.
  2. Many large y values are all clumped together, compressing the vertical axis.

34 of 90

Applying Transformations

What makes this plot non-linear?

  1. A few large outlying x values are distorting the horizontal axis.

Resolve by log-transforming the x data:

  • Taking the log of a large number decreases its value significantly.
  • Taking the log of a small number does not change its value as significantly.


35 of 90

Applying Transformations

What makes this plot non-linear?

  2. Many large y values are all clumped together, compressing the vertical axis.

Resolve by power-transforming the y data:

  • Raising a large number to a power greater than 1 increases its value significantly.
  • Raising a small number to the same power does not change its value as significantly.

36 of 90

Interpreting Transformed Data

Now, we see a linear relationship between the transformed variables.


This tells us about the underlying relationship between the original x and y!
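
As a sketch of how a linear fit on transformed variables translates back to the original ones, the example below uses synthetic data. The data-generating formula and the choice of squaring y are assumptions made for illustration; the exponent used in the lecture's own example is not stated in this text.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = sqrt(2 * log(x) + 3).
x = np.linspace(1.0, 1000.0, 200)
y = np.sqrt(2.0 * np.log(x) + 3.0)

# Correlation between the raw variables vs. the transformed variables.
r_raw = np.corrcoef(x, y)[0, 1]
r_transformed = np.corrcoef(np.log(x), y ** 2)[0, 1]

# Fit a line to the transformed data: y^2 = slope * log(x) + intercept.
slope, intercept = np.polyfit(np.log(x), y ** 2, deg=1)

# Invert the transformation to express the fit in terms of the original x and y.
y_from_fit = np.sqrt(slope * np.log(x) + intercept)
print(round(r_raw, 3), round(r_transformed, 3), round(slope, 3), round(intercept, 3))
```

Because the fitted line relates y² to log(x), solving for y gives the curve relating the original variables: y = sqrt(slope · log(x) + intercept).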


37 of 90

Tukey-Mosteller Bulge Diagram

The Tukey-Mosteller Bulge Diagram is a guide to possible transforms to try to get linearity.

  • A visual summary of the reasoning we just worked through.
  • sqrt and log make a value "smaller".
  • Raising a value to a power makes it "bigger".
  • There are multiple solutions. Some will fit better than others.


You should still understand the logic we just worked through to decide how to transform the data. The bulge diagram is just a summary.


38 of 90

Tukey-Mosteller Bulge Diagram

If the data bulges like this…

…transform y by this…

…or transform x by this.

Applying to the data from before:

  • Could have transformed y by y², y³.
  • Could have transformed x by log(x), sqrt(x).

39 of 90

Visualization Theory


40 of 90

Visualizations Are For Humans


“Looks like older people didn’t spend more money on tickets for the Titanic than younger people.”

(Note: A histogram or KDE would give stronger evidence than a scatter plot.)


41 of 90

Visualizations Are More Expressive than Summary Statistics


Each of these 13 datasets has the same mean, standard deviation, and correlation coefficient.

Visualizations complement statistics.


42 of 90

Information Channels


43 of 90

Take Advantage of the Human Visual Perception System

Data can be visualized in many ways!

  • Let’s deconstruct the most basic plot types.


44 of 90

Rug Plot: Encoding 1 Variable

[Figure: a rug plot. The encoding maps each datum to a visual position (e.g. 10px, 16px, 11px, NONE, 11px, 15px); the mark represents a datum.]

45 of 90

Rug Plot: Different Marks

[Figure: the rug plot drawn with different marks. The encoding maps each datum to a visual position (e.g. 10px, 16px, 11px, NONE, 11px, 15px); the mark represents a datum.]

46 of 90

Scatter Plot: Encoding 2 Variables

[Figure: a scatter plot. The encoding maps each datum to a visual position, e.g. (10px, 7px), (70px, 60px), (45px, 9px), (5px, 24px), (45px, 37px), (66px, 8px); the mark represents a datum.]

47 of 90

Going Beyond: Encoding 3+ Variables

How many variables are we encoding here?

  • In other words, how many "channels" of information are there?


48 of 90


49 of 90

Going Beyond: Encoding 3+ Variables

How many variables are we encoding here?

  • In other words, how many “channels” of information are there?

We could add even more: shapes, outline colors of shapes, shading, etc. There are infinite possibilities!

Answer: 4.

  • x
  • y
  • area
  • color


50 of 90

Abusing Encodings: Length

There are many things that can go wrong in a visualization. For example, the visualization below abuses the length channel:

For much of the rest of today’s lecture, we’ll dive into ways to properly use other aspects of a visualization:

  • x/y
  • Color
  • Markings
  • Conditioning
  • Context

(Instructor note: this example comes from a very famous paper by Mackinlay, though it is not obvious why the bar chart would suggest that USA cars are longer.)

51 of 90

Harnessing X/Y


52 of 90

Case Study: Planned Parenthood Hearing

In 2015, Planned Parenthood was accused of selling aborted fetal tissue for profit.

Congressman Chaffetz (R-UT) showed this plot which originally appeared in a report by Americans United for Life.

  • What is this graph plotting?
  • What message is this plot trying to convey?
  • Is anything suspicious?


53 of 90

Keep Axis Scales Consistent

The scales for the two lines are completely different!

In 2013:

  • 327,000 is smaller than 935,573…
  • …but appears to be way bigger??

Do not use two different scales for the same axis!


54 of 90

Always Consider the Scale When Comparing "Similar" Data

The top plot draws all of the data on the same scale.

  • It clearly shows there was a dramatic drop in cancer screenings by PP.
  • But there are still far more cancer screenings than abortions.
  • Can plot percentage change instead of raw counts (bottom). This shows that cancer screenings have decreased and abortions have increased, without being misleading.


55 of 90

Always Consider the Scale When Comparing "Similar" Data

We could also visualize abortions and cancer screenings as a percentage of total procedures.

  • Abortions increased from 13% to 26% of total procedures.
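
As a quick arithmetic check of the later figure, the share can be recomputed from the two 2013 counts quoted earlier in the lecture. This sketch assumes that "total procedures" means the sum of just these two counts, which the text does not state explicitly; the variable names are illustrative.

```python
# 2013 counts quoted earlier in the lecture.
abortions_2013 = 327_000
cancer_screenings_2013 = 935_573

# Assumption: "total procedures" is the sum of the two plotted categories.
total = abortions_2013 + cancer_screenings_2013
abortion_share = abortions_2013 / total
print(f"{abortion_share:.0%}")
```

Under that assumption, the result is consistent with the 26% stated above.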


56 of 90

Reveal the Data

Recommendations:

  • Choose axis limits to fill the visualization.
  • You don’t have to visualize all of the data at once:
    • Zoom in on the bulk of the data (it's ok to not include 0!) if only one part matters.
    • Can also create multiple plots to show different regions of interest.


57 of 90

Reveal the Data

Terrible White House COVID-19 visualization:

  • Mysterious maximum value on y-axis.

58 of 90

Harnessing Color


59 of 90

Perception of Color

Choosing a set of colors that work together is a challenging task!

Download the Color Oracle app to simulate common color vision impairments.

60 of 90

Colormaps


Jet

Viridis


61 of 90

The Jet/Rainbow Colormap Actively Misleads


"Rainbow Colormap (Still) Considered Harmful", Borland and Taylor, 2007.


62 of 90

Use a Perceptually Uniform Colormap!

  • Perceptually uniform colormaps have the property that if the data goes from 0.1 to 0.2, the perceptual change is the same as when the data goes from 0.8 to 0.9.
  • Jet, the old matplotlib default, was far from uniform.
  • Viridis, the new default colormap, is.
    • It was created by folks at the Berkeley Institute for Data Science!
    • https://bids.github.io/colormap/
  • Avoid combinations of red and green, due to red-green color blindness.

[Figure: lightness curves for the two colormaps. x-axis is color, y-axis is “lightness”. Viridis: slope is constant. Jet: bounces all over.]

63 of 90

Except When Not :) The Google Turbo Colormap

x-axis is color, y-axis is “lightness”.

64 of 90

Use Color to Highlight Data Type

  • Qualitative: Choose a qualitative scheme that makes it easy to distinguish between categories.
    • One category isn’t "higher" or "lower" than another.
  • Quantitative: Choose a color scheme that visualizes magnitude of change.

The plot on the right has both distinctions!


65 of 90

Sequential vs. Diverging Colormaps for Quantitative Data

If the data progresses from low to high, use a sequential scheme where lighter colors are for more extreme values.

If low and high values deserve equal emphasis, use a diverging scheme where lighter colors represent middle values.


66 of 90

Default matplotlib Colormaps


67 of 90

Harnessing Markings


68 of 90

Perception of Markings

The accuracy of our judgments depends on the type of marking.

69 of 90

How much longer is the long bar?


70 of 90


71 of 90

The long bar is 7 times longer than the short bar.


72 of 90

How much bigger is the big circle?


73 of 90


74 of 90

The area of the big circle is 7 times larger than the area of the small circle.
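
One reason area comparisons are hard: a circle's area grows with the square of its diameter, so a 7-fold difference in area corresponds to a much smaller difference in width. A quick calculation (variable names are illustrative):

```python
import math

area_ratio = 7
# Area scales with the square of the diameter, so take the square root.
diameter_ratio = math.sqrt(area_ratio)
print(round(diameter_ratio, 2))
```

The big circle is therefore only about 2.65 times as wide as the small one, even though its area is 7 times larger.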


75 of 90

Lengths Are Easy to Distinguish. Others, Like Angles, Are Hard.

Don’t use pie charts! Visual angle judgments are inaccurate.


76 of 90

Areas Are Hard to Distinguish

(South Africa has twice the GDP of Algeria, but that isn’t clear from the areas.)

Avoid area charts! Visual area judgments are inaccurate.

77 of 90

Areas Are Hard to Distinguish

Avoid word clouds too!

It’s hard to tell the area taken up by a word.

That being said, if you are not trying to make quantitative comparisons, word clouds are useful for conveying “the idea.”

78 of 90

Avoid "Jiggling" the Baseline!

Stacked bar charts, histograms, and area charts are hard to read because the baseline moves ("jiggles").

In the first plot:

  • The top blue bars are all roughly of the same length.
  • Not immediately obvious!

In the second plot:

  • Comparing the number of 15-64 year old males in Germany and Mexico is difficult.

79 of 90

Avoid Jiggling the Baseline

Here, by switching to a line plot, comparisons are made much easier.


80 of 90

Harnessing Conditioning


81 of 90

Use Conditioning to Aid Comparison

This data comes from the Bureau of Labor Statistics, which oversees surveys regarding the economic health of the US. They have plotted median weekly earnings for men and women by education level.

  • What comparisons are made easily with this plot?
  • What comparisons are most interesting and important?


82 of 90

  • Easy to see the effect of education on earnings.
  • Hard to compare between the two genders in the dataset.

How could we more easily make this difficult comparison?


83 of 90

Having two separate lines makes clear the wage difference between men and women.


84 of 90

How Does the Income Gap Increase with Education?


See notebook for how to get this figure with groupby!


85 of 90

Other Notes: Superposition vs. Juxtaposition

Superposition: placing multiple density curves or scatter plots on top of each other (what we’ve usually been doing).

Juxtaposition: placing multiple plots side by side, with the same scale (called “small multiples”) (see left).

An example of small multiples.


86 of 90

Harnessing Context (for Publication)


87 of 90

Getting Ready for Publication


88 of 90

Publication-Ready: Add Context Directly to Plot

A publication-ready plot needs:

  • Informative title (takeaway, not description).
    • "Older passengers spend more on plane tickets" instead of "Scatter plot of price vs. age".
  • Axis labels.
  • Reference lines, markers, and labels for important values.
  • Legends, if appropriate.
  • Captions that describe the data.

The plots you create in this class always need titles and axis labels.


89 of 90

Publication-Ready: Captions

A picture is worth a thousand words, but not all thousand words you want to tell may be in the picture. In many cases, we need captions to help tell the story:

  • Comprehensive and self-contained.
  • Describe what has been graphed.
  • Draw attention to important features.
  • Describe conclusions drawn from graph.
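
A minimal matplotlib sketch of adding this context directly to a plot is shown below. Everything about the data is hypothetical: the values are randomly generated just to have something to draw, and the axis units are assumed. Only the takeaway-style title is taken from the example above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data, generated only to have something to plot.
rng = np.random.default_rng(0)
age = rng.uniform(18, 80, size=100)
price = 100 + 3 * age + rng.normal(0, 30, size=100)

fig, ax = plt.subplots()
ax.scatter(age, price, s=15, label="Passenger")

# Reference line for an important value (here, the median price).
median_price = np.median(price)
ax.axhline(median_price, color="gray", linestyle="--", label="Median price")

# Informative title: the takeaway, not a description of the plot type.
ax.set_title("Older passengers spend more on plane tickets")
ax.set_xlabel("Age (years)")            # assumed unit
ax.set_ylabel("Ticket price (USD)")     # assumed unit
ax.legend()
```

A caption describing the data and the conclusion would typically be added in the surrounding document rather than in the figure itself.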


90 of 90

Visualization II


LECTURE 8

Content credit: Acknowledgments
