1 of 22

CSE 163

Data Visualization

Hunter Schafer

2 of 22

What to Learn

  • This week, we are learning more about pandas and picking up 2 new libraries
  • Memorizing function calls and parameters is ridiculous
    • No one memorizes this stuff!
    • This is what documentation is for!
  • It is much more important to understand the big ideas behind what a library call is doing
    • You might use a different library in the future
    • Libraries change from version to version
  • Don’t try to write down every bit of syntax. Focus on the big ideas behind the problems we are trying to solve, and use the slides and lecture notes as a resource.
  • On the exam, we will provide shortened documentation so you don’t have to memorize the method calls

3 of 22

Some Data

  • Can you tell me the relationship between x and y?

       x      y
    10.0   7.46
     8.0   6.77
    13.0  12.74
     9.0   7.11
    11.0   7.81
    14.0   8.84
     6.0   6.08
     4.0   5.39
    12.0   8.15
     7.0   6.42
     5.0   5.73

         I              II             III            IV
       x      y       x      y       x      y       x      y
    10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
     8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
    13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
     9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
    11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
    14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
     6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
     4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
    12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
     7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
     5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89

  • All four datasets share (approximately) the same summary statistics:
    • μx = 9.0, σx = 3.317, μy = 7.5, σy = 2.03
    • Fitted line: y = 3 + 0.5x, R² = 0.67

[Anscombe 73]
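
The summary statistics above can be checked directly. The following is a minimal sketch using only the Python standard library. The data is copied from the table on this slide; the names `QUARTET` and `summarize` are made up for this example, and the fit is an ordinary least-squares line computed by hand rather than with any particular library.

```python
from statistics import mean, stdev

# Anscombe's quartet, copied from the table on this slide.
X_SHARED = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8.0] * 7 + [19.0] + [8.0] * 3,
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample standard deviations, and a least-squares fit."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)                     # spread of x
    syy = sum((y - my) ** 2 for y in ys)                     # spread of y
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # co-variation
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "std_x": stdev(xs),
        "mean_y": my,
        "std_y": stdev(ys),
        "slope": slope,
        "intercept": my - slope * mx,
        "r2": sxy ** 2 / (sxx * syy),
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Each dataset should print values close to the ones listed above, even though the four datasets look very different once plotted. That is the point of the example: summary statistics alone can hide the shape of the data.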

4 of 22

Some Data Visualizations

5 of 22

The Value of Visualization

  • Answer questions (or discover them)
  • Make decisions
  • See data in context
  • Find patterns
  • Present argument or tell a story
  • Inspire

6 of 22

Make a Decision: Challenger

7 of 22

Make a Decision: Challenger

Visualizations drawn by Tufte show how low temperatures damage O-rings [Tufte 97]

8 of 22

Data in Context: Cholera Outbreak

In 1854, John Snow plotted the position of each cholera case on a map. [from Tufte 83]

9 of 22

Data in Context: Cholera Outbreak

Snow used the map to hypothesize that the pump on Broad St. was the cause. [from Tufte 83]

10 of 22

Data Viz in Python

  • Many Python libraries exist for plotting and visualization
  • The two most popular are
    • matplotlib: extremely customizable (to a fault)
    • seaborn: great out-of-the-box visualizations
  • We encourage using seaborn, since it’s quickly becoming the standard tool for most day-to-day data science work
    • We will still rely on matplotlib for some things
    • You are welcome to use matplotlib in more depth on your project to make highly customized charts
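
To make the split between the two libraries concrete, here is a minimal sketch that assumes `pandas`, `seaborn`, and `matplotlib` are installed. It plots the x/y values from the "Some Data" slide. The helper name `plot_xy` and the plot title are made up for this example, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption so this runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# x/y values from the "Some Data" slide.
data = pd.DataFrame({
    "x": [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0],
    "y": [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
})


def plot_xy(df):
    """Draw a scatter plot of y against x and return the matplotlib Axes."""
    fig, ax = plt.subplots()
    sns.scatterplot(data=df, x="x", y="y", ax=ax)  # seaborn draws the chart
    ax.set_title("y vs. x")                        # matplotlib handles the customization
    return ax


ax = plot_xy(data)
```

Calling `ax.figure.savefig(...)` with a filename of your choice would write the chart to disk. seaborn produces the chart in one call, while the returned matplotlib `Axes` is where finer customization happens.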

11 of 22

Reading Documentation

  • Seaborn has excellent docs, which makes them great practice for reading documentation!

  • How to read documentation
    1. Skim examples, don’t focus too much on code
    2. Read overview
    3. Look at examples and the code. Look at documentation for relevant parameters
    4. (Sometimes) Skim parameter list

12 of 22

Visual Design Principles

  • Use effective encodings
  • Avoid over-encoding
  • Focus on readability, not aesthetics
  • Avoid perceptual traps that can trick people into forming wrong conclusions

Rarely does a single visualization answer all questions. Instead, the ability to generate appropriate visualizations quickly is critical!

13 of 22

Types of Data

  • Quantitative (numeric measurement)
    • Examples
      • Salary
      • Age
  • Ordinal (categorical, but there is an ordering)
    • Examples
      • First place, second place, third place
      • High School, Associate’s, Bachelor’s, …
  • Nominal (categorical, no ordering)
    • Examples
      • Type of Pokemon
      • School you attend
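
One way to see the difference between the three types is to ask which operations make sense on each. The sketch below uses only the Python standard library. All sample values and the `DEGREE_ORDER` list are made up for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample values, made up for illustration.
salaries = [52000, 61000, 48000]                         # quantitative
degrees = ["Bachelor's", "High School", "Associate's"]   # ordinal
pokemon_types = ["Fire", "Water", "Fire"]                # nominal

# Ordinal data needs an explicit ordering; alphabetical order would be wrong.
DEGREE_ORDER = ["High School", "Associate's", "Bachelor's"]

# Quantitative: arithmetic such as averaging is meaningful.
average_salary = mean(salaries)

# Ordinal: sorting is meaningful, but the "distance" between levels is not.
sorted_degrees = sorted(degrees, key=DEGREE_ORDER.index)

# Nominal: only equality checks and counting are meaningful.
type_counts = Counter(pokemon_types)

print(average_salary)
print(sorted_degrees)
print(type_counts)
```

The same question carries over to choosing an encoding: a data type that has no ordering should not be mapped to a visual channel that implies one.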

14 of 22

Encoding Effectiveness

[Mackinlay 86]

15 of 22

Example

Effectiveness

16 of 22

Encoding Effectiveness

[Mackinlay 86]

17 of 22

Example

Effectiveness

18 of 22

Encoding Effectiveness

[Mackinlay 86]

19 of 22

Readability

20 of 22

Readability

  • Did the number of gun deaths go up or down after the law was enacted?
    • Went up
    • Went down
    • About the same
    • Don’t remember

Yes, this was a real graphic posted in an article by Business Insider (graphic made by Reuters)

21 of 22

How to Lie with Data Viz

22 of 22

How to Lie with Data Viz
