1 of 170

Week 12: In-Progress Crits

Introduction to Data Visualization

W4995.003 Spring 2024

2 of 170

Guest Critics

THIS WEEK

  • Matthew Conlen
    • Data journalist (Our World in Data, New York Times, 538) and computer scientist (Jeffrey Heer’s lab)

NEXT WEEK…

  • Asad Pervaiz
    • Creative director and graphic design professor at M/I/C/A & Pratt Institute

IN TWO WEEKS…

  • Klara Auerbach
    • Illustrator and data visualization designer (Our World in Data, The Atlantic)

3 of 170

Time Limits

  • 5 min. Presentation

  • 7 min. Feedback

Use the feedback form for any comments you weren’t able to make in class.

4 of 170

01 Mapping Campus Safety

02 US Crime Rates (1975–2015)

03 NYC Bike Ridership Patterns

04 Nutritional Profiles of Starbucks Drinks

05 March Madness

06 Violent Crime & the Subway

07 Representation of Art/Artists at the MoMA

08 Engineering the Perfect Track

09 The Art of Perfecting Sleep

5 of 170

01 Mapping Campus Safety

6 of 170

Mapping Campus Safety

7 of 170

Table of contents

01 Motivation

02 Prior Work

03 Exploratory Visualizations

04 Sketches

05 Questions

8 of 170

Motivation

Safety Concerns

  • Traveling at night
  • Areas to avoid
  • Route selection

Goal

  • Spread knowledge of high-risk travel areas and times
  • Safer pedestrian travel

9 of 170

10 of 170

Data Story

We are informing pedestrian travel at Columbia University to enable safer student transit.

11 of 170

12 of 170

13 of 170

14 of 170

Wireframes

15 of 170

16 of 170

17 of 170

18 of 170

Wireframe

19 of 170

D3 Visualization

20 of 170

21 of 170

Questions

  1. What do you think is the most engaging way to gamify a map?
  2. How would you design the user interactivity?
  3. Is there a way to make location data more attractive?

22 of 170

23 of 170

02 US Crime Rates (1975–2015)

24 of 170

5.2: In-Progress critique

nna2132, rgd2127, na3062, tb3061

25 of 170

Motivation

To provide a comprehensive view of crime trends in the US over four decades, presenting data in an accessible manner to a wide audience. This could help demystify perceptions about crime rates and their fluctuations over time.

26 of 170

Hypotheses

Trend Over Time: Fluctuations in crime rates over four decades reflect underlying historical, economic, or policy changes.

Geographic Variance: Significant differences in crime rates are evident across regions, with certain areas consistently showing higher or lower rates.

Population Impact: A correlation exists between population size and crime rates per capita, suggesting different crime dynamics in larger vs. smaller jurisdictions.

Type of Crime Evolution: The prevalence of specific violent crimes (e.g., homicides, rapes) has changed over time, indicating shifts in crime patterns.

27 of 170

Relevant prior work

FBI's Crime Data Explorer: inspirational for its comprehensive approach to presenting crime data.

https://cde.ucr.cjis.gov/LATEST/webapp/#/pages/home

28 of 170

Relevant prior work

"The Next to Die" by The Marshall Project: an example of how to effectively use narrative and visualization for exploration.

https://www.themarshallproject.org/next-to-die

29 of 170

Relevant prior work

Out of Sight, Out of Mind: inspirational for its animated narration.

https://drones.pitchinteractive.com/

30 of 170

Exploratory visualization

Hypothesis 1: Trend Over Time – Fluctuations in crime rates over four decades reflect underlying historical, economic, or policy changes.

A line chart showing the overall crime rates in the US from 1975 to 2015.

31 of 170

Exploratory visualization

Hypothesis 2: Geographic Variance – Significant differences in crime rates are evident across regions, with certain areas consistently showing higher or lower rates.

A bar chart showing the overall crime rates in different states of the US from 1975 to 2015.

32 of 170

Exploratory visualization

Hypothesis 3: Population Impact – A correlation exists between population size and crime rates per capita, suggesting different crime dynamics in larger vs. smaller jurisdictions.

A stacked bar chart from 1975 to 2015 reveals a correlation between population size and crime rate per capita that changes over time for each state.
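A rough way to put a number on Hypothesis 3 is to convert raw counts to a rate per 100,000 residents and then compute a Pearson correlation against population. This is a minimal sketch only: the rows below are invented placeholders, not values from the team's dataset, and the helper names are hypothetical.

```python
import math

def rate_per_100k(crimes: int, population: int) -> float:
    """Crimes per 100,000 residents."""
    return crimes * 100_000 / population

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Placeholder rows of (population, violent crimes); invented for illustration.
rows = [(100_000, 500), (500_000, 3_000), (1_000_000, 7_000), (2_000_000, 16_000)]

populations = [population for population, _ in rows]
rates = [rate_per_100k(crimes, population) for population, crimes in rows]
r = pearson(populations, rates)
print(f"correlation between population and per-capita rate: {r:.2f}")
```

Running this per year (or per state) would give the series of correlations that the stacked chart is meant to convey; a single coefficient hides that change over time.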

33 of 170

Key Takeaway from our visualization

Over four decades, the landscape of US crime reveals a nuanced narrative:

  1. Explore Yearly Trends: A mid-90s peak in robberies and assaults reflecting societal shifts
  2. Analyze State-by-State Data: A compelling correlation between population size and crime rates, underscoring the diverse crime dynamics across states and counties.

34 of 170

Data Story

An intricate tapestry of societal change, seen through the lens of crime data, underscores the evolving battle against crime in the U.S.

35 of 170

Rough Visualization in Code

36 of 170

Sketches of how final visualization will interact

37 of 170

Feedback on…

Q1. Is our narrative cohesive? And how can we make it better?

Q2. What common pitfalls should we avoid in data storytelling?

Q3. Could you please advise on whether it's better to present information regarding numerous counties across the country at once, or would it be preferable to structure the data by state, offering detailed graphs for each county upon the selection of a state?

Q4. When aiming to cater to an audience without expertise, which type of visualization would be most easily understood: a choropleth map, a cartogram, or a bubble map?

38 of 170

Thank You

39 of 170

40 of 170

03 NYC Bike Ridership Patterns

41 of 170

Visual Insights from a Decade of Citi Bikes in New York

Joric Barber, Franco Magalhaes, Martha Njuguna, Margot Stern

42 of 170

1: Initial Hypotheses

43 of 170

1.1 Motivation

We want to highlight how Citi Bike use has evolved in the past 10 years, offering the City an effective transportation alternative that serves different New Yorkers in different ways.

There are also ways in which the service falls short: Citi Bike is worse in low-income neighborhoods, not offered in some neighborhoods at all, and undermined by the City’s failure to build protected bike lanes.

44 of 170

1.2 Hypotheses

  1. Across Time: Investigating how Citi Bike usage evolved over time, including the hypothesis that ridership spiked post-COVID lockdown in 2020, as well as exploring correlations between ridership and changes in public transportation costs.

  2. General Statistics: Exploring areas of NYC with varying likelihoods of Citi Bike usage and their interaction with social factors, analyzing usage patterns throughout the day, identifying peak bike traffic times and areas, and examining the demographics of the most common Citi Bike users across gender and age.

45 of 170

2: Relevant Prior Work

46 of 170

2.1 Reference

In this New York Times article, interaction design is paired with the text in a way that amplifies the reader’s understanding of the information relayed.

47 of 170

2.2 Reference

In this interactive exploration of Boston’s subway system, the scroll/hover feature facilitates an in-depth analysis of transportation data throughout time.

48 of 170

2.3 Reference

In this explanatory visualization of delivery workers’ working conditions, mapping and scrolling features are jointly employed to relay routes and various figures related to overtime, instances of theft, and work-related expenses.

49 of 170

3: Exploratory Visualizations

50 of 170

3.1

Leading up to the COVID-19 pandemic, men made up the bulk of Citi Bike’s user base. Ridership levels peaked in 2018 before steadily decreasing until 2020.

51 of 170

3.2

From 2013 until 2020, a large majority of rides departed from stations in lower- and mid-Manhattan.

52 of 170

3.3

Rides tend to peak at 8AM and 5PM daily.
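The hourly peak claim can be checked by bucketing ride start times by hour of day. The sketch below assumes start times are available as ISO-style timestamp strings; the sample values are made up for illustration and `rides_per_hour` is a hypothetical helper, not part of the team's code.

```python
from collections import Counter
from datetime import datetime

def rides_per_hour(start_times):
    """Count rides by hour of day (0-23) from ISO-format start timestamps."""
    return Counter(datetime.fromisoformat(t).hour for t in start_times)

# Made-up sample timestamps, for illustration only.
sample = [
    "2019-06-03 08:05:00",
    "2019-06-03 08:40:00",
    "2019-06-03 12:15:00",
    "2019-06-03 17:02:00",
    "2019-06-03 17:30:00",
    "2019-06-04 08:20:00",
]

counts = rides_per_hour(sample)
# The two busiest hours in the sample.
peak_hours = [hour for hour, _ in counts.most_common(2)]
print(peak_hours)
```

Splitting the same count by weekday versus weekend would show whether the two peaks are a commuting pattern.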

53 of 170

4: Data Story

54 of 170

Since its launch in 2013, Citi Bike has amassed a diverse ridership and fundamentally transformed how people navigate and experience the largest city in the United States.

55 of 170

5: An Initial Viz

56 of 170

57 of 170

6: Sketches

58 of 170

59 of 170

60 of 170

7: Feedback

61 of 170

Questions

  1. Since there are so many dimensions to the data (rider type and gender; trip type, duration, and location), which dimensions do you think are most important to focus on?
  2. What are potential limitations in the data or visualization that we should be aware of, and how could they be addressed?
  3. How do we balance/accomplish relaying the more pragmatic, high-level insights from the data alongside a more “fun,” personable view of how Citi Bikes play a role in people’s day-to-day lives?

62 of 170

63 of 170

04 Nutritional Profiles of Starbucks Drinks

64 of 170

Nutritional Profiles of Starbucks Drinks

Ha Yeon Kim, Claire Chen, Anusha Lavanuru

65 of 170

Data Story

66 of 170

Jennie, a Starbucks enthusiast, is currently on a diet and wishes to check the nutritional information before ordering her beverage.

Given that she typically customizes her drink, relying solely on the app is insufficient for her needs.

67 of 170

Motivation & Hypothesis

68 of 170

Problem

The current Starbucks menu only shows the total calories per drink, which is helpful for controlling calorie intake but lacks the comprehensive nutritional facts that are crucial for maintaining a healthy lifestyle.

Moreover, the standard menu presentations fail to account for variations attributable to customizable options, such as differing milk choices, presenting a significant information gap for people seeking to make informed dietary choices.

69 of 170

Solution

In response to this gap, we aim to build a tool for visualizing the nutritional facts of Starbucks drinks to enhance consumer awareness and empower people with the knowledge to make healthier choices.

70 of 170

Hypothesis

  1. There are notable differences in the nutritional content across different categories of drinks.

  2. Substituting traditional dairy milk with alternative milk options and variations in sugar levels significantly alters the nutritional profile of a drink, particularly in terms of calorie, fat, and carbohydrate content.

  3. Seasonal drinks offer unique nutritional profiles compared to standard menu offerings due to the incorporation of special ingredients.

71 of 170

Exploratory Viz/

About our dataset

72 of 170

73 of 170

74 of 170

75 of 170

76 of 170

Prior Relevant Work

77 of 170

78 of 170

79 of 170

Final Viz Prototypes

80 of 170

81 of 170

82 of 170

Draft Viz (+code)

83 of 170

84 of 170

Questions

85 of 170

Questions

  1. Any suggestions on how to visualize the decision-making process when ordering a drink?
    1. dropdown menus
    2. select in the tree structure
    3. drag and drop
  2. Would you recommend including the food menu as well, so that users can form combos and check the total nutritional profile?
  3. With the dataset, is there any other data story that can be used?

86 of 170

Thank you

87 of 170

88 of 170

05 March Madness

89 of 170

A5.2 Project Proposal

Natalie Bran, Rosa Figueroa, Yueqi Li, Lindsey Cruz Rosales

90 of 170

Motivation

"March Madness showcases the pinnacle of collegiate basketball talent and dedication. Our project aims to explore and highlight this tournament's intricacies, inviting a broader audience to share in the excitement and celebrate the hard work of its teams."

91 of 170

Hypotheses

  • Certain conferences consistently demonstrate higher win rates in March Madness.
  • The average seed rankings within certain conferences will be significantly higher, suggesting these conferences are perceived to have stronger teams overall.
  • There is a strong positive correlation between higher seed rankings and a team's win rate and likelihood of advancing in the tournament, indicating that seedings are a reliable predictor of tournament success.
  • Certain seeds are more prone to being involved in upsets either as the unexpected victor or as the higher seed that gets defeated, which could challenge the reliability of seeding as a predictor of success.
  • Testing these hypotheses with the data can lead to insightful conclusions about the nature of competition in March Madness, the effectiveness of the seeding process, and the dynamics of success within and across conferences.
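The upset hypothesis depends on how an upset is defined. The sketch below assumes the simplest definition, that the winner carried a numerically higher (worse) seed than the loser; stricter definitions, such as requiring a minimum seed gap, are also common. The game list is invented for illustration and `is_upset` is a hypothetical helper.

```python
from collections import Counter

def is_upset(winner_seed: int, loser_seed: int) -> bool:
    """Assumed definition: the winner had the numerically higher (worse) seed."""
    return winner_seed > loser_seed

# Invented (winner_seed, loser_seed) pairs, for illustration only.
games = [(11, 6), (1, 16), (12, 5), (11, 3), (2, 15), (15, 2)]

upsets_by_winning_seed = Counter(
    winner for winner, loser in games if is_upset(winner, loser)
)
print(upsets_by_winning_seed.most_common())
```

Counting by the (winner, loser) pair instead of the winner alone would give the seed match-up view.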

92 of 170

Relevant Prior Work

Display of winning schools over the years including their seeds and number of times won

Visualizes the Men's and Women's NCAA Tournaments as radial brackets

93 of 170

Relevant Prior Work (Cont.)

Adam Pearce’s work displays how NBA playoff probabilities shifted since the beginning of the 2019-2020 tournament

The Pudding’s essay on how artists get paid from streaming – Inspiration for scrollable, animated essay style

94 of 170

Exploratory Visualization

Number of championship wins of each conference

The average ranked seed for teams in each conference

Conference Results

95 of 170

Exploratory Visualization

Wins and Performance Against Seed Expectations by Seed

Total non-zero wins by team by seed

Upset and Seed Analysis

96 of 170

Exploratory Visualization

Closer look at the performance of teams seeded no. 11

Frequency of upsets for certain winning seeds and seed matchups

Number of Upsets from 2008-2023 by Unique Seed Match-Ups

Number of Upsets from 2008-2023 by Winning Seed

Upset and Seed Analysis

97 of 170

Rough D3 Graph

Breakdown of championship conferences and teams (2008-2023)

98 of 170

Rough D3 Graphs

Interactive visualization that shows 20 years of basketball winners

Bubble chart linked to line graph that shows the relationship between conferences, teams, and seeds in the tournament from 2008-2023

99 of 170

Rough D3 Graph

Search function across 2008-2023 championship data

100 of 170

Rough D3 Graph

101 of 170

Our Data Story:

“March Madness thrives on its thrilling unpredictability and historic upsets, and our data story reveals how some conferences consistently outshine the rest, while pinpointing the most astonishing upsets ever witnessed in tournament history.”

102 of 170

Sketches

103 of 170

Sketch

104 of 170

Additional Idea!

105 of 170

Feedback Questions

  • In regards to March Madness, what aspects specifically spark your curiosity?
    • Is there any area you think could be interesting to explore?
  • Did our story flow logically for you, and did it make sense in its progression?
    • Is there any aspect of our narrative you think could be improved or altered for better understanding?
  • Do you envision alternative ways to represent this data that could enhance understanding or engagement?
    • Do you have any suggestions or ideas for interactive elements you'd like to see?

106 of 170

Thank You!

107 of 170

108 of 170

06 Violent Crime & the Subway

109 of 170

A5 In-Progress

Geneva Ng, Aparna Rajesh, Sam Miserendino

110 of 170

Motivation and Hypothesis

We are concerned about the safety of public transportation in New York City and hypothesize that there have been elevated levels of violent crime in the subway in the city over the last 6 months.

We're investigating whether:

  • this rise marks a record high or a recent peak
  • such crimes are increasing in isolation
  • there's a general uptick in violent crimes citywide during this period

111 of 170

Prior Work

Anton Bardera’s Multidimensional Visualization: Linked Views will inform the setup of our linked view, especially in connecting two views, with Chazan’s work further guiding the presentation of linked data.

In the center, @kiko-datasparq’s Pricing Algorithm visualization will guide our main visualization, offering an easy comparison of two data sets on the same axes.

Adam Chazan’s MD Countries Total Cases Map will inspire our linked view opacity map.

112 of 170

Exploratory Viz 1

113 of 170

Exploratory Viz 2

114 of 170

Exploratory Viz 3

115 of 170

Our Rough D3 Viz

116 of 170

Final Viz Ideas

117 of 170

Final Viz Ideas

118 of 170

Feedback Questions

  1. Are there any other ways that we can combine or pre-compute the values in these visualizations to make the default view more comprehensive?

  2. Are there other data sources that would be better, or limitations in official reporting data like this that need to be considered? It seems like the authoritative and only dataset you’d need at first, but this could be a bad assumption.

  3. (on the next slide)

119 of 170

Feedback Questions

Are there any dimensions to the data that these columns would make you curious about? Aka are we missing any interesting data you’d want to see?

120 of 170

121 of 170

07 Representation of Art/Artists at the MoMA

122 of 170

A representation of art and artists at the MoMA

Olivia Schmitt, Lucia Perez Saignac, Racquel Lemoine

123 of 170

Motivation and Hypotheses

The representation of different genders and ethnicities within the MoMA’s collection raises important questions about diversity and inclusivity within the art world. This proposal aims to investigate the representation of artists in MoMA's collection, focusing on gender and ethnicity, and how these have changed over time.

  • Are certain genders or ethnicities underrepresented in MoMA's collection? How has this representation evolved over time?
  • How has the scope of MoMA's collection changed over time? Which periods witnessed the most significant growth or decline?

124 of 170

Relevant Prior Works

Prior work by FiveThirtyEight has investigated different factors related to the changing composition of the MoMA’s collection over time, such as painting size and year painted, as well as the artist’s nationality and choice of medium. We want to expand on this work by exploring the intersections of these factors, specifically as they relate to the identity of the artist. Thus, an important aspect of our data visualization is leading readers to discover new artists as they uncover their own trends and gaps within the data.

125 of 170

Relevant Prior Works - cont.

126 of 170

Exploratory Visualizations - I

127 of 170

Exploratory Visualizations - II

128 of 170

Exploratory Visualizations - III

129 of 170

Exploratory Visualizations - IV

130 of 170

Sketches for Final Viz

131 of 170

Sketches for Final Viz

132 of 170

What do we want our readers to take away?

Data Story

Readers should walk away with an awareness of the lack of diversity at the MoMA and an understanding that, while there is some type of initiative to increase representation, it is still very much lacking.

133 of 170

Initial Visualization

Python Generated Visualization using Jupyter Notebook

134 of 170

Feedback: Specific Questions

01 Scaling

Considering there is a large range in our data (male vs. non-male artists), it is difficult to display these data points without overpowering the smaller values. What type of visualizations would you recommend to combat this?

02 Interpretation

Are there any common difficulties that are faced when it comes to analyzing gender and heritage that we should be aware of? What ethical considerations should we keep in mind?

03 Engagement

What strategies would you recommend to appeal to a wide range of people and not just those involved in the art world?
135 of 170

Thank You!


136 of 170

137 of 170

08 Engineering the Perfect Track

138 of 170

Is there a way to engineer a top hit?

Arthi Krishna, Rachel Michaelson, Elaine Su, Abhishek Chaudhary

139 of 170

Why this topic?

Motivation & Impact

As avid music listeners, we were curious whether there were reasons why certain songs hit the top of the charts and what that said about people’s music taste through the decades.

We aim to provide a comprehensive exploration of what makes a song successful, offering valuable insights for music industry professionals and enthusiasts.

140 of 170

Our Dataset

Top 10,000 Songs on Spotify 1960-2023 (from Kaggle)

Songs (according to Spotify API) are evaluated by several metrics, including

Danceability

Acousticness

Instrumentalness

Loudness

Valence

Speechiness

Liveness

Energy

141 of 170

DataViz Explorations

Analysis Findings

142 of 170

DataViz Explorations Continued

Analysis Findings

143 of 170

Technical Visualization

Test Created in D3

144 of 170

Lo-Fi UX

Sketches

145 of 170

Hi-Fi UX Mockups

Intro + Exposition

146 of 170

Hi-Fi UX Mockups

Decade Timeline

147 of 170

Hi-Fi UX Mockups

Final Personalized Exploration

148 of 170

Feedback Questions

What do we need help on?

Best ways for multi-view coordination

Emphasizing patterns in the data

Overall planned user experience & interactiveness

149 of 170

Thank You!

Any Questions?

150 of 170

151 of 170

09 The Art of Perfecting Sleep

152 of 170

Slumber Stats:

Plotting Your Path to Quality Sleep

By Emily Xia, Chengke Deng, Jyothi Gandi, Kentrie Tran

153 of 170

What is one experience everyone shares and craves?

The Answer: A Good Night’s Sleep

What does it mean to have a good night of sleep?

  • Sleep Quality (user-measured)
  • Sleep Duration
  • Sleep Efficiency (share of time in bed actually spent asleep)
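Sleep efficiency is commonly defined as the fraction of time in bed that is actually spent asleep. A minimal sketch under that assumption follows; the function name and the sample numbers are illustrative and are not taken from the team's datasets.

```python
def sleep_efficiency(minutes_asleep: float, minutes_in_bed: float) -> float:
    """Fraction of time in bed spent asleep (assumed definition)."""
    if minutes_in_bed <= 0:
        raise ValueError("minutes_in_bed must be positive")
    return minutes_asleep / minutes_in_bed

# Illustrative night: 8 hours in bed, 7 hours asleep.
efficiency = sleep_efficiency(minutes_asleep=420, minutes_in_bed=480)
print(f"{efficiency:.1%}")
```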

How can we sleep better?

  • Narrative focused on YOUR sleep routine and how it compares to the data
  • “Choose Your Own Adventure” style exploration

154 of 170

Our Inspiration

01

“The Secrets to Good Sleep” (NYT)

Storytelling approach to presenting sleep

02

“Sleep Better at Every Age” (NYT)

Clickable features that customize the content for YOUR age group

03

“Relationships between sleep efficiency and lifestyle…” (Yu Ikeda, et al.)

Why sleep is important + sleep factors we also want to explore

155 of 170

156 of 170

157 of 170

158 of 170

Datasets

Sleep, Health and Lifestyle: Relationship between sleep quality and lifestyle, particularly exercise, stress, and occupation.

Sleep and Phonetime: Does phone usage 30 min before bedtime affect sleep quality?

Sleep Efficiency: The influence of alcohol, smoking, and caffeine on sleep

159 of 170

Hypothesis and Questions

  1. Optimize Sleep: How can we optimize our sleep, both in terms of duration and quality?

  2. Factors that Influence Sleep: What routine adjustments can significantly improve sleep quality?

  3. Demographics on Sleep: Are there significant variations in sleep duration and sleep quality among different ages, genders, or occupations?

160 of 170

Sleep Factor Correlation

161 of 170

162 of 170

163 of 170

164 of 170

165 of 170

"Your sleep journey is unique and shaped by your habits - from phone usage to alcohol consumption, the quality and duration of your restful nights depend on YOU."

166 of 170

167 of 170

168 of 170

Feedback Questions

  1. We are using multiple datasets in our project. Are there any specific methods or practices we should consider to effectively integrate and present data from different sources?

  2. How can we craft a compelling, personalized narrative that facilitates user exploration and lets users draw their own conclusions?

  3. Is it confusing to have too many interactive visualizations? Should we have more static visualizations, or a “main interactive visualization” for a clearer narrative?

169 of 170

THANK YOU FOR LISTENING!

170 of 170