1 of 22

An Approach to Integrating Data Science into Multiple STEM+C Undergraduate Courses

This project is supported by NSF grants #1029711, #1915487, and #1915268.

2 of 22

Introductions: Us

Vanderbilt University (VU)

  • Gautam Biswas (PI) (Computer Science and Engineering)
  • Abhishek Dubey (co-PI) (Computer Science and Engineering)
  • Chris Vanags (co-PI) (Peabody College of Education)
  • Erin Henrick (Department of Leadership, Policy, and Organizations)

Virginia Tech (VT)

  • Vinod Lohani (PI) (Engineering Education)

  • Kang Xia (acting PI) (Environmental Sci.)
  • Landon Marston (co-PI) (Civil & Environmental Engineering)
  • Randel Dymond (co-PI) (Civil & Environmental Engineering)
  • Erin Hotchkiss (co-PI) (Biology)

Partner to Improve

  • Emily Kern (evaluator)
  • Erin Henrick

North Carolina Agricultural and Technical State University (NC A&T)

  • Manoj K Jha (PI) (Civil Engineering)
  • Steven Jiang (co-PI) (Industrial & Systems Engineering)
  • Eui Park (co-PI) (Industrial & Systems Engineering)
  • Niroj Aryal (co-PI) (Biological Engineering)

3 of 22

Introductions: Us

Current Graduate students:

  • Katherine Pérez Rivera (VT, Department of Biological Sciences)
  • Yunus Naseri (VT, Department of Civil and Environmental Engineering)
  • Habtamu Workneh (NC A&T, Department of Civil, Architectural and Environmental Engineering)
  • Caitlin Snyder (VU, Department of Computer Science)

Past Graduate students:

  • Sambridhi Bhandari (NC A&T)
  • Brendan McLoughlin (VT)
  • Dawit Asamen (NC A&T)
  • Caleb Vatral (VU)

4 of 22

Introductions: You

  • Name
  • Home institution & role at institution
  • Course(s)
  • Why are you interested in data science?
  • What do you want to get out of this workshop?

5 of 22

Agenda

  • Project Overview
  • Cross-Organization Collaboration
  • Panel with Dr. Biswas, Dr. Xia, Dr. Jha and Dr. Jiang
  • Break
  • Website and Module Overview
  • Data Visualization Activity
  • Research Findings
  • Small Table Discussions
  • Closing Remarks

6 of 22

Project Overview

  • Interdisciplinary collaborative effort between VT, NC A&T, VU and Partner to Improve
  • Goal: Develop and implement an interdisciplinary collaborative approach to enable undergraduate students to develop data science expertise
  • STEM+C disciplines: engineering, computer science, environmental science, and biology
  • Focus on integrating data science to solve real-world discipline-specific problems
  • Reach ~320 students/year (2020, 2021, 2022)
  • Develop data science modules across disciplines that help students use and analyze real-world data
  • Evaluate the multi-disciplinary cross-institutional partnership and collaboration

“To prepare their graduates for this new data-driven era, academic institutions should encourage the development of a basic understanding of data science in all undergraduates.” - National Academy of Sciences report 2018

7 of 22

Cross-Organization Interdisciplinary Collaboration

Benefits of cross-organizational collaboration

  • Broader understanding of diverse contexts
    • Collaboration represents a Land-Grant, an HBCU and a Private Institution
    • Across disciplines and undergraduate levels
    • Different course formats (from project-oriented to lecture-based instruction)
  • Supports capacity building for instructors and graduate students
    • Access to data science expertise
    • Data collection and analysis expertise

Challenges of cross-organizational collaboration

  • Online
  • Limited opportunities for in-person engagement due to COVID

8 of 22

Panel:

Dr. Kang Xia, Dr. Gautam Biswas, Dr. Manoj Jha & Dr. Steven Jiang

Brief introduction to each course the modules were developed for

Questions to the panel facilitated by Chris Vanags

Questions to the panel by workshop attendees

9 of 22

ISEN 370 Engineering Statistics

10 of 22

Monitoring and Analysis of the Environment

Senior-level required course for the Environmental Science major at Virginia Tech

30-40 students/spring semester

Water quality of a local stream and soil quality of an agricultural field

Student-driven hands-on projects and LEWAS high-frequency datasets

Data-based assessment/decision/communication

Essential skill example: use variation in measured data to assess and communicate environmental sample test findings

Major challenge: minimal training in basic statistics

11 of 22

Engineering Hydrology

  • Junior-level required course for the Civil Engineering major
  • 40-45 students in Spring semesters
  • Fundamental course: water resources status, issues, challenges, sustainability case studies; hydrological processes; baseflow separation; rainfall-runoff analysis; flow estimation methods, etc.
  • Involves handling long-term datasets of meteorological parameters, streamflow, and water quality variables.
  • Term Project involves group work for a watershed analysis using high-frequency large datasets of rainfall and resulting runoff for a given event
  • Challenges for improvement:
    • Basic skills in data download, data handling, tool use, and interpretation
    • Use of advanced-level tools
    • Add more data-based analysis

Real-time high-frequency datasets from a field setup on VT campus

12 of 22

Smart Cities

  • University course for juniors and seniors
  • ~20 students with a wide variety of backgrounds
  • Project-based course that integrates technological and socio-economic approaches to challenges facing metropolitan areas experiencing unprecedented growth
  • Students learn and apply advanced machine learning and analytics methods to extract relevant information from real-world data to characterize and propose solutions in areas such as transportation, energy and water quality
  • Implemented using GitHub Classroom and Google Colab

Confidence Intervals and Visualization Module

Students demonstrate their ability to use data science tools by building 95% confidence intervals for statistics computed from the Iris dataset.
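As a rough illustration of the kind of calculation this module targets, the sketch below builds a t-based 95% confidence interval for a mean. It is not the module's own notebook. The choice of statistic (mean sepal length), the ten-row sample, the function name `mean_confidence_interval`, and the hardcoded t critical value are all assumptions made for this example.

```python
import math
import statistics

# First ten sepal-length measurements (cm) from the Iris dataset (setosa rows).
# Assumption: a small sample is hardcoded so the sketch needs no external library.
sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9]

# t critical value for a two-sided 95% interval with n - 1 = 9 degrees of freedom,
# read from a standard t table (assumption: hardcoded instead of computed).
T_CRIT_DF9 = 2.262


def mean_confidence_interval(values, t_crit):
    """Return (mean, lower, upper) for a t-based confidence interval on the mean."""
    n = len(values)
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)  # sample SD / sqrt(n)
    margin = t_crit * std_err
    return mean, mean - margin, mean + margin


mean, lower, upper = mean_confidence_interval(sepal_length, T_CRIT_DF9)
print(f"mean = {mean:.2f} cm, 95% CI = ({lower:.2f}, {upper:.2f}) cm")
```

With the full dataset, the critical value would need to match the larger sample size. A statistics library can supply it instead of a table lookup.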

Supervised Learning Module

Students use supervised machine learning algorithms, including linear regression, lasso regression and support vector machine classification, on two different datasets.
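To make the first of these methods concrete, here is a minimal sketch of ordinary least-squares linear regression with a single feature. It uses only the standard library and made-up toy numbers. The data, the function names `fit_line` and `predict`, and the single-feature setup are assumptions for illustration. Lasso regression and support vector machines are not shown.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical toy data (not from the course): y is roughly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"prediction at x = 6: {predict(6.0, slope, intercept):.2f}")
```

In a Colab notebook, a machine learning library would usually replace the hand-written fit. The closed-form version just makes the supervised step visible: learn parameters from labeled pairs, then predict for new inputs.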

Clustering Module

Students implement unsupervised machine learning algorithms, including k-means, agglomerative hierarchical, and DBSCAN (density-based) clustering, on two datasets.
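The k-means step named above can be sketched in a few lines of plain Python. This is an illustrative version under stated assumptions, not the module's code. The toy points, the fixed starting centroids, and the function name `kmeans` are all hypothetical. Agglomerative clustering and DBSCAN are not shown.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Cluster points by alternating assignment and centroid-update steps."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: give each point the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_centroids == centroids:  # converged: centroids stopped moving
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels, centers = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centers)
```

Fixed starting centroids keep the run reproducible. Library implementations typically choose starting points randomly and repeat the fit several times.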

13 of 22

10 Minute Break

DS4STEM.org

14 of 22

Website & Module Structure Overview: DS4STEM.org

15 of 22

Data Visualization Activity

Explore one of the two tasks

Think, Pair, Share

How did data science support learning in the case you worked on?

How can data visualization support learning in a STEM class you currently teach?

What would your students need to know and be able to do in order to be successful in data science?

16 of 22

Research Findings

17 of 22

Key Considerations & Suggestions for Integrating Data Science

  1. Accommodating a variety of student data science skill sets, including the use of tools
    • Provide resources for students to support their learning
    • Have students help each other
    • Allow students to use whatever tool they like when working with the data

  2. Deciding how to make room for data science activities
    • Integrate data science skills into the course content
    • Remove some content that is more memorization-focused

18 of 22

Key Considerations for Integrating Data Science

3. Giving students constructive feedback

    • Have students work in groups
    • Structure assignments so students get feedback on their work
    • Give in-class quizzes using a poll format

4. Instructor’s comfort level teaching data science

    • Start with topics and tools that feel comfortable
    • If teaching tools or topics go beyond your comfort zone, find resources for support

19 of 22

Student Background

20 of 22

Student Perspectives (cont.)

Student-reported change in beliefs about and interest in data science

21 of 22

Small Table Discussions

What questions do you have about integrating data science into your course?

What challenges have you faced or anticipate facing when integrating data science?

Brainstorm solutions to possible challenges

22 of 22

Closing Remarks

Welcome to the DS4STEM Network!

DS4STEM.org