An Approach to Integrating Data Science into Multiple STEM+C Undergraduate Courses
This project is supported by NSF grants #1029711, #1915487, and #1915268.
Introductions: Us
Vanderbilt University (VU)
Virginia Tech (VT)
(Engineering Education)
Partner to Improve
North Carolina Agricultural and Technical State University (NC A&T)
Introductions: Us
Current Graduate students:
Past Graduate students:
Introductions: You
Agenda
Project Overview
“To prepare their graduates for this new data-driven era, academic institutions should encourage the development of a basic understanding of data science in all undergraduates.” - National Academy of Sciences report 2018
Cross-Organization Interdisciplinary Collaboration
Benefits of cross-organizational collaboration
Challenges of cross-organizational collaboration
Panel:
Dr. Kang Xia, Dr. Gautam Biswas, Dr. Manoj Jha & Dr. Steven Jiang
Brief introduction to each course the modules were developed for
Questions to the panel facilitated by Chris Vanags
Questions to the panel by workshop attendees
ISEN 370 Engineering Statistics
Monitoring and Analysis of the Environment
● Senior-level required course for the Environmental Science major at Virginia Tech
● 30-40 students per spring semester
● Water quality of a local stream and soil quality of an agricultural field
● Student-driven hands-on projects and LEWAS high-frequency datasets
● Data-based assessment, decision-making, and communication
● Essential skill example: use variation in measured data to assess and communicate environmental sample test findings
● Major challenge: students have minimal training in basic statistics
Engineering Hydrology
Real-time high-frequency datasets from a field setup on VT campus
Smart Cities
Confidence Intervals and Visualization Module
Students demonstrate their ability to use data science tools by constructing 95% confidence intervals from the Iris dataset.
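As an illustration of the kind of task in this module, here is a minimal sketch of a 95% confidence interval for a mean. It assumes scikit-learn's bundled copy of the Iris dataset and a t-based interval for mean sepal length. The choice of feature and the helper name `mean_confidence_interval` are assumptions made for this example, not details taken from the module.

```python
import numpy as np
from scipy import stats
from sklearn.datasets import load_iris


def mean_confidence_interval(values, confidence=0.95):
    """Return (mean, lower, upper) for a t-based confidence interval of the mean.

    Hypothetical helper written for this illustration.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    mean = values.mean()
    # Standard error of the mean, using the sample standard deviation (ddof=1).
    sem = values.std(ddof=1) / np.sqrt(n)
    # Two-sided critical value from Student's t distribution with n - 1 degrees of freedom.
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    margin = t_crit * sem
    return mean, mean - margin, mean + margin


# Assumption: column 0 of scikit-learn's Iris data is sepal length in cm.
iris = load_iris()
sepal_length = iris.data[:, 0]

mean, low, high = mean_confidence_interval(sepal_length)
print(f"mean sepal length = {mean:.2f} cm, 95% CI = [{low:.2f}, {high:.2f}]")
```

Plotting the interval as error bars for each species would be one way to connect this calculation to the visualization half of the module.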
Supervised Learning Module
Students apply supervised machine learning algorithms, including linear regression, lasso regression, and support vector machine classification, to two different datasets.
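The three algorithms named here can be sketched in a few lines with scikit-learn. The slides do not say which two datasets the module uses, so this example assumes scikit-learn's bundled diabetes data for regression and Iris data for classification. The hyperparameters (`alpha=0.1`, an RBF kernel, `C=1.0`) are likewise illustrative choices, not the module's settings.

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# --- Regression (assumed dataset: scikit-learn's diabetes data) ---
X_reg, y_reg = load_diabetes(return_X_y=True)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_reg, y_reg, test_size=0.25, random_state=0
)

# Ordinary least squares fits every coefficient freely.
linear = LinearRegression().fit(Xr_train, yr_train)
# Lasso adds an L1 penalty, which can shrink some coefficients to exactly zero.
lasso = Lasso(alpha=0.1).fit(Xr_train, yr_train)

r2_linear = linear.score(Xr_test, yr_test)  # R^2 on held-out data
r2_lasso = lasso.score(Xr_test, yr_test)
n_zeroed = int((lasso.coef_ == 0).sum())
print(f"linear R^2 = {r2_linear:.3f}, lasso R^2 = {r2_lasso:.3f}, "
      f"lasso coefficients set to zero = {n_zeroed}")

# --- Classification (assumed dataset: Iris) ---
X_clf, y_clf = load_iris(return_X_y=True)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_clf, y_clf, test_size=0.25, random_state=0, stratify=y_clf
)

# SVMs are sensitive to feature scale, so standardize inside a pipeline.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(Xc_train, yc_train)
accuracy = svm.score(Xc_test, yc_test)  # fraction of test samples classified correctly
print(f"SVM test accuracy = {accuracy:.3f}")
```

Holding out a test split in both cases keeps the reported scores from reflecting only how well each model memorized its training data.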
Clustering Module
Students implement unsupervised machine learning algorithms, including k-means, agglomerative hierarchical, and DBSCAN (density-based) clustering, on two datasets.
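For comparison, the three clustering algorithms can be run side by side with scikit-learn. This is a minimal sketch that assumes the Iris measurements as input, since the module's own two datasets are not named in the slides. The parameter values (`n_clusters=3`, `eps=0.6`, `min_samples=5`) are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Assumed input: the four Iris measurements, standardized so that no single
# feature dominates the distance calculations.
X = StandardScaler().fit_transform(load_iris().data)

# k-means: partitions the points around k centroids (k must be chosen up front).
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Agglomerative hierarchical clustering: repeatedly merges the closest clusters
# (Ward linkage here) until the requested number remains.
agglo_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# DBSCAN: grows clusters from dense regions; points in sparse regions get label -1.
dbscan_labels = DBSCAN(eps=0.6, min_samples=5).fit_predict(X)

for name, labels in [
    ("k-means", kmeans_labels),
    ("agglomerative", agglo_labels),
    ("DBSCAN", dbscan_labels),
]:
    n_clusters = len(set(labels.tolist()) - {-1})  # exclude DBSCAN's noise label
    n_noise = int(np.sum(labels == -1))
    print(f"{name}: {n_clusters} clusters, {n_noise} noise points")
```

Unlike the other two methods, DBSCAN is not told how many clusters to find, so the count it reports depends on `eps` and `min_samples`.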
Website & Module Structure Overview: DS4STEM.org
Data Visualization Activity
Explore one of the two tasks
Think
Pair
Share
How did data science support learning in the case you worked on?
How can data visualization support learning in a STEM class you currently teach?
What would your students need to know and be able to do in order to be successful in data science?
Research Findings
Key Considerations & Suggestions for Integrating Data Science
Key Considerations for Integrating Data Science
3. Giving students constructive feedback
4. Instructor’s comfort level teaching data science
Student Background
Student Perspectives (cont.)
Student-reported change in beliefs about and interest in data science
Small Table Discussions
What questions do you have about integrating data science into your course?
What challenges have you faced or anticipate facing when integrating data science?
Brainstorm solutions to possible challenges