Welcome to NCSSM-Morganton!
Data Science Summer Institute 2023
Taylor Gibson, Dean of Data Science and Interdisciplinary Initiatives
I’m the Dean of Data Science at NCSSM.
My background is in teaching math & computer science. I also have a degree in biomedical engineering with a focus in neuroengineering 🧠
I worked to develop the data science program (with a lot of help) at NCSSM.
I also maintain NCSSM's Jupyter cloud infrastructure.
You can reach me at gibson@ncssm.edu
Hello, my name is Taylor Gibson 👋
Goals for the week
Who is in the room?
Plan for the week
Monday | Meet one another, collect some data, get settled! |
Tuesday - Thursday | Mornings: Complete classroom-type activities. Afternoons: Guest speakers from all over! Happy Hour Reception: Wednesday @ Fonta Flora |
Friday | National Landscape of Data Science Education; Data Science in Industry; Wrapping up |
Learning outcomes
There are a lot of thoughts and opinions on the matter
What is data science?
So, really, what is data science?
Data science
Applications
Implications
Foundations
Updated from Grolemund & Wickham's classic R4DS schematic, envisioned by Dr. Julia Lowndes for her 2019 use R! keynote talk and illustrated by Allison Horst.
Data collection and exploration
Collecting data from the field or using publicly available datasets.
Cleaning and wrangling data so it is properly formatted to facilitate analysis.
Combining multiple datasets into one.
Visualizing data to discover patterns and produce hypotheses.
Artwork by @allison_horst
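A minimal sketch of the cleaning and combining steps listed above, using only the Python standard library. The site names, temperatures, habitats, and the helper names `clean` and `combine` are made up for illustration; a real class would more likely load a CSV file or use a data-frame library.

```python
# Hypothetical field measurements with messy formatting (made-up values).
raw_measurements = [
    {"site": " A ", "temp_c": "21.5"},
    {"site": "b", "temp_c": "19.0"},
    {"site": "C", "temp_c": ""},  # missing reading
]

# Hypothetical second dataset describing each site.
site_info = {"A": "forest", "B": "meadow", "C": "wetland"}


def clean(rows):
    """Standardize site labels, convert text to numbers, drop missing readings."""
    cleaned = []
    for row in rows:
        if row["temp_c"] == "":
            continue  # skip rows with no measurement
        site = row["site"].strip().upper()
        cleaned.append({"site": site, "temp_c": float(row["temp_c"])})
    return cleaned


def combine(rows, lookup):
    """Attach the habitat from the second dataset to each measurement."""
    return [{**row, "habitat": lookup[row["site"]]} for row in rows]


combined = combine(clean(raw_measurements), site_info)
for row in combined:
    print(row)
```

Once the rows share one consistent format, they are ready to be plotted or summarized.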
Inference and simulation
Quantify whether your result is significant or more likely due to random chance.
Perform hypothesis testing.
The primary tool we have is randomization, which programming can make trivial.
Artwork by @allison_horst
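One way randomization can be turned into a hypothesis test is a permutation test: shuffle the group labels many times and count how often the shuffled difference is at least as large as the observed one. The sketch below assumes two small groups of made-up measurements; the function name `permutation_test`, the number of shuffles, and the seed are illustrative choices, not a prescribed method.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate how often shuffled labels give a difference in means
    at least as large as the observed one (two-sided)."""
    rng = random.Random(seed)  # seeded so the result can be repeated
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return observed, extreme / n_shuffles


# Made-up measurements for two hypothetical groups.
group_a = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0]
group_b = [2.6, 2.4, 2.7, 2.5, 2.8, 2.6]

observed_diff, p_value = permutation_test(group_a, group_b)
print(f"observed difference: {observed_diff:.2f}, estimated p-value: {p_value:.4f}")
```

A small estimated p-value means that random relabeling rarely produces a gap this large, which is evidence that the difference is not just chance.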
Prediction and classification
Making informed, quantitative guesses.
Techniques can include regression and classification.
Introduce students to a discipline called machine learning.
Artwork by @allison_horst
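As a small illustration of regression as an "informed, quantitative guess", the sketch below fits a least-squares line to made-up data and uses it to predict a new value. The data (hours studied and quiz scores) and the function name `fit_line` are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Return slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up data: hours studied and quiz score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 75]

slope, intercept = fit_line(hours, scores)

# Use the fitted line to guess the score for 6 hours of study.
predicted = slope * 6 + intercept
print(f"slope={slope:.1f}, intercept={intercept:.1f}, prediction={predicted:.1f}")
```

The same idea, learning a rule from known examples and applying it to new cases, carries over to classification and other machine learning techniques.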
It all comes down to scale.
How is data science different from statistics?
Open science and reproducible research
Scientific results and evidence are strengthened if those results can be replicated and confirmed by several independent researchers.
When researchers properly document and share the data and processes associated with their analyses, the broader research community can save valuable time when reproducing or building upon published results.
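One small habit that supports reproducibility in simulation-based work is recording the random seed, so that anyone rerunning the shared code gets the same numbers. The sketch below is an illustration under that assumption; the function name `simulate_coin_flips` and the seed value are hypothetical.

```python
import random


def simulate_coin_flips(n_flips, seed):
    """Count heads in n_flips fair coin flips, using a seeded generator."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.5 for _ in range(n_flips))


# Two runs with the same documented seed produce the same count.
first_run = simulate_coin_flips(1000, seed=2023)
second_run = simulate_coin_flips(1000, seed=2023)
print(first_run == second_run)
```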
Examples
Tools are necessary, but by themselves are not data science
Tools of data science
Computational tools
Single Purpose Flexible / Multipurpose
There are many options!
Data science curriculum
High school data science curricula
All are great choices with similar learning outcomes, but they use different tools and target different audiences
Working with data transcends disciplines and courses
But, it's more than just a single course
Where else to teach data science?
Math 3
AP Statistics
Post-exam module to utilize larger real-world datasets
Lab Sciences
Plotting results collected in a chemistry or biology lab.
Performing hypothesis testing using simulation
Engineering / CTE courses
Compare effects of design choices of a paper helicopter
You're already on your way…
How do I learn to teach data science?
Learning the content, tools, and pedagogy 🧑🏫
NC State University: InSTEP (Virtual)
Free personalized professional learning to support teachers and instructional coaches in developing expertise in teaching K-12 statistics and data science.
Questions?
Activity
Wrangling and Tidying Data
Tables and "Tidy Data"
Source: Wickham, Hadley (20 February 2013). "Tidy Data" (PDF). Journal of Statistical Software.
Artwork by @allison_horst
Tables and "Tidy Data"
Name | Code | Area (sq mi) |
California | CA | 163696 |
Nevada | NV | 110567 |
Label
Column
Row
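The table above can be sketched in code as a list of records, with one record per row and one key per column label. The second half shows one way to reshape a "wide" table into the tidy layout described in the cited paper, where each row is a single observation. The key names, the wide table's counts, and the helper `to_long` are hypothetical.

```python
# The table above as records: one dict per row, one key per column label.
# Area values are copied from the table, in the table's own units.
states = [
    {"name": "California", "code": "CA", "area": 163696},
    {"name": "Nevada", "code": "NV", "area": 110567},
]

# A column is every value stored under one label.
areas = [row["area"] for row in states]

# Hypothetical "wide" table: years are spread across the column labels.
wide = [
    {"code": "CA", "2021": 10, "2022": 12},
    {"code": "NV", "2021": 3, "2022": 4},
]


def to_long(rows, id_key, var_name, value_name):
    """Turn each non-id column into its own row (one observation per row)."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key != id_key:
                long_rows.append(
                    {id_key: row[id_key], var_name: key, value_name: value}
                )
    return long_rows


tidy = to_long(wide, "code", "year", "count")
for row in tidy:
    print(row)
```

In the long form, "year" becomes a variable with its own column, which makes filtering, grouping, and plotting by year straightforward.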