Remixing Writing Pedagogies
Coding as Writing with Data
What is our discipline’s role in teaching and advocating for critical approaches to coding and data practices?
Coding as Writing with Data Programmatically
“... take seriously a new set of responsibilities to teach machines what to do and what not to do with powerful rhetorical strategies … [and] stay involved in the kind of work that mobilizes our disciplinary knowledge” (p. 253).
Hart-Davidson (2018)
Objectives
Demonstrate implications of different perspectives
Background of UG Course Designed with Programmatic Writing with Data
Phase 1 - Course Weeks 1-4
Read & Discuss Ethical Concepts & Theories of Data/ML:
Phase 1 - Topics & Readings Used
Week #1 - Social Justice vs. Ethic of Expediency
Week #2 - Defining AI/ML and the human decisions that shape them
Week #3 - How are data always situated and contextual?
Week #4 - What are some existing ethical approaches and ideas in AI/ML?
Phase 2 - Course Weeks 5-9
Learn & Practice Python Language Fundamentals:
Phase 3 - Course Weeks 10-12
Process, Analyze and Create a ML (Logistic Regression) Model:
FINAL PROJECT - Course Weeks 13-Finals
_Team Project_ to Develop & Interrogate Their Own ML Model: In teams, students …
Logistic Regression Primer
Rules & decisions on/about data
(Digital) Data
Output / Answer
Review: Primer on how ML works
Traditional Programming: Rules on/about data + (Digital) Data → Outputs/Answers
Machine Learning: (Digital) Data + Outputs/Answers → Rules on/about data
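The contrast between the two paradigms can be sketched in a few lines of Python. This is a toy illustration, not the course notebook: the function names, the hand-written rule, and the four example headlines are all hypothetical. In the first half a person writes the rule; in the second half the "rule" (a word-to-label lookup) is derived from labeled examples.

```python
from collections import Counter, defaultdict


# Traditional programming: a person writes the rule by hand.
def rule_based_label(headline):
    """Hypothetical hand-written rule: 'candy' means Food & Drink."""
    return "FOOD & DRINK" if "candy" in headline.lower() else "OTHER"


# Machine learning (toy version): the rule is derived from data + answers.
def learn_word_labels(examples):
    """Map each word to the label it appeared with most often."""
    counts = defaultdict(Counter)
    for headline, label in examples:
        for word in headline.lower().split():
            counts[word][label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def learned_label(headline, word_labels, default="OTHER"):
    """Let the known words in a headline vote on its label."""
    votes = Counter(
        word_labels[w] for w in headline.lower().split() if w in word_labels
    )
    return votes.most_common(1)[0][0] if votes else default


# Hypothetical labeled examples (data + answers).
examples = [
    ("Best Halloween candy ranked", "FOOD & DRINK"),
    ("Congress passes budget bill", "POLITICS"),
    ("Easy candy recipes", "FOOD & DRINK"),
    ("President addresses Congress", "POLITICS"),
]
word_labels = learn_word_labels(examples)

print(rule_based_label("Halloween Candy"))
print(learned_label("New candy recipes", word_labels))
print(learned_label("Congress votes today", word_labels))
```

The learned lookup is only as good as the examples it was built from, which is one reason the human decisions about data matter.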
Logistic Regression - What is it?
Buy A Coffee/Espresso = p(morning|tired|stressed) * mornings
54.75 days = p(0.15) * 365
= p(evening|refreshed|caffeinated) * 365
3.65 days = p(0.01) * 365
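The arithmetic in the coffee example can be checked with a short sketch. The helper name `expected_days` is an assumption made for illustration; the probabilities 0.15 and 0.01 and the 365-day year come from the slide.

```python
DAYS_PER_YEAR = 365


def expected_days(probability, days=DAYS_PER_YEAR):
    """Expected number of days an event occurs, given its daily probability."""
    return probability * days


# Probabilities taken from the slide.
tired_morning = expected_days(0.15)
refreshed_evening = expected_days(0.01)

print(f"Morning, tired, stressed: {tired_morning:.2f} days")
print(f"Evening, refreshed, caffeinated: {refreshed_evening:.2f} days")
```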
Logistic Regression - What is it?
The Most Uniquely Popular Halloween Candy In Each U.S. State
Food & Drink
Dependent Variable
Independent Variables
???
Logistic Regression - What is it?
President Biden announces his favorite Halloween candy before U.S. Congress
POLITICS
Dependent Variable
Independent Variables
0.990
Logistic Regression == Discrete classification
1
0
Logistic Regression - Nuancing discrete classification
Enter Stochastic Models: A conditional distribution of probabilities
0.5
P = w1•x1 + w2•x2 + w3•x3 + … + wk•xk + b
P(News Genre) = w1•(halloween) + w2•(candy) + w3•(popular) + … + wk•xk + b
Words (features)
Probability Estimate
1.0
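A minimal sketch of how the weighted sum becomes a probability estimate. In standard logistic regression the sum w1•x1 + … + wk•xk + b is passed through the sigmoid function, which keeps the result between 0 and 1; the 0.5 cutoff then turns the estimate into a discrete class. The weights and bias below are made up for illustration (a real model learns them from data), and each x is assumed to be the count of a word in the headline.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(headline, weights, bias):
    """Weighted sum of word counts plus bias, passed through the sigmoid."""
    words = headline.lower().split()
    z = bias + sum(weights.get(w, 0.0) * words.count(w) for w in set(words))
    return sigmoid(z)


# Hypothetical weights for a "Food & Drink vs. not" classifier.
weights = {"halloween": 1.2, "candy": 2.0, "popular": 0.4, "congress": -2.5}
bias = -1.0

p = predict_probability("Popular Halloween candy", weights, bias)
label = 1 if p >= 0.5 else 0  # 0.5 decision threshold

print(f"Probability estimate: {p:.3f}, class: {label}")
```

Words with positive weights push the estimate toward 1, words with negative weights push it toward 0, and words the model has never seen contribute nothing.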
Logistic Regression - Goods & Caveats
Activity Time!
EDA
get(notebook)
Activity Time!
EDA - Estimates of Location
Measure | Definition | Sensitive to Outliers? | Best Use Cases |
Mean | Average of all values | Yes | Symmetrical data, all values important |
Median | Middle value | No | Skewed data, presence of outliers |
Mode | Most frequent value | No | Categorical data, identifying common values |
Measures the central tendency of a dataset.
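The three estimates of location can be computed with Python's standard `statistics` module. The list below is a hypothetical set of headline word counts with one outlier (48), chosen to show how the mean moves while the median and mode stay put.

```python
import statistics

# Hypothetical headline lengths in words; 48 is a deliberate outlier.
word_counts = [5, 7, 7, 8, 9, 10, 48]

mean = statistics.mean(word_counts)      # pulled upward by the outlier
median = statistics.median(word_counts)  # middle value of the sorted data
mode = statistics.mode(word_counts)      # most frequent value

print(f"mean={mean:.2f}, median={median}, mode={mode}")
```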
EDA - Estimates of Distribution
Values that describe the spread, variability, or dispersion of data points in a dataset.
Measure | Definition | Sensitive to Outliers? | Focus |
Range | Max - Min | Yes | Entire dataset extent |
Variance | Average squared deviation from the mean | Yes | Spread around the mean |
Standard Deviation | Square root of variance | Yes | Typical deviation from the mean |
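A short sketch of the three spread measures, again using the standard `statistics` module on a hypothetical list of values. It uses the population versions (`pvariance`, `pstdev`), which match the "average squared deviation" definition in the table; `statistics.variance` and `statistics.stdev` divide by n - 1 instead and are the usual choice for a sample.

```python
import statistics

# Hypothetical values; their mean is 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)  # largest minus smallest value
variance = statistics.pvariance(values)  # average squared deviation from the mean
std_dev = statistics.pstdev(values)      # square root of the variance

print(f"range={value_range}, variance={variance}, std_dev={std_dev}")
```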
2.3.1 Exercise
Observations About the ‘headline’ Column
Observation 1
Observation 2
Observation 3
2.4.1 Exercise
Observations About the ‘date’ Column
Fewer 2018 articles compared to the rest of the dataset.
Observation 2
Observation 3
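An observation about how articles are spread across years can be checked by counting rows per year. The sketch below assumes the ‘date’ values are ISO-formatted strings (YYYY-MM-DD); the dates listed are hypothetical stand-ins, not rows from the course dataset.

```python
from collections import Counter
from datetime import date

# Hypothetical 'date' column values, assumed to be ISO-formatted strings.
dates = [
    "2016-03-01", "2016-07-15", "2016-11-02",
    "2017-01-20", "2017-06-09", "2017-10-31",
    "2018-05-26",
]

# Parse each string and tally the number of articles per year.
articles_per_year = Counter(date.fromisoformat(d).year for d in dates)

for year in sorted(articles_per_year):
    print(year, articles_per_year[year])
```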
2.7.2 Exercise
Observations About the ‘category’ Column
Observation 1
Observation 2
Observation 3
2.8.1 Exercise
Observations About the short descriptions Columns
Observation 1
Observation 2
Observation 3
EDA & Model Accuracy
Observations Comparing EDA Work Against the LR Model
(See the sections 4.4.3 and on …)
Observation 1
Observation 2
Observation 3