1 of 36

Identification, Exploration, and Remediation: Can Teachers Predict Common Wrong Answers?*

Ashish Gurung1, Sami Baral1, Kirk P. Vanacore1, Andrew A. McReynolds1, Hilary Kreisberg2, Anthony F. Botelho3, Stacy T. Shaw1, Neil T. Heffernan1

WPI1, Lesley2, UF3

*Funded in part by NSF CSSI grant awarded to Neil T. Heffernan and Ryan S. Baker

2 of 36

Common Wrong Answers (CWAs)

Defining Common Wrong Answers:

  • Incorrect attempts that commonly occur are considered CWAs
  • Bugs (VanLehn [2, 4], Brown [1], and Sison [5])
  • Diagnostic model (Brown et al. [1], Selent [8])
  • Repair Theory (Brown and VanLehn [3])
  • Feedback can be beneficial (Narciss [6, 7])

Important to note: Proactive approach when predicting the CWAs

3 of 36

CWAs

Example of CWAs

  • How are CWAs established?
    analyze -> create diagnostic model -> generate templates -> generate feedback
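The pipeline above can be illustrated with a minimal Python sketch for the Order of Operations activity. Everything in it is an assumption made for illustration: the two buggy rules, the feedback wording, and the names `evaluate`, `BUGGY_RULES`, and `diagnose` are hypothetical and are not taken from the study's diagnostic model or CWAF templates. It handles only non-negative integers and the four basic operators, with no parentheses.

```python
import re
from typing import Optional


def apply_op(op: str, a: float, b: float) -> float:
    """Apply one binary arithmetic operator."""
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a / b


def evaluate(expression: str, passes: list) -> float:
    """Reduce the expression in several passes.

    Each pass is a set of operators that are applied left to right;
    operators not in the set are carried over to the next pass.
    """
    tokens = [float(t) if t.isdigit() else t
              for t in re.findall(r"\d+|[+\-*/]", expression)]
    for ops in passes:
        reduced = [tokens[0]]
        for op, operand in zip(tokens[1::2], tokens[2::2]):
            if op in ops:
                reduced[-1] = apply_op(op, reduced[-1], operand)
            else:
                reduced.extend([op, operand])
        tokens = reduced
    return tokens[0]


# Standard precedence: multiply/divide first, then add/subtract.
CORRECT = [{"*", "/"}, {"+", "-"}]

# Hypothetical diagnostic model: buggy procedure -> feedback template.
BUGGY_RULES = {
    "left_to_right": (
        [{"+", "-", "*", "/"}],
        "It looks like you worked strictly left to right. "
        "Multiply and divide before you add and subtract.",
    ),
    "add_before_multiply": (
        [{"+", "-"}, {"*", "/"}],
        "It looks like you added or subtracted first. "
        "Multiplication and division come first.",
    ),
}


def diagnose(expression: str, student_answer: float) -> Optional[str]:
    """Return feedback if the answer matches a known buggy rule, else None."""
    if student_answer == evaluate(expression, CORRECT):
        return None  # correct answer, no CWA feedback needed
    for passes, feedback in BUGGY_RULES.values():
        if evaluate(expression, passes) == student_answer:
            return feedback
    return None  # incorrect, but not a predicted CWA
```

For the expression `8 - 2 * 3 + 1`, the correct answer is 3, the left-to-right rule yields 19, and the add-first rule yields 24, so only the answers 19 and 24 would trigger feedback in this sketch.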

Automated & Immediate

4 of 36

Two Mastery-Based Activities

Order of Operations (Experiment 1)

2-Step Equations (Experiment 2)

5 of 36

Research Questions

  1. Can teachers and instructional designers identify CWAs on math problems?
  2. Does receiving Common Wrong Answer Feedback (CWAFs) improve short-term learning outcomes?
  3. Do high- and low-performing students benefit differently from CWAFs?

6 of 36

RQ1 Proactively Identified CWAs

Table 1: Total CWAFs generated across two problem sets.

  • Teachers and activity designers analyzed the problems
  • Identified the mechanisms that can result in the CWAs
  • Generated the CWAFs to help address the underlying misconception, gap in knowledge, or other potential learner behavior (e.g., slips and guesses)

7 of 36

Experimental Design

Fig: A/B test design

  • Students were asked to work on mastery-based assignments
  • N-CCR design (N=3)
  • Daily limit = 10
  • Potential outcomes
    • Mastery
    • Wheel-Spinning [9]
    • Dropout
  • Data were collected across 9 academic years, from 2013-14 to summer 2022
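The mastery rule and the three outcomes listed above can be sketched in Python. The reading below is an assumption: it takes N-CCR to mean N consecutive correct responses, treats reaching the daily limit without mastery as wheel-spinning, and treats stopping earlier without mastery as dropout. The function name `classify_outcome` and its parameters are hypothetical, and the study's exact operational definitions may differ.

```python
from typing import Sequence


def classify_outcome(responses: Sequence[bool],
                     n_consecutive: int = 3,
                     daily_limit: int = 10) -> str:
    """Classify one student's attempt sequence on a mastery-based assignment.

    responses[i] is True if the i-th problem was answered correctly.
    Assumed rules (not confirmed by the slides):
      - mastery: n_consecutive correct answers in a row within the daily limit
      - wheel-spinning: daily limit reached without mastery
      - dropout: student stopped before the limit without mastery
    """
    streak = 0
    for correct in responses[:daily_limit]:
        streak = streak + 1 if correct else 0
        if streak >= n_consecutive:
            return "mastery"
    if len(responses) >= daily_limit:
        return "wheel-spinning"
    return "dropout"
```

Under these assumptions, a student who answers three problems correctly in a row is classified as reaching mastery, one who answers ten problems without such a streak is wheel-spinning, and one who stops after two problems has dropped out.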

8 of 36

Experimental Descriptives

Table 4: Descriptives of the A/B test exploring the effectiveness of CWAs in two activities.

9 of 36

RQ1 Observed CWAs (N ≥ 10)

  • Most of the CWAs were identified by the teachers if we consider 10 to be the cutoff point.

Table 3: Observed CWAs when any incorrect attempt with n ≥ 10 is considered a CWA.
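The cutoff rule in the caption can be sketched as a frequency count over logged incorrect attempts, followed by a comparison with the answers teachers predicted. The function names, the answer strings, and the counts below are all made up for illustration; they are not the study's data or code.

```python
from collections import Counter


def observed_cwas(incorrect_answers, min_count=10):
    """Keep incorrect answers that occur at least min_count times."""
    counts = Counter(incorrect_answers)
    return {answer: n for answer, n in counts.items() if n >= min_count}


def split_by_prediction(observed, predicted):
    """Split observed CWAs into those teachers predicted and those they missed."""
    identified = {a: n for a, n in observed.items() if a in predicted}
    unidentified = {a: n for a, n in observed.items() if a not in predicted}
    return identified, unidentified


# Made-up incorrect-attempt log for a single problem.
log = ["20"] * 12 + ["24"] * 10 + ["7"] * 4
teacher_predicted = {"20"}  # hypothetical proactive prediction

observed = observed_cwas(log, min_count=10)
identified, unidentified = split_by_prediction(observed, teacher_predicted)
```

Lowering `min_count` admits rarer wrong answers as CWAs, so the share of observed CWAs that were predicted in advance depends on the chosen cutoff.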

10 of 36

RQ1 Observed CWAs (N ≥ 10)

Let us look at the frequency of individual CWAs that occur 10 or more times.

127 incorrect/problem

11 of 36

RQ2 Effects of CWAFs on Learning Outcomes

12 of 36

Experimental Design

Two mastery-based activities:

  • Order of Operations
  • 2-Step Equations

Show of Hands:

Do you think CWAFs are a good idea?

13 of 36

RQ2 Effects on Mastery

Fig: Comparison of student performance in the treatment and control (origin) for both activities

logit(mastery ~ Z + (1|teacher))
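Written out, a model formula of this form is usually read as a logistic regression with a random intercept per teacher. The notation below (student index i, teacher index j, coefficients β, variance σ) is assumed for illustration and does not appear on the slides.

```latex
\[
\operatorname{logit}\bigl(\Pr(\text{mastery}_{ij} = 1)\bigr)
  = \beta_0 + \beta_1 Z_{ij} + u_j,
\qquad
u_j \sim \mathcal{N}\bigl(0, \sigma_u^2\bigr)
\]
```

Here $Z_{ij}$ is 1 for a student assigned to treatment and 0 for control, $u_j$ is the teacher-level intercept, and $\exp(\beta_1)$ is the odds ratio of mastery for treatment relative to control.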

14 of 36

RQ2 Effects on Mastery

Fig: Comparison of student performance across conditions for Two-Step Equations

15 of 36

RQ2 Effects on Mastery

Fig: Comparison of student performance across conditions for Order of Operations

16 of 36

RQ2 Effects on Mastery

Fig: Comparison of student performance across conditions for both activities

17 of 36

RQ2 Effects on Wheel-Spinning

Fig: Comparison of student performance across conditions for both activities

logit(wheel-spinning ~ Z + (1|teacher))

18 of 36

RQ2 Effects on Wheel-Spinning

Fig: Comparison of student performance across conditions for Two-Step Equations

19 of 36

RQ2 Effects on Wheel-Spinning

Fig: Comparison of student performance across conditions for Order of Operations

20 of 36

RQ2 Effects on Wheel-Spinning

Fig: Comparison of student performance in the treatment and control (origin)

21 of 36

RQ3 Exploring Effects of CWAFs on High- and Low-Performing Students

22 of 36

RQ3 Exploring Potential Heterogeneous Treatment Effects (HTE) of CWAFs

Table 7: logit(mastery ~ Z * prior performance + (1|teacher)) | Z is control or treatment assignment
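In formula notation, `Z * prior performance` conventionally expands to both main effects plus their interaction. A sketch of the resulting model is given below; the symbols (prior performance $P_{ij}$, coefficients β, teacher intercept $u_j$) are assumed for illustration and are not taken from the slides.

```latex
\[
\operatorname{logit}\bigl(\Pr(\text{mastery}_{ij} = 1)\bigr)
  = \beta_0 + \beta_1 Z_{ij} + \beta_2 P_{ij} + \beta_3 Z_{ij} P_{ij} + u_j,
\qquad
u_j \sim \mathcal{N}\bigl(0, \sigma_u^2\bigr)
\]
```

Under this reading, the treatment effect on the log-odds of mastery for a student with prior performance $P$ is $\beta_1 + \beta_3 P$, so a nonzero $\beta_3$ indicates that the effect of CWAFs differs between high- and low-performing students.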

23 of 36

RQ3 Exploring Potential HTE of CWAFs

Figure: Exploring HTE of receiving CWAFs on students with different prior performance when working on the Order of Operations activity

24 of 36

Limitations & Future Work

  • Automated nature of CWAFs
    • Learner Autonomy
    • Prior performance or Knowledge Tracing based heuristic
  • Estimation of Student Effort on the Feedback using response time decomposition [10]
  • Causal Forest for Heterogeneous Treatment Effects (HTE) [11, 12]
    • Further expand on the treatment effects of CWAFs
    • Gain insight into various factors that influenced the effectiveness of CWAFs
  • Analysis of CWAs and generation of CWAFs using Large Language Models.

Students can be more sensitive to Common Wrong Answer Feedback than to other feedback (hints and explanations), which can lead to unforeseen learning outcomes.

25 of 36

Takeaways

  • Proactively predicting CWAs and generating CWAFs is not the optimal approach
  • Possible alternative: use historical data to establish the more common CWAs
    • Data Driven
    • Reduces total CWAs
    • Ranks CWAs for prioritization
  • CWAFs can hurt students
    • We really need to be careful when designing CWAFs
  • Additional work is required to establish the various factors that influenced the negative effect of CWAFs on student learning outcomes.
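The data-driven alternative mentioned in the takeaways (ranking CWAs from historical data so that feedback can be prioritized) could look roughly like the sketch below. The function name `rank_cwas`, the `coverage` parameter, and the stopping rule are assumptions introduced for illustration, not a method described in the talk.

```python
from collections import Counter


def rank_cwas(incorrect_answers, coverage=0.8):
    """Rank wrong answers by frequency until a target share is covered.

    Returns (answer, count, cumulative_share) tuples, most frequent first,
    stopping once the selected answers account for at least `coverage`
    of all logged incorrect attempts.
    """
    counts = Counter(incorrect_answers)
    total = sum(counts.values())
    ranked = []
    covered = 0
    for answer, n in counts.most_common():
        if covered / total >= coverage:
            break
        covered += n
        ranked.append((answer, n, covered / total))
    return ranked


# Made-up historical log of incorrect attempts for one problem.
history = ["20"] * 6 + ["14"] * 3 + ["7"] * 1
priorities = rank_cwas(history, coverage=0.8)
```

With a ranking like this, feedback would be written first for the few wrong answers that account for most incorrect attempts, which is one way to reduce the total number of CWAs that need CWAFs.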

26 of 36

References

  1. John Seely Brown and Richard R Burton. 1978. Diagnostic models for procedural bugs in basic mathematical skills. Cognitive Science 2, 2 (1978), 155–192.
  2. Kurt VanLehn. 1982. Bugs are not enough: Empirical studies of bugs, impasses and repairs in procedural skills. The Journal of Mathematical Behavior (1982).
  3. John Seely Brown and Kurt VanLehn. 1980. Repair theory: A generative theory of bugs in procedural skills. Cognitive Science 4, 4 (1980), 379–426.
  4. Kurt VanLehn, Stephanie Siler, Charles Murray, Takashi Yamauchi, and William B Baggett. 2003. Why do only some events cause learning during human tutoring? Cognition and Instruction 21, 3 (2003), 209–249.
  5. Raymund Sison and Masamichi Shimura. 1998. Student modeling and machine learning. International Journal of Artificial Intelligence in Education (IJAIED) 9 (1998), 128–158.
  6. Susanne Narciss. 2004. The impact of informative tutoring feedback and self-efficacy on motivation and achievement in concept learning. Experimental Psychology 51, 3 (2004), 214.
  7. Susanne Narciss. 2013. Designing and evaluating tutoring feedback strategies for digital learning. Digital Education Review 23 (2013), 7–26.
  8. Douglas Selent and Neil Heffernan. 2014. Reducing student hint use by creating buggy messages from machine learned incorrect processes. In International Conference on Intelligent Tutoring Systems. Springer, 674–675.
  9. Joseph E Beck and Yue Gong. 2013. Wheel-spinning: Students who fail to master a skill. In International Conference on Artificial Intelligence in Education. Springer, 431–440.
  10. Ashish Gurung, Anthony F Botelho, and Neil T Heffernan. 2021. Examining student effort on help through response time decomposition. In LAK21: 11th International Learning Analytics and Knowledge Conference. 292–301.
  11. Stefan Wager and Susan Athey. 2018. Estimation and inference of heterogeneous treatment effects using random forests. J. Amer. Statist. Assoc. 113, 523 (2018), 1228–1242.
  12. P Richard Hahn, Jared S Murray, and Carlos M Carvalho. 2020. Bayesian regression tree models for causal inference: Regularization, confounding, and heterogeneous effects (with discussion). Bayesian Analysis 15, 3 (2020), 965–1056.

27 of 36

Helpful:

(G1) agree = 27, disagree = 5

(G2) agree = 32, disagree = 1

Supportive:

(G1) agree = 3, disagree = 29

(G2) agree = 38, disagree = 0

28 of 36

29 of 36

Thank You

Project GitHub

Webpage

agurung@wpi.edu

30 of 36

31 of 36

32 of 36

RQ1 Observed CWAs (N ≥ 5)

  • More unidentified CWAs in 2-Step Equations

Table 2: Observed CWAs when any incorrect attempt with n ≥ 5 is considered a CWA.

33 of 36

RQ1 Observed CWAs (N ≥ 5)

Let us look at the frequency of individual CWAs that occur 5 or more times.

34 of 36

RQ2 Effects of CWAFs on Learning Outcomes

Table 5: logit(mastery ~ Z + (1|teacher)) | Z is control or treatment assignment

35 of 36

RQ2 Effects of CWAFs on Learning Outcomes

Table 6: logit(wheel-spinning ~ Z + (1|teacher)) | Z is control or treatment assignment

36 of 36

RQ2 Effects of CWAFs on Learning Outcomes

Fig: Comparison of student performance in the CWAF and the control condition