
RPPL Playbook


Purpose & Audience

The RPPL Playbook describes our emerging thinking on the different types of studies RPPL will engage in together. It provides guardrails for Anchor Studies, Shared Micro-Studies, and Contributing Studies, and will form the basis for assessing proposed studies for funding.

This includes an assessment of:

  • study learning objectives
  • rigor of study design and methodology

[Figure: RPPL goals - Study Findings Impact Practice & Field; We Generate Actionable, Generalizable Findings; Build Organizational Capacity for Research; Research is the Way We Do Our Work; Systematize a Self-Sustaining Ecosystem; We Spread the Word & Invite Others In]


Key Terms


Descriptive Analysis

Answers Questions About:

  • Who, what, where, when, and to what extent.
    • Identifies and describes averages, trends, and variation
  • Correlation
    • Examines whether programs that use some practices/behaviors tend to have better outcomes of interest, on average

Contribution:

  • Identify phenomena or patterns that have not previously been recognized and that are relevant for a pressing question (e.g., identifies need, describes common practice)
  • Raise hypotheses about the mechanisms behind causal relationships
  • Can support the creation of new measures of key phenomena

Methods: Qualitative methods such as case studies, interviews + thematic analyses, ethnography; quantitative methods such as surveys, correlational analysis of administrative data
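To make the correlational piece concrete, here is a minimal Python sketch of the kind of analysis described above. The column names and data are hypothetical (e.g., `coaching_hours` as the practice of interest); this is an illustration, not a prescribed method.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical program-level data: a PL practice and an outcome of interest.
df = pd.DataFrame({
    "coaching_hours": [2, 5, 1, 8, 4, 6, 0, 7],
    "teacher_outcome": [3.0, 3.4, 2.8, 3.9, 3.2, 3.5, 2.7, 3.8],
})

r, p = pearsonr(df["coaching_hours"], df["teacher_outcome"])
print(f"r = {r:.2f}, p = {p:.3f}")  # describes an association, not causation
```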


Causal Analysis

Answers Questions About:

  • Causation:
    • The impact of PL (as a package) on an outcome
    • The impact of a new PL practice on an outcome

Contribution:

  • To know what works for reaching a specific goal
  • To know what caused a troubling outcome so that it can be stopped

Methods: Whether we can make a causal claim depends on research design. Robust designs include experiments and natural experiments where individuals are assigned to different interventions/treatments based on things outside their control.


Anchor Studies (RPPL Term)

Anchor Studies are major cross-organizational studies designed to gain deeper insight on a key topic of interest in the RPPL learning agenda.

Note: An Anchor Study does not require RPPL organizations to find and develop new partners. (We understand that this is time intensive and limiting, so ideally, your organization can leverage existing partnerships). However, an Anchor Study typically does require a change in standard practice (e.g., developing a second version of a PL or tweaking a program to align with research questions).

Link to Framework


Micro-Studies (RPPL Term)

Shared Micro-Studies engage multiple organizations in addressing related questions on a single topic, with short-term outcomes that can provide more rapid evidence. A set of small trials can collectively build a powerful knowledge base about a topic.

Shared Micro-Studies are composed of multiple trials.

  • A trial is defined by the randomization of units into different treatment groups, with each treatment group getting a (somewhat) different version of the intervention.
  • Within a single organization, multiple trials may be completed; if so, trials may vary the same dimension or they may vary different dimensions.
  • Collectively across organizations, the expectation is that we will meaningfully learn about 2+ dimensions by having multiple organizations support 4+ trials total.

Where needed, this Playbook will clarify whether guidelines apply to Shared Micro-Studies collectively or to individual trials.

Link to Framework


Contributing Studies (RPPL Term)

Contributing Studies are organization-driven studies that take on a question tied to the learning agenda and of interest both to the organization and to the broader RPPL membership.

Note: Contributing Studies do not need to be new in scope for RPPL organizations. You can definitely receive funding for existing work! RPPL aims to help each organization develop its capacity for high quality research and will meet each organization where it is in its development.

Link to Framework


Supporting Studies (RPPL Term)

RPPL funds three types of studies (Anchor Studies, Shared Micro-Studies, and Contributing Studies; see framework) using the guardrails in this RPPL Playbook. However:

  • Organizations will have many other studies launching or ongoing that are not funded by RPPL, which we’ll refer to as Supporting Studies.

  • Within RPPL, Supporting Studies may still be a part of our collective shared learning, eligible for technical assistance, and/or valuable in building knowledge aligned to the learning agenda and organizational capacity for high quality research.

Often, Supporting Studies may serve as the foundation for future RPPL funded studies (e.g., may grow into a Shared Micro Study through participation by more organizations, but may not be large enough to receive funding as a Contributing Study on its own).


Study Selection Guardrails


Framing

The guardrails in this RPPL Playbook are ones we’ll build towards together over time. Year over year, we will build our collective capacity for high-quality research within RPPL organizations.

The nature of research is that we need to plant many seeds, only some of which will sprout into findings that provide actionable evidence and move the field forward.

We do not intend for these guardrails to be limiting, but instead to articulate the bar for alignment and quality. They set the best conditions for sprouting studies that provide usable findings.


Part 1: Guardrails for Study Learning Objectives

This section contains guardrails around:

  • Alignment to Learning Agenda
  • Value of Contribution to PL Organizations
  • Value of Contribution to Literature

These are to be used as the first screen for potential RPPL study ideas.


Learning Objectives

  • Anchor Studies - Build generalizable knowledge for the field AND respond to organizational dilemmas

  • Shared Micro-Studies - Build short-term evidence about questions that both the field and organizations are facing; surface areas for further study

  • Contributing Studies [Note 3] - Help organizations build capacity for high-quality research, OR address questions critical to the organization, OR address questions critical to the field

Screening criteria by study type:

  • Anchor Studies: Alignment to Learning Agenda [Note 1] + Valuable Contribution to Orgs + Valuable Contribution to Literature
  • Shared Micro-Studies: Alignment to Learning Agenda [Note 1] + Valuable Contribution to Orgs + Valuable Contribution to Literature
  • Contributing Studies: Alignment to Learning Agenda [Notes 1, 2] + (Valuable Contribution to Orgs OR Valuable Contribution to Literature)

Note 1. The Learning Agenda is centered around Teacher Professional Learning. Research questions on other enabling conditions (such as school leadership) to support teacher outcomes are also considered aligned to the Learning Agenda.

Note 2. Depending on organization capacity for high quality research, a contributing study may support development of this capacity, more so than meaningfully contribute to the Learning Agenda (i.e., study design may not be rigorous enough to support causal inferences).

Note 3. Each Contributing Study should have a theory about how it can progress to a more rigorous study or relate to a larger RPPL theme of studies (e.g., another rigorous study has been conducted and a Contributing Study supports implementation of those findings).


Part 2: Guardrails for Study Rigor

This section contains guardrails around:

  • Study Design
  • Study Methodology

which are evaluated in conjunction with each other.


Rigor of Study Design

Experiments: support robust causal inferences
  (5) Random Assignment; RCT

Strong Quasi-Experimental Designs: may/could support causal inferences
  (4) Comparison Group Design with Exogenous Assignment
  (3) Well-Matched Comparison Group Design

Descriptive Studies: do not support causal inferences
  (2) Designs with Comparison Groups of Unknown Match
  (1) Designs without Comparison Groups


Rigor of Study Design

[Figure: the same five-level scale shown above, annotated with markers for the minimum rigor required for Contributing Studies, Micro-Studies, and Anchor Studies]

See Appendix for Examples

RPPL also plans to produce guardrails for qualitative methods


Rigor of the Methodology

Evaluated in conjunction with each other to determine rigor*:

  1. Treatment Contrast: (5) Strong to (1) Weak
  2. Sample Size: (5) Large to (1) Small
  3. Measurement Type: (5) High to (1) Low
  4. Measurement Quality: (5) High to (1) Low
  5. Attrition: (5) Low to (1) High

*All of these are intersecting (e.g., sample size, treatment contrast, measurement quality), but we consider them individually for simplicity.


Rigor of the Methodology

Evaluated in conjunction with each other to determine rigor:

  1. Treatment Contrast

How different are the two treatment conditions we are testing? The larger the difference, the easier it will be to detect effects.

Scale: (5) Strong to (1) Weak. This criterion is a subjective assessment.

If treatment contrast is weak but sample size is large and measurement quality is high, this methodology could still support causal inferences in a high quality natural experiment.


Rigor of the Methodology

Evaluated in conjunction with each other to determine rigor:

2. Sample Size

Larger sample sizes ensure that the differences we detect between groups are not just due to chance.

There are important considerations about the level of randomization (school, team, teacher) that vary by context and study design.

Scale: (5) Large to (1) Small. Sufficient sample size is needed to support the statistical power to detect treatment impacts.

  • Minimum for Anchor Studies: 75 randomizable units per org
  • Minimum for Micro-Studies: 50 randomizable units per trial
  • Anticipated for Contributing Studies: 75-100 randomizable units
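As a rough illustration of how sample size relates to statistical power, here is a minimal sketch using statsmodels. It assumes a simple two-arm, teacher-level randomization and an illustrative minimum detectable effect size (Cohen's d = 0.5); a clustered randomization (school or team level) would require a design-effect adjustment.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical inputs: minimum detectable effect of d = 0.5,
# alpha = 0.05, and 80% power for a two-sided test.
n_per_arm = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Teachers needed per arm: {n_per_arm:.0f}")  # roughly 64 per arm
```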


Rigor of the Methodology

Evaluated in conjunction with each other to determine rigor:

3. Measurement Type

More proximal measures that are easier to collect may be less expensive and more likely to show effects, but more distal measures that reflect practice and impact on students are likely more meaningful.

Scale:
  (5) High: Student outcomes; observational measures of instructional practice (minimum for Anchor Studies; preferred for Micro-Studies)
  (3) Medium: Survey measures of instructional practice or mindsets/attitudes (preferred for Contributing Studies)
  (1) Low: Measures not aligned to PL desired outcomes


Rigor of the Methodology

Evaluated in conjunction with each other to determine rigor:

4. Measurement Quality

Are we measuring what we want to be measuring? (reliability & validity)

Note: Reliability and validity can be enhanced by either choosing existing measures or piloting measures prior to use. Scores should show evidence of face validity (e.g., experts agree they will measure intended construct) or construct validity (e.g., via cognitive interviews, factor analyses, correlational analyses). With smaller sample sizes, more reliable measures are needed.

Common measures across organizations, as applicable, would enhance research.

Scale:
  (5) High: Scores show evidence of reliability (e.g., test-retest correlation of 0.7 or internal reliability of 0.85), with strong face and construct validity
  (3) Medium: Score reliability of 0.7-0.85; scores have strong face validity and some evidence of construct validity (minimum for all studies)
  (1) Low: Measures mostly noise (e.g., random answers), or measures a construct other than intended
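For teams piloting their own instruments, here is a minimal sketch of an internal-reliability check (Cronbach's alpha) against the bands above. The pilot data and scale size are hypothetical, and this is one of several reliability checks a study might run.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: a respondents-by-items matrix of item scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Hypothetical pilot: 30 teachers answering a 6-item survey scale (1-5).
rng = np.random.default_rng(0)
pilot = rng.integers(1, 6, size=(30, 6)).astype(float)

# Purely random answers should score near zero; a usable scale should
# reach the 0.7-0.85 band or higher before being fielded.
print(f"Cronbach's alpha: {cronbach_alpha(pilot):.2f}")
```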


Rigor of the Methodology

Evaluated in conjunction with each other to determine rigor:

5. Attrition

Do we have the full picture of the outcome? High rates of attrition, and particularly differential attrition between the different treatments, can bias study results. (Differential attrition is the difference in attrition rates between the treatment groups).

Note: This cannot be fully known until after the study is complete. In study design, it should be demonstrated that there is high confidence that outcome measures will be collected for 100% of randomized units.

Scale:
  (5) Low: <20% overall and <5 percentage point differential
  (3) Medium: <25% overall and <7 percentage point differential (minimum for all studies)
  (1) High: >30% overall and >10 percentage point differential

Source: What Works Clearinghouse Standards Handbook (p. 11)
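Here is a minimal sketch of how overall and differential attrition might be computed and compared against these thresholds; the counts are hypothetical.

```python
# Hypothetical counts of units randomized and with outcomes collected.
randomized = {"treatment": 60, "control": 60}
completed = {"treatment": 51, "control": 55}

rates = {g: 1 - completed[g] / randomized[g] for g in randomized}
overall = 1 - sum(completed.values()) / sum(randomized.values())
differential = abs(rates["treatment"] - rates["control"])

print(f"Overall attrition: {overall:.1%}")         # 11.7% -> under 20%
print(f"Differential: {differential:.1%} points")  # 6.7 points -> Medium band
```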


Appendix


1 - Designs without Comparison Groups

Example: Measure outcomes of teachers before and after they participate in a PL and see whether the PL improved outcomes; survey teachers in a district about their experiences with curriculum adoption; interview teachers who participated in a PL

Opportunities: Can provide helpful descriptive evidence: interesting patterns across the sample, needs, feedback to inform program design

Constraints: Not able to make any claims about program impact or program effectiveness – without a comparison group, we do not know whether these teachers would have improved without the PL


2 - Designs with Comparison Groups of Unknown Match

Example: Compare outcomes for teachers who participate in a PL to other teachers in the school or district

Opportunities: Quite limited. When coupled with rich data about who participates, can potentially yield some hypotheses about whether a program is effective.

Constraints: Not able to make any claims about program impact or program effectiveness – teachers who participate and do not participate in the PL differ in unknown ways, so we do not know whether the differences in outcomes come from the PL.


3 - Well-Matched Comparison Group Design

Example: Compare outcomes for teachers who participate in a PL with teachers who did not but who are similar on many different characteristics (i.e., well-matched). Comparison can be done via matching or robust regression adjustment.

Opportunities: Surfaces robust hypotheses about program impact for further testing.

Constraints: Even though teachers are similar in ways that we can observe, teachers who participate and do not participate in the PL may well be different in unobserved ways. For example, teachers more motivated to learn might be more likely to participate. As a result, we are not certain that the differences in outcomes come from the PL.
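As one illustration of the regression-adjustment route mentioned above, here is a minimal Python sketch with a hypothetical DataFrame. The column names (participated, years_experience, prior_score) are illustrative, and the adjusted difference is still not a causal estimate.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: PL participation plus observable characteristics.
df = pd.DataFrame({
    "outcome": [3.1, 2.8, 3.5, 2.9, 3.6, 3.0, 3.4, 2.7],
    "participated": [1, 0, 1, 0, 1, 0, 1, 0],
    "years_experience": [5, 6, 3, 4, 8, 7, 2, 3],
    "prior_score": [2.9, 2.7, 3.0, 2.8, 3.2, 3.1, 2.9, 2.6],
})

model = smf.ols(
    "outcome ~ participated + years_experience + prior_score", data=df
).fit()

# An adjusted difference, not a causal effect: unobserved differences
# (e.g., motivation to learn) may remain between the two groups.
print(model.params["participated"])
```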


4 - Comparison Group Design with Exogenous Assignment

Example: Districts fund PL opportunities in the lowest-performing schools, so for schools near the cutoff participation in the PL is essentially “as good as random”

Opportunities: Rigorous studies with exogenous assignment can lead to causal inferences about program impact

Constraints: Hard to find opportunities for true exogenous assignment

See an additional regression discontinuity example in this guide for states ('Example 2: Summer PD Academy on Differentiated Instruction', beginning on PDF p. 22).
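For intuition, here is a minimal regression discontinuity sketch on simulated data, assuming a hypothetical performance cutoff that determines PL funding. A real analysis would restrict to observations near the cutoff and check for manipulation of the assignment variable.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
score = rng.uniform(-10, 10, 200)    # school performance, centered at cutoff
treated = (score < 0).astype(float)  # lowest performers receive the PL
outcome = 50 + 0.8 * score + 4.0 * treated + rng.normal(0, 2, 200)

# Fit separate slopes on each side of the cutoff; the coefficient on
# `treated` estimates the jump in outcomes at the cutoff (~4 here).
X = sm.add_constant(np.column_stack([treated, score, treated * score]))
fit = sm.OLS(outcome, X).fit()
print(f"Estimated effect at cutoff: {fit.params[1]:.2f}")
```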

Source: Perez-Johnson, Irma, Kirk Walters, Michael Puma, and others. "Evaluating ARRA Programs and Other Educational Reforms: A Guide for States." Resource document developed jointly by the American Institutes for Research and Mathematica Policy Research, Inc., April 2011.


5 - Random Assignment (RCT)

Example: Schools, grade-level teams, or individual teachers are randomly assigned to receive one of two different PL opportunities
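Here is a minimal sketch of what the random assignment itself might look like, using a hypothetical list of teacher units and a fixed seed so the assignment is reproducible.

```python
import random

# Hypothetical list of 50 randomizable units (could be schools or teams).
units = [f"teacher_{i:02d}" for i in range(1, 51)]

random.seed(42)  # fixed seed so the assignment is reproducible and auditable
random.shuffle(units)

half = len(units) // 2
assignment = {u: "PL version A" for u in units[:half]}
assignment.update({u: "PL version B" for u in units[half:]})
print(sum(v == "PL version A" for v in assignment.values()), "units in version A")
```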

Opportunities: Clear causal inferences about program impacts

Constraints: Randomization requires additional buy-in