ECT Lesson Plan: Correlation vs Causation
Lesson plan at a glance...
| In this lesson plan… |
While some patterns may imply causality, they may in fact be unrelated. In this lesson, students will test the strength of a correlation and discern whether or not a law or conclusion can be made based on that correlation. Students will see the threshold commonly accepted for correlating data and test their own assumptions about causation. This lesson will cover the following CT concepts: pattern recognition, pattern generalization, data analysis, and data collection.
Confirm that your computer is on and logged-in | 1 to 3 minutes | |
Confirm that your projector is turned on and is projecting properly | 1 to 4 minutes | |
Confirm that all students’ computers are turned on and logged-in | 3 to 5 minutes | |
Install or navigate to GeoGebra (http://www.geogebra.org/) | 3 to 5 minutes |
20 minutes | |
20 minutes | |
60 to 120 minutes | |
20 to 30 minutes |
Activity Overview: In this activity, students will see if correlation and causation are connected. They will identify situations in which correlation is mistakenly called causation or when correlation implies causation. Students use pattern recognition to find situations in which the data points towards correlation and where the data actually points towards causation. They will use data analysis to distinguish between the two in each situation.
Notes to the Teacher: While correlation does not imply causation, there is a possibility of a connection between the two. It is possible that there is an unknown factor that gives the appearance of causality. One example of this is the placebo effect (http://wikipedia.org/wiki/Placebo) where patients who believe they will get better from taking the medication do in fact get better. This makes it difficult to test how effective the drug actually is. |
Activity: Walk through some examples of where correlation suggests causation (though there may or may not be any):
Ask students the following questions. Q1: Share an example you have heard, or make one up, in which a correlation is taken to imply causation. Q2: In how many of the shared examples is there an actual cause-and-effect relationship? Q3: Flip the examples around. Do the scenarios become more or less probable? Q4: For each of the examples, write a situation (counterexample) that would prove that correlation does not always imply causation. |
Assessment: A1: Answers will vary. A2: Answers will vary. A3:
A4: Answers will vary. |
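The "unknown factor" idea from the Notes to the Teacher can be demonstrated with a short simulation. The following is a minimal sketch, not part of the original lesson: the function names, the noise levels, and the sample size are all assumptions chosen for illustration. A hidden variable drives both x and y, so the two correlate strongly even though neither causes the other.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate_hidden_factor(n=1000, seed=1):
    """Generate x and y that both depend on a hidden factor (hypothetical setup).

    Neither x nor y appears in the formula for the other; they only share
    the hidden value, plus independent random noise.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)               # the unknown factor
        xs.append(hidden + rng.gauss(0, 0.5))  # x depends on hidden only
        ys.append(hidden + rng.gauss(0, 0.5))  # y depends on hidden only
    return xs, ys


xs, ys = simulate_hidden_factor()
print("r between x and y:", round(pearson_r(xs, ys), 2))
```

Students can raise or lower the noise values (0.5 in this sketch) and watch how the strength of the correlation changes while the causal story stays the same.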
Activity Overview: In this activity, students will calculate correlation coefficients for various data and explore the effect varying the data points has on the correlation coefficient. Students use visual pattern recognition to understand how strongly data correlates and data analysis to answer questions about the implications of the data.
Notes to the Teacher: If you would like to calculate the correlation coefficient with your students, see here: http://statistics.about.com/od/Descriptive-Statistics/a/How-To-Calculate-The-Correlation-Coefficient.htm. The calculation of r is a good example of how computers can take over a tedious calculation and free up time for analyzing the data, which is the real goal. This table is subjective and depends on the nature of the data and the method used to collect it. It is reproduced here to give students an opportunity to see how r visually conveys the strength of correlation. There is a cultural association of negative with bad, and students may confuse negative correlation, which means that y decreases as x increases, with weak correlation, which means that the data points are not "tight" around a line. Weak correlation can be either positive or negative. |
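For teachers who want to show what the software is doing behind the scenes, the calculation of r can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the two small data sets are made up for the example, and the formula used is the standard Pearson definition written with sums of deviations from the mean.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Pearson r: sum of (x - mean_x)(y - mean_y), divided by
    sqrt(sum of (x - mean_x)^2 * sum of (y - mean_y)^2)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: y rises steadily with x, then falls steadily with x.
x_values = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(correlation_coefficient(x_values, rising))   # perfectly positive
print(correlation_coefficient(x_values, falling))  # perfectly negative
```

The second call illustrates the point about sign: the falling data are just as "tight" as the rising data, so the correlation is strong even though it is negative.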
Activity: Read the following aloud to students and have them work through the activity. The physical laws and understanding we have of our universe come as a result of analyzing data. Correlation allows us to see how strongly two variables are connected to each other. As mentioned above, this does not necessarily mean that one causes the other. The actual calculation for Correlation (r) is a bit tedious and requires an understanding of other statistical concepts, such as Standard Deviation and summation. In this activity, we will use GeoGebra and the slope-intercept equation for a line to aid us in determining the strength of the relationship between two variables.
Source: Wikipedia (http://wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient) Q1: Create different values for r by moving the points on the graph.
Q2: Visually, what determines how close r is to -1 or 1? Q3: The image below is referred to as Anscombe’s quartet (http://wikipedia.org/wiki/Anscombe%27s_quartet). The four graphs have the same value for r, and yet the data clearly say something different in each graph. How is it possible that these graphs have the same r? Q4: What is a critical step to avoid data sending conflicting messages? |
Assessment: A1: Answers may vary (examples below):
A2: The direction of the data (negative or positive) and how closely the data points cluster around a straight line.
|
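The claim in Q3 can be checked directly. The sketch below computes r for the four data sets of Anscombe's quartet, using the published values as listed on the Wikipedia page linked above; the function and variable names are choices made for this example. All four results round to 0.82, even though the scatterplots look nothing alike.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Data sets I, II and III share the same x values.
x_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x_4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (x_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x_4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (xs, ys) in quartet.items():
    print(name, round(pearson_r(xs, ys), 2))
```

Because the single number r hides these differences, plotting the data before interpreting r is a natural discussion point for Q4.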
Activity Overview: In this activity, students will explore examples of related variables by choosing and conducting experiments in order to predict and calculate the correlation between the variables. Students use data collection to record data from their chosen experiments, pattern generalization to predict a correlation coefficient and data analysis to calculate a correlation coefficient.
Activity: Have students work through the following activity. Q1: Before you begin to collect data, predict whether you expect the correlation between your two variables to be strong or weak, and positive or negative.
Example:
Q2: What type of data (continuous or discrete) can be visualized in a scatterplot? Q3: Compare your results with your assumptions.
|
Assessment: A1: Answers will vary. A2: Discrete. A3: Answers will vary. |
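Once students have collected their measurements, the comparison in Q3 can be made with a short script instead of GeoGebra. This is a sketch under stated assumptions: the function name, the predicted value, and the sample measurements are all hypothetical placeholders, and the line of best fit is the ordinary least-squares line in slope-intercept form (y = slope * x + intercept).

```python
from math import sqrt


def fit_line_and_r(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical placeholders: replace with the class's own prediction and data.
predicted_r = 0.9
measured_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measured_y = [2.9, 5.2, 6.8, 9.1, 11.2, 12.7]

slope, intercept, r = fit_line_and_r(measured_x, measured_y)
print(f"best-fit line: y = {slope:.2f}x + {intercept:.2f}")
print(f"calculated r = {r:.2f}, predicted r = {predicted_r:.2f}")
print(f"difference = {abs(r - predicted_r):.2f}")
```

The slope shows how much y changes for each unit of x, while r shows how tightly the points follow that line; the two answer different questions.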
Activity Overview: In this activity, students will either analyze an article’s attempt to explain causation through correlation or practice predicting how closely certain data correlates. They will demonstrate the pattern recognition and data analysis skills practiced in this lesson.
Notes to the Teacher: Students can be assessed on their ability to predict how closely data correlates (see Correlation Assessment example). |
Activity: Have students analyze an article from the additional resources below and figure out if and where there are attempts to imply causation through correlation. Have them find counterexamples that disprove the implication. |
Learning Objectives | Standards |
LO1: Students will compute (using technology) the correlation coefficient of a linear fit. | Common Core CCSS.MATH.CONTENT.HSS.ID.C.9: Distinguish between correlation and causation. Computer Science AUSTRALIA 10.4 (Collecting, managing and analyzing data): Analyse and visualise data to create information and address complex problems; and model processes, entities and their relationships using structured data.
CSTA L3B.CT.9: Analyze data and identify patterns through modeling and simulation. |
LO2: Students will identify the slope of the linear graph as the constant in the relationship y=kx and apply this principle to interpreting graphs constructed from data. | Computer Science AUSTRALIA 10.4 (Collecting, managing and analyzing data)
|
LO3: Students will collect data and determine how strong the correlation is between the variables. | Computer Science CSTA L3B.CT.5: Use data analysis to enhance understanding of complex natural and human systems. |
Term | Definition | For Additional Information |
Correlation | Data that shows a mathematical relationship between two variables (e.g. as the temperature of a system increases, the pressure tends to increase as well). | |
Causation | The implication that A brings about B, not merely that B follows A (e.g. taking this medicine will make me feel better). | http://en.wikipedia.org/wiki/Correlation_does_not_imply_causation |
Concept | Definition | |
Pattern Recognition | Observing patterns and regularities in data | |
Pattern Generalization | Creating models of observed patterns to test predicted outcomes | |
Data Analysis | Making sense of data by finding patterns or developing insights | |
Data Collection | Gathering information | |
Contact info | For more info about Exploring Computational Thinking (ECT), visit the ECT website (g.co/exploringCT) |
Credits | Developed by the Exploring Computational Thinking team at Google and reviewed by K-12 educators from around the world. |
Last updated on | 07/02/2015 |
Copyright info | Except as otherwise noted, the content of this document is licensed under the Creative Commons Attribution 4.0 International License, and code samples are licensed under the Apache 2.0 License. |