Scatter Plots
Grab a warm-up off the wooden desk and get started! :-)
Goals:
Warm-up #1
What do you notice?
What do you wonder?
Compare Kobe Bryant and LeBron James. Who has the better average points scored per game?
Warm-up #2
What do you notice?
What do you wonder?
Key Concepts
A scatter plot is a __________ that shows the relationship between two _____________ _________. The two data sets are graphed as _________ ______ in a coordinate plane. Scatter plots can show trends in data.
Correlation Coefficient
A measure of how well a regression fits a set of data is called ____________________. The ___________________ __________________ is a value between -1 and 1 that indicates how close the data are to the graph of the regression equation. The closer the correlation coefficient is to 1 or -1, the _______________ the relationship is between the two variables. The closer the correlation coefficient is to zero, the ______________ the relationship is between the two variables. The variable r is used to represent the correlation coefficient.
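To see how r behaves at the ends and the middle of its range, here is a minimal Python sketch that computes the correlation coefficient from paired data. The function name `pearson_r` and all three data sets are made up for illustration; they are not from the lesson.

```python
import math

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only to illustrate the three cases
x = [1, 2, 3, 4, 5]
perfect_positive = [2, 4, 6, 8, 10]   # every point lies on y = 2x
perfect_negative = [10, 8, 6, 4, 2]   # every point lies on y = 12 - 2x
no_linear_trend = [2, 1, 3, 1, 2]     # no upward or downward trend

for label, y in [("positive", perfect_positive),
                 ("negative", perfect_negative),
                 ("none", no_linear_trend)]:
    print(label, round(pearson_r(x, y), 4))
```

The first two data sets sit exactly on a line, so r lands at 1 and -1. The third has no linear trend, so r comes out at essentially zero.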
Correlation Spectrum
-1: Perfect Negative Correlation
0: No Correlation
1: Perfect Positive Correlation
Positive, negative, or zero correlation?
Read the statement and decide if you agree.
The number of smartphones sold in the U.S. has increased every year since 2005. The number of flat-screen TVs sold in the U.S. has also increased during the same period.
Therefore, owning a smartphone causes a person to buy a flat-screen TV.
Read the statement and decide if you agree.
Since 2004, the average salary of an NFL football player has increased every year. The average weight of an NFL player has also increased yearly since 2004.
Therefore, the higher salaries cause the players to gain weight.
Read the statement and decide if you agree.
Worldwide, the number of automobiles sold annually has steadily increased since 1920. Gasoline production has also steadily increased since 1920.
Therefore, the increase in the number of automobiles sold caused an increase in the amount of gasoline produced.
Correlation vs. causation
When a change in one variable causes a change in another variable, it is called ___________________. Causation produces a strong correlation between the two variables. The converse is not true. In other words, correlation does not imply causation.
A correlation is a necessary condition for causation, but a correlation is not a ______________ _______________ for causation.
There are two relationships that are often mistaken for causation. Common response occurs when some other variable causes changes in both of the variables being compared. Confounding occurs when the effect of one variable cannot be separated from the effects of other variables, which may be unknown or unobserved.
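Common response can be illustrated with a short Python sketch. Both series below are built from the year alone (plus small made-up wiggles), so neither one is computed from the other, yet their correlation is very strong. All numbers and names here are hypothetical and are not real sales data.

```python
import math

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Years since 2005 act as the shared third variable.
years = list(range(10))

# Hypothetical "units sold" figures: each depends only on the year
# plus a small invented wiggle. Neither series uses the other one.
phone_wiggle = [1, -2, 0, 2, -1, 1, -1, 0, 2, -2]
tv_wiggle = [-1, 1, 0, -1, 1, 0, 1, -1, 0, 0]
phones = [20 + 15 * t + w for t, w in zip(years, phone_wiggle)]
tvs = [5 + 4 * t + w for t, w in zip(years, tv_wiggle)]

r = pearson_r(phones, tvs)
print("r between phones and TVs:", round(r, 3))
```

A value of r close to 1 here reflects only that both quantities grew over the same years. It says nothing about one causing the other.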
Using a Line of Fit to Model Data
Sketch an estimated linear regression for each scatterplot
TI-84 Calculator Steps
2nd → 0 → DiagnosticOn
STAT → EDIT → Enter L1 & L2
Clear: Arrow UP → Clear → Enter
STAT → CALC → 4:LinReg(ax+b)
Notice: the coefficient of determination is also shown!
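For anyone who wants to check the calculator by hand, the sketch below computes a least-squares line y = ax + b along with r and r^2 in Python. It is a rough equivalent of the regression steps above, not the calculator's own code. The function name `lin_reg` and the sample lists are assumptions made up for illustration.

```python
import math

def lin_reg(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b, r, r_squared)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("cannot fit when one variable is constant")
    a = sxy / sxx                    # slope
    b = mean_y - a * mean_x          # y-intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r, r * r

# Hypothetical lists standing in for L1 and L2 on the calculator
L1 = [1, 2, 3, 4, 5]
L2 = [2, 4, 5, 4, 5]

a, b, r, r2 = lin_reg(L1, L2)
print("a =", round(a, 4))
print("b =", round(b, 4))
print("r^2 =", round(r2, 4))
print("r =", round(r, 4))
```

Entering the same two lists on the calculator should give matching values up to rounding.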
Challenge: What is the correlation of day-month of birthdays in this class?
Day | Month |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
Day | Month |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
Day | Month |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
Calculating correlation using technology
y=ax+b
a=.609633
b=6.11359
r^2=0.4196
r=0.6478
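The output above can be sanity-checked with a few lines of Python: squaring r should give r^2 (up to rounding), and the equation y = ax + b can be used to make predictions. The function name `predict` and the input value 10 are assumptions chosen for illustration; the coefficients are the ones shown on the slide.

```python
# Values copied from the regression output above
a = 0.609633
b = 6.11359
r = 0.6478
r_squared = 0.4196

# r^2 is just r multiplied by itself (rounded to 4 decimal places)
print(round(r * r, 4) == r_squared)

def predict(x):
    """Predicted y-value from the line of fit y = a*x + b."""
    return a * x + b

# Example prediction for a hypothetical input x = 10
print(round(predict(10), 5))
```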
2nd → Catalog “0” → Arrow down → DiagnosticOn
Desmos.com/calculator
Notice: the coefficient of determination is also shown!
Log on to Student.desmos.com
Time for some extra practice?
Ixl.com → Join the Jam!
Printed worksheet.
TI-84 vs. Desmos.com
Tell Me What You Learned Today!
_
Resources
Mod 3 Standards
S.ID.1 Represent data with plots on the real number line (dot plots, histograms, and box plots) in the context of real-world applications using the GAISE model.
S.ID.2 In the context of real-world applications by using the GAISE model, use statistics appropriate to the shape of the data distribution to compare center (median and mean) and spread (mean absolute deviation, interquartile range, and standard deviation) of two or more different data sets.
S.ID.3 In the context of real-world applications by using the GAISE model, interpret differences in shape, center, and spread in the context of the data sets, accounting for possible effects of extreme data points (outliers).
S.ID.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.
S.ID.6 Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.
c. Fit a linear function for a scatterplot that suggests a linear association.
S.ID.7 Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.
S.ID.8 Compute (using technology) and interpret the correlation coefficient of a linear fit.
Note: “The GAISE model of statistical problem solving consists of four steps: Formulating Questions, Collecting Data, Analyzing Data, and Interpreting Results. They also summarize data sets using mean absolute deviation.”