What I did | How it went | Things to improve | Outside of class (page numbers are for R version) | Other comments
Day 1: Syllabus and logistics (20-25 minutes); Investigation 0 through part (n); also did one more survey question. Pooling results at the front of the room worked well, then a 5-minute wrap-up on the definition at the end. Didn't do more with probability, but it's there for curious students. Decide whether or not you want them to have access to investigation solutions. Asked them to finish Investigation 0 (enumeration).
Day 2: 5-10 minutes on recap and logistics; 5 min on quiz (definition of probability); Investigation 1.1. Didn't follow Investigation 1.1 too closely, trying to streamline, but pooled coin-tossing results and worked with the applet. HW expects them to use R Markdown, but need to give a quick overview of R first! (c) should be more about how we are going to choose between the two explanations (and not allowed to study more infants). Maybe keep (d) but drop (e). Then did (f) and (h). In one class did a contrast of bar graphs and dotplot but forgot in the other. Maybe stop after (l) and don't get into the p-value yet? Or a better transition on how we are going to measure how unusual it is? Maybe even show them two distributions and have them compare the unusualness of different outcomes to motivate tail-probability/percentile-type thinking. Maybe start with (p) tomorrow?
Read technology detour and study conclusions; PP 1.1; preview Inv 1.2
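A minimal base-R sketch of the tail-probability idea previewed above (counts here are hypothetical, not the class data; in class the applet does this):
n <- 16; observed <- 14            # hypothetical: 14 of 16 choices went one way
dbinom(observed, n, 0.5)           # probability of exactly 14 under a 50/50 model
1 - pbinom(observed - 1, n, 0.5)   # tail probability: 14 or more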
Day 3: Recap of Inv 1.1 (another short quiz) and did Inv 1.2. A bit too much lecture, but had to go through the binomial. Timing OK, including answering a few questions. Jumped to the binomial presentation of Bob and Tim. Needed to get to null and alternative for the HW. Having them explain the R code on HW 1 is probably too much at this stage. Need to give instructions for getting into the R workspace and setting up file associations on home computers. Read p. 20-22, PP 1.2; preview Inv 1.3
Day 4: About 8 min on recap, 10-13 finishing up 1.2, so a good 30 min on 1.3. Most got to the reversing-success-and-failure definition; some were just starting sample size. Good to have a practice day, but they really struggled at the start (good or bad). Most didn't know what was meant by process probability and had a lot of trouble defining pi and St. George's vs. national vs. sample statistic. Needed a better application of the terminology to a previous problem. Really tried to hammer home parameter vs. statistic. Also need to talk about "process" a bit more. Showed the R p-value to see it wasn't exactly zero, and also how it was an exact match when we reversed S/F.
Probably should get them all to start with death as success, but when I went over the first few questions with them, they had many fewer questions later on (not as interactive). St. George's is a good opportunity to talk about whether we want a larger sample or a more recent sample, etc. Probably should do more to talk about spinners vs. coins once we change from .5.
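A tiny base-R check of the reverse success/failure match mentioned above, with made-up numbers (the exact p-value is identical either way; only the tail flips):
n <- 361; k <- 74; p <- 0.15   # hypothetical: 74 deaths in 361 operations, null rate 0.15
1 - pbinom(k - 1, n, p)        # P(X >= 74), counting death as success
pbinom(n - k, n, 1 - p)        # P(Y <= 287), counting survival as success; same value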
Day 5: Went over Sarah from homework; not sure how worthwhile, and it took a while (more like 20 minutes, longer in the 2nd section, not sure why). Introduced iscambinomtest, then talked about the impact of sample size, asking them to tell me about differences in the two pictures (a quick base-R version of that comparison is sketched at the end of this entry). Got good discussion in one class on shape, center, and filled-in-ness. Inv 1.4 and Inv 1.5(a)-(c)


Good to compare distributions on p. 31: in both classes someone brought up the change in vertical axis! Very good opportunity to make the point about being able to compare tail probabilities. Also a good time to talk about how the p-value has already taken sample size into account (forgot to show the ppt with that point in the first class).
With Investigation 1.4, better not to do the 50/50 case and just ask about 3/4 before they see the data. Have a show of hands on whether they think 3/4 is too high or too low and get a mix, as one motivation for a two-sided test.
I'm probably not doing a good enough job of building on the investigations they start overnight.
1.4(d) went pretty well; the second class even offered both approaches (in one section got good discussion on why the two tail probabilities are not the same, on symmetry and when we can assume symmetry, and so when we can multiply by 2).
Good to have them do 1.4(g) and (h) on their own; might have been better to stop here. Went on to CI, pretty much skipping (a) and (b) (maybe even go to power next instead of CI?).
Probably should have done more to have them predict how the p-value will differ and to look at both issues.
PP 1.4: do the kissing-couples majority first as a PP before 1.5? Probably need to do more with process and where the randomness is: in the choice made by each observational unit, rather than in the choice of observational units.
Review discussions on p. 38-40; PP 1.5, preview Inv 1.6
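For the sample-size point above, a hedged base-R comparison using binom.test as a stand-in for iscambinomtest (proportions made up; same sample proportion, two sample sizes):
binom.test(12, 20, p = 0.5, alternative = "greater")$p.value    # phat = 0.6, n = 20
binom.test(120, 200, p = 0.5, alternative = "greater")$p.value  # phat = 0.6, n = 200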
Day 6: Immediately did quiz (how to state hypotheses, impact of sample size). About 8 minutes on quiz, 9 minutes on review, 23 minutes on 1.5, and 10 minutes on 1.6. Recapped terminology for the last 3 minutes. Gave the project 1 assignment.

Review seemed to go OK; had them get into RStudio to do both CIs themselves and then compare them, really emphasizing center and spread. Demo worked pretty well (made them predict how the distribution would change); there is now also an applet slider.
Move slide 11 to after slide 4? Started 1.6 together through (d) and then had them play with the applet. Some only got to around (g). Needed to emphasize following the instructions in the text. PP 1.6A: seems like we don't need (d) and (e) here; it also links to the old version of the power applet. Need to do a better job of transitioning to talking about power, and how it's more about study design than analysis.

Complete Decision Table (p. 44); PP 1.6A (p. 44)
Day 7: Did 20 min overview/recap on 1.6, demonstrating the factors with the applet. Then about 30 min on the probability exploration with a brief recap, collecting Reese's Pieces data at the end. I think having them work through the two-step process in R was helpful. A fair number struggled with the notation in (b), so good to have asked. Could get different answers for sample size because of the discreteness (how far the type I error rate is from .05). In the R inverse binomial, watch prob = 1 (one) vs. prob = l (ell). Really need an example or PP that goes in the opposite direction, or two-sided! Graphs with (c) are good, but maybe need to be easier, back when defining the two types of errors? Will be nice next week when we have more "formulas" to work with (e.g., solving for sample size). R note: many confused prob = 1 with prob = l, and on HW many didn't realize it should be the level of significance. PP 1.6B (some could even be doing 1.6C at this point)
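A minimal base-R version of that two-step power calculation (rejection region, then probability under the alternative), using qbinom/pbinom in place of the iscam inverse-binomial function; all numbers are hypothetical:
n <- 20; pi0 <- 0.5; piA <- 0.75; alpha <- 0.05   # hypothetical upper-tail test at 5%
k <- qbinom(1 - alpha, n, pi0) + 1   # step 1: smallest count with P(X >= k | pi0) <= alpha
1 - pbinom(k - 1, n, pi0)            # achieved type I error (note the discreteness)
1 - pbinom(k - 1, n, piA)            # step 2: power = P(X >= k | piA)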
Day 8: Students gave Reese's counts as they walked in. Did 5 min review and then a 5 min quiz on level of significance and the impact of sample size (or not) on probabilities of errors, with 3 min discussing quiz answers; spent about 25 minutes on Inv 1.7, with 12 minutes hands-on, followed by 10 min of recap. There is a lot of review in this activity, mostly just transitioning to phat instead of X (motivated by wanting to compare distributions on the same scale), but practicing obs unit and variable with a sampling distribution. They were quick to want more samples, and I emphasized that we then have to tell the computer a value for pi. Emphasize how the center is determined by pi and the spread is mostly determined by n. We kind of got to showing that the SD formulas work, but SD is still new to some of them. Gave optional online practice with the normal. Students hadn't really picked up on the "what if" theme; it's less emphasized in earlier material now and probably not needed here. But important to emphasize that the distribution depends on those input values. Told them to finish Inv 1.7 and do Inv 1.8 (for the Monday holiday). Also have the project proposal due next class period. Sure seems like I could do some streamlining in here?
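A short simulation sketch for the phat sampling distribution and the SD formula (pi and n made up; the class used the applet):
p.true <- 0.45; n <- 100
phats <- rbinom(10000, n, p.true) / n    # 10,000 simulated sample proportions
c(mean = mean(phats), sd = sd(phats))    # center near pi, spread near the formula
sqrt(p.true * (1 - p.true) / n)          # SD formula for comparison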
Day 9: Had the Monday holiday and asked them to finish Inv 1.7 and 1.8 before class. Spent 10 minutes reviewing the previous HW assignment (including perils of multiple testing) and then did a blitz on what they have seen before now, how that relates to the normal distribution, features of the normal distribution, and z-scores, for about 15-20 min. Then had them work on 1.9 on their own for about 15 minutes, with 5-10 minutes of wrap-up at the end. Most got to the technology. Again, might be more efficient to have them skip to the theory-based inference applet/iscamonepropztest, but there is maybe a bit of value in having them enter variable, mean, and SD in the normal calculation first? The end of 1.9 is a good time to talk about study design issues and scope of conclusions. Probably too quick for many of them, but I tried to warn them! Good to at least have access to the derivation of the mean and SD formulas. Good to emphasize how SD(phat) decreases with n but SD(X) does not (many misspoke on the HW assignment). Good to highlight some advantages (the SD, the empirical rule, and using the normal distribution to calculate probabilities; the last one is less impressive anymore, but we will see some nice formulas this way, like solving for sample size). Also good to transition to "2SD" as another measure of how unusual an observation is. Good to show them a non-.5 case (ESP) and a large and a small z-value. A bit strange that the PP z-score is the same! Some students are still struggling with obs unit and variable and parameter, but I probably could go a tad faster in here (e.g., does the CLT apply? need the normal for its own sake? need more normal practice?), though maybe not as fast as I did. Remind them to be reading the study conclusions. Review the different technology options for the one-proportion z-test, PP 1.9A (a) and (b)
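Roughly the calculation behind the one-proportion z-test options (applet or iscamonepropztest), as base-R arithmetic with hypothetical counts:
x <- 62; n <- 100; p0 <- 0.5            # hypothetical data and null value
phat <- x / n
z <- (phat - p0) / sqrt(p0 * (1 - p0) / n)
1 - pnorm(z)                            # one-sided (upper-tail) p-value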
Day 10: Talked a bit about their project topics, then recapped 1.9 and the one-proportion z, making sure they could use either technology. Then had them do the exact binomial and talked them through the continuity correction. I think they would have really struggled on their own. Need to check what the applet does when you use .5. Eventually motivated and verified the continuity correction. Did leave them with the binomial being the "best" answer. Then talked them through the first part of 1.10, though I mostly demo'd the critical value; (h)-(j) are probably better for them to answer on their own. Didn't calculate by hand, so (k) and (l) were kind of repetitive. Tried to emphasize how nice it is to know phat will be the exact center, vs. the binomial. Tried to get them thinking about (m) overnight. Quickly talked through (n) and encouraged them to look at that and (o)-(r) overnight. Not sure how much time to spend on critical values, so then not sure how much time to spend on the normal probability calculator applet for its own sake. Some will definitely still struggle with the different z values (z_0 vs. z*), but most seemed to get the (repeated) point of more confidence => wider interval and larger n => narrower interval. Good to make sure they know they can use either the count or the proportion as the observed value in R and the applet. The iscam function for one-proportion z seems OK with non-integer values for doing the continuity correction. Good to leave them thinking about what makes an interval procedure "good" (and hinting that the binomial may not be the better option this time!). For HW: some needed big hints on maximizing the SD formula and even on creating the graph vs. n, and on what is meant by "diminishing returns." Also look at the graph vs. pi? Inv 1.10(n)-(r), PP 1.10d
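A hedged sketch comparing the plain normal approximation, the continuity-corrected version, and the exact binomial (hypothetical data, upper-tail test):
x <- 14; n <- 20; p0 <- 0.5
z.plain <- (x/n - p0) / sqrt(p0 * (1 - p0) / n)
z.cc    <- ((x - 0.5)/n - p0) / sqrt(p0 * (1 - p0) / n)   # shift the count by half
c(no.correction = 1 - pnorm(z.plain),
  with.correction = 1 - pnorm(z.cc),
  exact.binomial = 1 - pbinom(x - 1, n, p0))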
Minute Papers (sec. 2):
Main themes: About 25% (34 students) felt the pace was too fast, about 25% (some the same, some different) wanted more in-class demos with R or were still struggling a bit with R, and about 25% commented that the HW wasn't tied closely enough to in-class problems. About 41% appeared happy with the current course structure/style.
Day 11: Did about 15 min review of continuity correction and the one-proportion z interval, solving for sample size, and comparison to the binomial. Then picked up with the CI simulation applet: did the first bit together and then set them loose on (v)-(y). Brought them back together to discuss observations and the problem with (aa). Spent about 15 min on 1.11: started together and looked at the applet results, left them a few minutes to calculate the intervals themselves, and then compared (including vs. binomial). Timing for the day was very good; might have even been able to squeeze in the first Gettysburg sample at the end?! They seemed to appreciate the applet and definitely played around with it a bit. Some had seen this before, but I really tried to emphasize how we judge one interval METHOD to be better than another. Especially tried to emphasize that increasing n does NOT increase confidence. Had some fun with "plus four" "magically" working, but then discussed the shrinkage toward 0.5. They definitely need to be reminded that this applet is for exploring the idea and not for analyzing data. Good to emphasize that with large samples the difference in the procedures is much smaller, but it is never a bad idea to do plus four. Could work in more on the score interval? PP 1.10(c), but getting ready to focus on studying for exam 1, project 1
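A small base-R sketch of the plus-four interval next to the usual Wald interval (counts made up):
x <- 8; n <- 20; zstar <- qnorm(0.975)
phat <- x / n
phat + c(-1, 1) * zstar * sqrt(phat * (1 - phat) / n)              # Wald interval
ptilde <- (x + 2) / (n + 4)                                        # add 2 successes, 2 failures
ptilde + c(-1, 1) * zstar * sqrt(ptilde * (1 - ptilde) / (n + 4))  # plus-four interval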
Day 12: Spent 20 min on recap and HW 3 comments, then 5 min on quiz. In the first section I went way off script to try to finish 1.12, but it didn't really work. In the second section I stayed much more on script and finished in about the same place (barely finishing 1.12). Had them collect both samples in class and pooled results (they just came to the front and I typed into a spreadsheet). Spent 5 minutes on the Literary Digest. They seemed to get the point that we want the sample means to center around mu. They seemed to appreciate me giving them the sampling frame, though in the past they have acted like they have never heard that term. In the second section, it seemed good to give them a brief contrast of sampling from a process vs. sampling from a population. Seems like the flow could improve a bit, especially trying to get them to tell me how we could judge how well the process does by looking at the sample results (how to decide if there is a tendency to overestimate, how to decide the random method is better). Maybe don't need so much time on types of variables, but good to emphasize obs unit and variable. Need to come back to the point that increasing the sample size doesn't really help with bias. Might change the flow a bit, like where the definitions of parameter and statistic are introduced (and maybe don't ask the "do you know it" question in (f)). PP 1.12B
Day 13: Still trying to cram in a bit too much. Did a bit more discussion of HW 3 and transitioned to the question there that asked about generalizability. Tried to emphasize that you need to consider the variable being measured as well. In the first section I discussed Gallup history and did a brief demo of population size effects in Inv 1.14. In the second section I did less of this before the quiz. After a bit more on Inv 1.14 and going through the "new" CLT, left them about 15 minutes on Inv 1.16 with a 5 min recap at the end. Many got to the p-value but not the confidence interval. Emphasized in (g) that we are willing to generalize and why. They still get bored when I lecture but then complain when I don't and they have to teach themselves; trying hard to do about 20 and 20. It definitely seemed good to have a bit of practice/review today with the normal calculations and the whole setup, including defining the parameter; many are still struggling there. Showed them some excerpts from Gallup.com at the end, including nonsampling errors. Probably have the population size ready for 1.16. Definitely need to work some discussion of sampling errors in there; I did with a recap of some more recent election mis-predictions (e.g., NH primary in 2008). Was going to throw in claimed-to-vote vs. actual vote but ran out of time. Requiring everyone to submit questions on the material and sample exam questions online.
Day 14: Review day
They are definitely most blown away by power (both in calculating it and why it's useful) and the continuity correction (both in calculating it and why it's useful). Also mixing together plus four and the continuity correction. Also wanting "one best answer" among the choices of procedures. Need a more motivating example for power, especially if not getting to sample size calculations. Did tell them the binomial is the best answer for p-values but not necessarily for intervals.
Exam 1: Paying a bit of a price now for not spending more time on finding z* values directly earlier in the course (need t* for PI) and for not having discussed statistical vs. practical significance (Inv 1.17).
Change to R functions
Can now enter a vector of confidence levels (e.g., c(.90, .95)) and then display the collection of intervals in one graph panel.
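For reference, the same idea in plain base R (one made-up dataset, Wald intervals at several confidence levels); I haven't shown the iscam call itself since its argument names may differ:
x <- 12; n <- 40; phat <- x / n
sapply(c(0.90, 0.95, 0.99), function(conf) {
  zstar <- qnorm(1 - (1 - conf) / 2)
  phat + c(-1, 1) * zstar * sqrt(phat * (1 - phat) / n)   # lower, upper endpoints
})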
Day 15: Returned exam (average around .82) and briefly discussed (< 10 min). Then began discussion of Ch. 2. Asked them to raise hands if they watched the Super Bowl, then to raise hands if male or female, and asked whether that information would let us answer the research question of whether more males watched than females. Got them to tell me about conditional proportions and even to discuss the direction of conditioning a bit (5 min). Then started Investigation 2.1 (about 15 min on their own with the simulation). In one class focused on how we would now have two variables on each person (for the Super Bowl), but it is probably better to focus on two populations instead. Didn't mean to, but ended up walking through most of it with them (skipped the segmented bar graph in R). It just seems like they would have really struggled on their own, but it also seems like they completely forget what I tell them when we go through it together. Mostly got them to step me through the simulation process, but the second section couldn't tell me how to get the p-value. Good to plan out the pseudo-code!! Good to emphasize that the interpretation of the p-value has not changed at all. Once we got into the R commands, question (o) threw them off a bit (is it more extreme if it is positive). Looking at all three graphs might be overkill. In (m), also have them look at the two-way table and/or bar graph? Does seem like I should make them responsible through part (r) before the next class period. They were all over (d), but getting them to suggest taking the difference on their own would have been tough; could focus on "only one number to summarize." When we got to the inferential section, many wanted to say 'do a test,' so they really struggled to explain how to do that from scratch. Good to emphasize the "source" of the "random chance" (the sample we ended up with vs. teens' hearing-loss status changing). See Day 16 comments below; this day might have gone much better with a prepared script so they weren't spending so much time typing in commands. Submit project 1; PP 2.1A; Inv 2.2(a)-(c)
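One way the simulation they stepped me through could be scripted, sketched in base R with hypothetical numbers (two independent samples, a common null proportion, difference in sample proportions as the statistic):
n1 <- 1800; n2 <- 1700; p.common <- 0.17    # hypothetical sample sizes and pooled proportion
obs.diff <- 0.02                            # hypothetical observed difference in proportions
sim.diff <- rbinom(10000, n1, p.common)/n1 - rbinom(10000, n2, p.common)/n2
hist(sim.diff)
mean(sim.diff >= obs.diff)                  # estimated one-sided p-value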
Change to R functions
Several of the "test" (inference) functions now return parts of the output (invisibly) so you can access them, e.g., with R Markdown (assign the function call to my.output and then print my.output to see the elements and their names).
Day 16: Recapped the first part of Inv 2.1 and then finished it; didn't quite have time to start 2.2 at all. Gave them an R script that we went through together to create the segmented bar graph (also demo'd Excel and added online instructions) and then did some individual steps of the simulation, showing the new two-way table and segmented bar graph each time (selecting the code and hitting Run repeatedly). I think the visual helped some. Also went through the p-value command piece by piece. Then had the results to look at the normal approximation and SD formulas. Emphasized that a big advantage of the normal approximation is getting the confidence interval. Showed them the two ways to interpret the interval and also had time to demo switching the roles of group 1 and group 2. Was good to also talk about how we would use the simulation results to find a two-sided p-value. I think the R script was helpful and we stayed pretty much together the whole time, so make sure tomorrow they do the z-procedure on their own. In both sections emphasized a table of COUNTS, but in the second section forgot to remind them to round if they start with the sample proportion. Some did want to know a bit more about why we had to switch SDs. PP 2.1B, begin Inv 2.2 and 2.3?
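The normal-approximation interval from that wrap-up, as base-R arithmetic with made-up counts (the class script/applet does the equivalent):
x1 <- 60; n1 <- 200; x2 <- 45; n2 <- 210
p1 <- x1/n1; p2 <- x2/n2
se <- sqrt(p1 * (1 - p1)/n1 + p2 * (1 - p2)/n2)
(p1 - p2) + c(-1, 1) * qnorm(0.975) * se    # 95% CI for the difference in proportions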
Day 17: Recapped two-sample z procedures (TBI applet) and PP 2.1B. Had prepared a quick quiz on CI interpretations but the technology didn't work, so had them discuss with a partner. Then we did Investigation 2.2 and Investigation 2.3. I had them do part (d) on their own. Then we talked through confounding together and went through Inv 2.3 together. Ended with a couple of recent news articles: longer ring finger vs. index finger, and more likely to die from a heart attack if you have it in the hospital vs. before you get to the hospital ("Hospital is no place for heart attack"; "Finger length predicts health and behavior").
Some students are concerned about phat vs. phat1 and phat2 (like in the SD formula and technical conditions) and need lots of repetition on how to interpret the CI. They wouldn't be able to do much with 2.2(c) on their own, but it is good to talk about the different study design, and this week's homework had a couple of questions about independent samples or not. 2.2(e) and 2.3(b) are good ones to have them talk to their partner about first. Good to start to distinguish confounding from generalizability. Might need some better counter-examples of not confounding. 2.2(g) is really about generalizability. In 2.3(d) they seemed to pick up on how we had isolated the EV, but it doesn't really provide a great example of assigning the EV vs. self-selection. Some were thrown by a zero (simulated) p-value on HW because we hadn't done this one yet, so need to make them see that a large difference with decent sample sizes will probably have a pretty small p-value. I think I've been talking way too much this week. PP 2.3 (but be careful because we have only defined experiment, not randomized experiment). I also wonder if Investigation 2.5 is one they could do on their own?
Day 18: A very strange day; essentially did a short quiz (PP 2.3), Investigation 2.4 (on their own for a bit in the first section, 24 min), Investigation 2.5 (talked through together, 14 min), and (a)-(d) of Investigation 2.6 (on their own in the first section) (5+ min). In the first section had them turn in a 5-minute attempt at (f). Good to emphasize the table and link to studies we had seen (looked real quick at PP 2.4); PP 2.3 still a good one. Talking through 2.5 was OK. They gave a good answer for (c), that they don't know which one, which led into (d) before talking about placebo. Question (f) felt a little useless, but in the second class I focused on making sure everyone is measured by an objective observer the same way vs. self-reporting. (g) seemed a little out of place, and they don't need that much room in (h). In 2.6, it was still good to check how some were writing out the variables. The first section was pretty out of control (they were hard to corral; I didn't have videos ready to go). The second group was more in control but didn't get as far, not sure why. In the second section did get into a discussion/demo of "blocking." The first section didn't do so well on answering (f). Some still don't know what I mean by "simulation." Asking them to watch the yawning video, get through (f) on Inv 2.6, and PP 2.4. Also posted project 2 guidelines.
Day 19: Quick recap before quiz on scope of conclusions (10 min); then review and finishing of Investigation 2.6 (tactile and applet simulation, scope of conclusions); 5 min on Investigation 2.7. Very good to really emphasize that now we need to know how random assignment behaves rather than random sampling. They still struggled to come up with the design themselves, but eventually got one to say flip a coin to decide which group for 30 hypothetical people, and then emphasized we also need to model the null hypothesis being true: that the outcome wasn't going to change regardless of which group they got put into. The tactile simulation went well (2-3 repetitions each), collecting results at the board. Then worked on the applet together; following the questions in the text worked better than usual. Good to emphasize interpretation of the p-value (pretty much the same, but we no longer get to say 'by random chance' and have to describe whether the randomness was from sampling or assignment). Good to emphasize that in project 2 they can't use things like gender and age as the EV. Also emphasized that the parameter is back to being in terms of probabilities rather than population proportions. Probably would have worked better for Investigation 2.6 to be its own self-contained 50-min class. In the first class got a bit more "critique" discussion of the Mythbusters study. Good to emphasize that equal group sizes are not a major problem. Is 2.7(c) worthwhile? Asking them to finish through (e) on Inv 2.7; PP 2.6, especially part (d), is a good follow-up; also encouraged them to read the background of Inv 2.8
Day 20: Discussed HW 4 and then did recap (7 min), PP 2.6 (4 min), and a 5-question quiz (7 min). Then Investigation 2.7 starting with (f). Had them open the raw data and discussed different ways to put it into the applet, and also showed links to R and Excel methods. 30 minutes was actually enough time to get them through Fisher's Exact Test (used Excel to do P(X = 10) together, then gave different students other k values/tables, then summed them together). Had 5 min at the end for them to practice setting up (o) or (p). Tried to emphasize the need for practice converting from the table to the hypergeometric inputs, as that has been problematic in the past. Good to emphasize that we are counting the different ways to get the possible tables, and that if you change the value of the upper-left cell then the other cell values have to change too, to keep the margins fixed. The applet may have had some issues: when changing the definition of success, either need to change the order of rows in the two-way table or not call it "cell 1 count." Probably need an option for naming rows and columns if using 2x2 cell entry. The applet's default is to treat the first outcome listed as success, so you can choose which row to have first! Suggested both 2.7 PPs. Asked them to start Inv 2.8 (a)-(d)
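A base-R sketch of the hypergeometric arithmetic behind Fisher's Exact Test (counts hypothetical), with fisher.test as a check:
tab <- matrix(c(10, 5, 3, 12), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"), outcome = c("success", "failure")))
m <- sum(tab[, "success"]); nf <- sum(tab[, "failure"]); k <- sum(tab["A", ])
dhyper(10, m, nf, k)                                 # P(X = 10) for the observed table
sum(dhyper(10:min(m, k), m, nf, k))                  # upper-tail FET p-value
fisher.test(tab, alternative = "greater")$p.value    # matches the tail sum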
Day 21: A bit more on HW 4 and projects, review of FET and (o) and (p) (10 min), PP 2.7 (both, 10+ min), Inv 2.8 (25-30 min). HW discussion and formula review were a little boring; PP discussion OK. Discussion of (o) and (p) good. Definitely need to talk through two-sided and how to find the other side (e.g., distance from the mean); probably best in the context of the dolphin study (add to PP?). The second section had a good reaction to whether the Investigation 2.8 study was ethical. Gave them time to find the FET p-value on their own. Also gave them time to find the normal p-value on their own. Good discussion of the continuity correction (though they wanted to know how they could do the correction themselves). Left them to think about why the p-value said reject at the 10% level yet 0 was in the 90% CI. Need to fix the bug with the two-sided p-value for FET. Need a study conclusions box! Good to show the normal overlaid on the applet, but is my current choice of height the best? PP 2.8
Day 22: Recapped, including highlighting the formulas from two-sample z and the accompanying implications for which factors influence the p-value and CI. Also reviewed the PP and the four components of study conclusions. Last 30 min on Inv 2.9; got to about (m). Motivation of relative risk was a little tough, but we pretty much got there. Good point that the p-value is the same but we need a confidence interval. Would be better to have a more skewed relative risk distribution and a nonzero p-value (to see the matching in the p-value count). Someone did ask if presenting one statistic rather than the other would be "skewing" the results. The study description (variables) could be simplified. Need to be careful to be consistent in how the ratio is set up. The R instructions had lots of typos. Many thought they had to fill out a table; would be better if they could see the columns of results (including one that shows all 3 statistics are significant together for the row/table)? Maybe demo 1-2 rows next class. Maybe don't waste time on the phats and the difference? In the last couple of investigations I've missed having them first write out the hypotheses. Finish 2.9, PP 2.9
Day 23: Finished Inv 2.9 (went through the R script together (10 min), calculated and interpreted the CI for relative risk (20 min)) and then did a shortened version of 2.10 (15 min). Good to go through the simulation a bit more slowly (e.g., what does the very first value represent). Using "head" in R was useful. Was trying to show how each count leads to a unique difference and a unique relative risk. Did have to reassure them about why using the log didn't throw things off. I kind of liked the shortened 2.10, trying to make two points: RR differs depending on how you define things, and you can't use it with case-control studies. Not sure how slow to go with the odds ratio and was running out of time. Did get them to the R command. The SE from the simulation is based on the normal distribution and the SE from the formula just uses the sample data, so they will differ a fair bit. Could calculate the SE under the null as well and match that to the simulation? Maybe the second RR example should reverse the roles of the EV and RV instead. A bit overwhelming right before the exam, but the review the next day may have helped. Should have discussed PP 2.9. PP 2.10
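The log-based interval for the relative risk, sketched with hypothetical counts; this is the standard formula, and presumably the class R command does essentially the same arithmetic (I haven't reproduced its exact call):
a <- 30; n1 <- 100; b <- 15; n2 <- 100            # hypothetical successes and group sizes
rr <- (a/n1) / (b/n2)
se.log <- sqrt(1/a - 1/n1 + 1/b - 1/n2)
exp(log(rr) + c(-1, 1) * qnorm(0.975) * se.log)   # 95% CI for the relative risk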
2.10 replacement
Day 24: Review. Questions submitted online last night were better than for exam 1, mostly on when to use what (e.g., RR vs. OR and FET vs. normal). Some are still struggling with random assignment vs. random sampling. Did try to show them how the binomial and hypergeometric calculations are related. Also tried to distinguish that the population-size condition is for the binomial to approximate the hypergeometric, vs. the sample-size condition for the normal approximation.
Day 25: A few minutes on exam 2 (<5), about 5 min on data collection and a preview of quantitative data, 25-30 min on their own on Inv 3.1 (with an R script), 15 min wrap-up including 5 min on PP 3.1B. Went pretty well, especially in the second class where they stayed a bit more on task. During wrap-up, asked which time period they would have rather visited and got individuals to argue for each (shorter wait time vs. more predictable). Was good to go over PP 3.1B; asked for votes by show of hands and they were pretty split initially. Did the summary showing off the Descriptive Statistics applet, and some seemed happy to have that alternative. R was a bit of a pain, but I gave them plenty of warning. Showed both pasting the URL and read.table. Couldn't get file.choose working for me last night. Probably shouldn't have skipped (a) and (b). Not everyone got to the deleting of outliers. The R script showed off the "which" command. Need lots of reminders to attach (and to ignore warnings if they attach edited data with the same variable names). Also needed more reminders about the "+" continuation prompt.
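A hedged sketch of the R steps in that script (file name, URL, variable name, and cutoff are all placeholders, not the actual class files):
waits <- read.table("http://www.example.com/waittimes.txt", header = TRUE)
# or: waits <- read.table(file.choose(), header = TRUE)
attach(waits)
hist(WaitTime); boxplot(WaitTime)
which(WaitTime > 60)              # locate suspected outliers
subset(waits, WaitTime <= 60)     # drop them for a sensitivity check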
Change to R functions
Should now be able to include labels for both boxplots and histograms
Day 26: About 20 min on recap (including class discussion of PP 3.1C) and quiz, then 30 min on Inv 3.2. The quiz had them work together, and there seemed to be some good discussion and follow-up questions. Trying to save time, I split them across the three populations and then recapped the results. Seemed to get the main point, but there is a typo in the CLT box. Might be worth having them explore how small the sample size can be before they stop thinking their sampling distribution is approximately normal. Timing not great, but focused on having them fill out the table in R. Should check whether the SD formula should move up? Use the applet to approximate the p-value and skip the normal? Maybe focus on what 2 SDs would tell them here? Need to better motivate having them state the research question and design the simulation.
Change to Descriptive Statistics applet
Should now be able to toggle categorical and quantitative variables when inputting stacked data
Day 27: Talked a bit more about Inv 3.2 (explored the smallest sample size to get a normal distribution, projected results from different student computers to discuss) and the CLT (about 15 min), and then Inv 3.3 through part (o). In the last 5 minutes demonstrated the technology in Inv 3.4 (need raw data to demo t.test). Flow was pretty good, but it seemed harder to engage them. Both classes asked why we don't just always use t. At the beginning asked them to look at one of the three weight populations to see how small they could make the sample size and still be willing to assume a normal distribution of sample means. Took some time but seemed to nicely drive home the role of sample size in determining the shape, depending on the shape of the population. Using normal probability plots would be pretty helpful here. Was good to move the hypotheses up to 3.3(a) and focus on the research question. Would be good to ask a question roughly designing the simulation needed. (e) was good (if they don't jump the gun). In (j) it was good to make them realize that the "2" cutoff they want to use depends on normality, and while xbar is normal, this standardized statistic may not be. Seemed good to wait and do both the CI and PI the next day (and how it's not 98.6, but not that different from 98.6). Some are getting overwhelmed by all the SDs (be sure to show the three graphs in the applet and how they correspond to sigma, s, and sigma/sqrt(n)). Might be better not to use .733 in the hypothetical body temperature distribution? Probably need to use a smaller sample size than 13 for them to appreciate the t vs. the normal. (m) might be a little tough for them. Add discussion of how we have two sources of variability and that's why we need the heavier tails. The technology detour for R could differentiate more clearly between using t.test for raw data and onesamplet for summary data.
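For the raw-data vs. summary-data distinction, a base-R sketch with hypothetical temperatures (t.test for raw data; the same arithmetic by hand stands in for the summary-statistic function):
temps <- c(98.2, 97.9, 98.6, 98.1, 98.4, 97.8, 98.8, 98.0, 98.3, 97.6, 98.5, 98.2, 98.7)
t.test(temps, mu = 98.6)                     # raw-data version
n <- length(temps); xbar <- mean(temps); s <- sd(temps)
tstat <- (xbar - 98.6) / (s / sqrt(n))       # summary-statistic version
2 * pt(-abs(tstat), df = n - 1)              # two-sided p-value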
New applet feature
With this version we are testing out a slider on the population mean: http://www.rossmanchance.com/applets/OneSample45.htm
Day 28: Recapped one-sample t and the technology options (10-15 min), then did the confidence simulation exploration in Inv 3.3, then summarized results and moved on to the prediction interval. About 5 min at the end for PP 3.4B. I liked the flow: "what next" after the p-value is confidence intervals; motivate switching to the t critical value with a review of how to determine whether a CI procedure "works"; interpret the interval and then motivate prediction intervals; be sure to emphasize needing population normality for the PI. Hand-waving the adding of the two sources of variability seemed good (though it's not totally clear why we still need s/sqrt(n)). Was good to have a graph of different t distributions handy when talking about the impact of df on the p-value. Should probably at least demo the robustness of t-intervals. They didn't seem happy not to have a non-R option for finding t*. Needed more time to discuss PP 3.4B. PP 3.4B and (a)-(g) of Inv 3.7!
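A side-by-side of the t confidence interval and the prediction interval, with made-up data (the PI needs population normality, as emphasized above):
y <- c(98.2, 98.6, 97.9, 98.4, 98.8, 98.1, 98.0, 98.5)    # hypothetical sample
n <- length(y); xbar <- mean(y); s <- sd(y); tstar <- qt(0.975, df = n - 1)
xbar + c(-1, 1) * tstar * s / sqrt(n)          # 95% CI for mu
xbar + c(-1, 1) * tstar * s * sqrt(1 + 1/n)    # 95% prediction interval for one new observation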
New applet: t probability calculations - http://www.rossmanchance.com/applets/tCalc.htm
Day 29: Did a huge amount of review, but it seemed appropriate; then 10 minutes on quiz, and only spent about 15 min on Inv 3.7, but got through the simulation and conclusions (nothing written down yet). Discussed CI vs. PI and then PP 3.4B. The hardest question on the quiz was recognizing that none of the choices were a correct interpretation of a confidence interval. On HW have to watch what they average over (e.g., average sleep for a student over several nights? or average sleep for all students that night?). Did talk about one-sided vs. two-sided conclusions in 3.4B (could say more in terms of power?). They were able to step me through the simulation and we wrote it out, but didn't take time to write numbers on cards; went straight to the applet. Seemed to appreciate the visual and seemed to hang in there through the end. If time, might there still be value in doing the tactile simulation once? Or could just do medians? On HW 6, lots of trouble knowing what to plug into the count-samples box with a one-sample t, but here it seemed a bit easier? Do need to watch the order of subtraction (pasting in data gives a different order of subtraction from going to the default option; the latter also saves some time!). PP 3.7B, technology instructions
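A base-R sketch of the randomization simulation they stepped me through (data and group sizes hypothetical; statistic is the difference in group means):
groupA <- c(12, 15, 9, 14, 17, 10, 13)
groupB <- c(8, 11, 7, 12, 9, 6, 10)
obs.diff <- mean(groupA) - mean(groupB)
pooled <- c(groupA, groupB); nA <- length(groupA)
sim.diffs <- replicate(10000, {
  shuffled <- sample(pooled)                         # re-randomize to the two groups
  mean(shuffled[1:nA]) - mean(shuffled[-(1:nA)])
})
mean(sim.diffs >= obs.diff)                          # estimated one-sided p-value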
Day 30: About ten minutes on the one-variable applet (e.g., sample vs. sampling distribution, notation), 25-30 min on Inv 3.7, and then 10 min doing the t-test using technology and 5 min recap. Seemed good enough to talk through the exact p-value; also threw in using medians and looking at the applet results/discussing how to change the simulation (but then not normal), and then the transition to t. Seemed worthwhile to show that the SE is an underestimate and to compare the two p-values (I led all this), and then comment on how we would rather have a p-value a little too big than a little too small (was nice to have just talked about the exact p-value and could keep pointing back to that). Finished by having them interpret the confidence interval. Seemed worthwhile to show the other SE formula (calculate it) and the R output for the pooled t. Still not sure how to interpret the parameter here; tried to go back to long-run or underlying, but later thought about the mean response under treatment one compared to the mean response under treatment two? Main issue was not having all the technology ready to go (for me) in advance. Might also want the df formula handy. PP 3.8(a)-(d), don't need the other questions if given in an earlier PP; preview Investigation 3.9
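A quick base-R look at the two SE choices and the df formula mentioned above (data hypothetical); t.test defaults to the unpooled (Welch) version:
g1 <- c(15.2, 14.8, 16.1, 15.5, 14.9, 16.0)
g2 <- c(13.9, 14.2, 15.1, 13.5, 14.8, 14.0)
t.test(g1, g2)                      # unpooled SE, Welch df
t.test(g1, g2, var.equal = TRUE)    # pooled SE, df = n1 + n2 - 2
s1 <- var(g1); s2 <- var(g2); n1 <- length(g1); n2 <- length(g2)
(s1/n1 + s2/n2)^2 / ((s1/n1)^2/(n1 - 1) + (s2/n2)^2/(n2 - 1))   # Welch df formula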
Day 31: Recapped two-sample t procedures for 8-9 min (again, advantages of t are the confidence interval and formulas, and being able to use it when you only have summary statistics) and the factors that affect the p-value and width (now also SD), including discussion of a variation of PP 3.7B; then a quick quiz (6 min); and then had them do (a)-(c) of Inv 3.9 (10-12 min) before moving on to 3.10 (25 min). Discussed why the mean may not be the best measure of center, so had them do a simulation of the difference in medians, and then also discussed transforming to make symmetric distributions. Had them create the transformed variable in R or Excel (showed the Data > Text To Columns trick) and create graphs of the new distributions. Mentioned we have to interpret the parameter carefully (mean of log survival). A little rushed for time (barely got to (k) of 3.10). One section was a lot more likely to put xbars in Ho/Ha than the other, so gave them the "no hat" joke; seemed like that was good review. Changed the confidence level to 90% in 3.9 so it's easier to interpret (admitted it was cheating, but they seemed to appreciate it). Definitely needed the technology practice for the transformations. Good to highlight the two different issues with the t-test (maybe we want the median rather than the mean, and maybe it's too skewed to be valid) and then present the two approaches as different possible "corrections." Will be lots of questions about which is best. I skipped 3.10(f). PP 3.6 (p. 228)
Day 32: Did recap and then quiz (20 min), then did (h) and the second data collection of 3.11 and looked at results. I just copied the data and did a two-sample t-test on the first group, and got them to consider the large s values and suggest the paired design. Then I copied the new data and did the two-sample setup and got them to tell me why that was incorrect, so we looked at the differences and a one-sample t-test, and then started to think about a simulation approach. Good to emphasize how it will allow us to "review" one-sample t procedures. The Google Docs worked pretty well, but have to make sure anyone with the link can view/edit responses. Will prepare a data file (perhaps combining the two sections together) for tomorrow. Both (two-sided; need to decide in advance) p-values were right around .06! They were pretty good about entering numbers only. Forgot to announce PP 3.11A. PP 3.11A
Day 33: Reviewed the two study designs (10 min), quiz (7 min), and then Investigation 3.12 with the applet and R. Thought I might have time for the sign test but only got to allude to it. Did discuss the shopping data a bit; good to remind them we can have pairing with observational studies and can have pairing with multiple individuals (esp. before Quiz = 3.11B). Would have been good to have the twins example ready to discuss. Results on designing the simulation were a bit mixed; had to help them a lot. Helped them paste data into the applet and then walked through the paired t-tests. Very quick interpretation of the CI at the end. Is the paired t.test worth it vs. just finding the differences? I skipped over the stacking, but do they need it for "parallel" graphs on HW? Is the applet output enough? Make sure to have the CRD output ready to discuss. Review Inv 3.13, Example 3.4, PP 3.12
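The two equivalent ways to run the paired analysis in R, with hypothetical data (this is also where the order of subtraction matters):
before <- c(12, 15, 11, 14, 13, 16)
after  <- c(10, 13, 12, 11, 12, 14)
t.test(after, before, paired = TRUE)   # paired option; watch the order of subtraction
t.test(after - before, mu = 0)         # identical: one-sample t on the differences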
Day 34: Recapped matched pairs and PP 3.12, more discussion on whether pairing is useful, and showed scatterplots. Recapped Investigation 3.13 (outliers, validity, CI vs. PI), transitioned to the sign test, and ended comfortably with Inv 3.14 (but didn't do the simulation, went straight to the binomial). Went pretty well, but I led them through everything; wrote out a sign test for the chip data after PP 3.13. Was good in one section to reference the earlier Wall Street Journal survey that was paired responses. Was good to talk about ties with the shopping data and how to handle those in a sign test. I think it's good to write out the parameter and X and p-value definitions for these last tests and to think about what the interval would tell us. Good to emphasize the study conclusion boxes with them. PP 3.14
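The sign test is just a binomial test on the signs of the differences (ties dropped); a base-R sketch with hypothetical differences:
diffs <- c(2, -1, 3, 0, 4, 1, -2, 5, 1, 0)   # hypothetical paired differences
diffs <- diffs[diffs != 0]                   # drop the ties
binom.test(sum(diffs > 0), length(diffs), p = 0.5, alternative = "greater")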
Last day! Recapped Day 34 (6 min), then talked through 1.17 (8 min) and 3.6 (5-6 min) and one on resistance, then review questions; left 10 min for course evaluations. Need big reminders that the sign test is for a quantitative response and McNemar is for a categorical response. Also need reassurance about only using a subset of the data with McNemar. Can review when the normal approximation would be valid. Seemed good to go over the "which procedure" example together. Also added Kristen Gilbert to emphasize that a study may not have randomness but the p-value may still have meaning. Many think that if pairing turns out not to have been helpful they should use a two-sample t-test to analyze the data instead, despite my repeatedly saying that if data are paired they must be analyzed that way. Maybe should show the version of the formula with the correlation coefficient!
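For the McNemar reminder, a base-R sketch with a hypothetical 2x2 table of paired categorical responses; the exact version uses only the discordant pairs (the "subset of the data" point):
tab <- matrix(c(25, 12, 4, 19), nrow = 2, byrow = TRUE,
              dimnames = list(before = c("yes", "no"), after = c("yes", "no")))
mcnemar.test(tab)                 # uses only the off-diagonal (discordant) counts
binom.test(12, 12 + 4, p = 0.5)   # exact version on the 16 discordant pairs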