Binomial distribution
Advanced High School Statistics
Slides developed by Mine Çetinkaya-Rundel of OpenIntro, modified by Leah Dorazio for use with AHSS.
The slides may be copied, edited, and/or shared via the CC BY-SA license
Some images may be included under fair use guidelines (educational purposes)
Recall the Binomial formula
If p represents the probability of success, (1-p) the probability of failure, n the number of independent trials, and k the number of successes, then
P(k successes in n trials) = [n! / (k! (n-k)!)] x p^k x (1-p)^(n-k)
The Binomial distribution
e.g. If the probability of a severe lung condition for a smoker is 0.3, what is the distribution of the number of cases of severe lung condition among 4 randomly chosen friends who smoke?
Find the probabilities where k = 0, 1, 2, 3, 4 using the binomial formula for each value of k. Note that n and p are fixed.
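The calculation described above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name binomial_pmf and the variable names are made up for this example, and it relies on math.comb from the standard library (Python 3.8 or later).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials
    (hypothetical helper name, chosen for this sketch)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Values from the smoker example: n and p are fixed, k varies.
n, p = 4, 0.3
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

for k, prob in distribution.items():
    print(f"P(K = {k}) = {prob:.4f}")

# The probabilities of all possible values of k should add to 1.
total = sum(distribution.values())
print(f"total = {total:.4f}")
```

Only k changes inside the loop, which mirrors the note that n and p are fixed.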
The entire distribution is defined below. Note that, correcting for rounding error, the probabilities must add to 1.
k          0       1       2       3       4
P(K = k)   0.2401  0.4116  0.2646  0.0756  0.0081
The Binomial distribution (cont.)
Once the probabilities of each value are calculated using the binomial formula, a probability histogram can be drawn in order to visualize the distribution. Like any distribution, the binomial distribution has a mean and a standard deviation.
The Binomial distribution (cont.)
Recall the formulas from the previous chapter for calculating the mean and standard deviation of a probability distribution:
μ = Σ x_i x P(x_i)
σ = √( Σ (x_i - μ)² x P(x_i) )
Fortunately, for the binomial distribution with parameters n and p, there exist short-cut formulas for finding the mean and standard deviation:
μ = np
σ = √(np(1-p))
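As a quick check, the sketch below compares the general formulas for a discrete probability distribution with the binomial short-cuts, mean = np and standard deviation = √(np(1-p)). The values n = 4 and p = 0.3 are borrowed from the smoker example purely for illustration, and all variable names are assumptions of this sketch.

```python
from math import comb, sqrt

# Illustrative values (borrowed from the smoker example).
n, p = 4, 0.3
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# General formulas: weight each value k by its probability.
mean_general = sum(k * pk for k, pk in enumerate(pmf))
sd_general = sqrt(sum((k - mean_general) ** 2 * pk for k, pk in enumerate(pmf)))

# Binomial short-cut formulas.
mean_shortcut = n * p
sd_shortcut = sqrt(n * p * (1 - p))

print(f"general:   mean = {mean_general:.4f}, sd = {sd_general:.4f}")
print(f"short-cut: mean = {mean_shortcut:.4f}, sd = {sd_shortcut:.4f}")
```

Both routes should give the same numbers up to floating-point error; the short-cut simply avoids building the whole table first.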
Mean or Expected value
A 2012 Gallup survey suggests that 26.2% of Americans are obese. Among a random sample of 100 Americans, how many would you expect to be obese?
Mean and Standard deviation of a binomial distribution
Going back to the obesity rate:
μ = np = 100 x 0.262 = 26.2
σ = √(np(1-p)) = √(100 x 0.262 x 0.738) ≈ 4.4
We would expect 26.2 out of 100 randomly sampled Americans to be obese, with a standard deviation of 4.4.
Note: The mean and standard deviation of a binomial might not always be whole numbers, and that is alright; these values represent what we would expect to see on average.
Unusual observations
Observations more than 2 standard deviations away from the mean are considered unusual. Using this rule together with the mean and standard deviation we just computed, we can calculate a range for the plausible number of obese Americans in random samples of 100.
26.2 ± (2 x 4.4) → (17.4, 35.0)
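The range above can be reproduced with a short sketch. The numbers come from the obesity example; the variable names are chosen only for this illustration, and the "2 standard deviations" cutoff is the rule of thumb stated on the slide.

```python
from math import sqrt

# Obesity example: n = 100 sampled Americans, p = 0.262.
n, p = 100, 0.262

mean = n * p                  # expected number of successes
sd = sqrt(n * p * (1 - p))    # binomial standard deviation

# Values more than 2 standard deviations from the mean count as unusual.
lower = mean - 2 * sd
upper = mean + 2 * sd

print(f"mean = {mean:.1f}, sd = {sd:.1f}")
print(f"plausible range: ({lower:.1f}, {upper:.1f})")
```

Because the code keeps the unrounded standard deviation, the endpoints may differ from hand calculations in later decimal places.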
Practice
An August 2012 Gallup poll suggests that 13% of Americans think home schooling provides an excellent education for children. Would a random sample of 1,000 Americans in which only 100 share this opinion be considered unusual?
(a) Yes (b) No
http://www.gallup.com/poll/156974/private-schools-top-marks-educating-children.aspx
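One way to check this practice question is to measure how many standard deviations the observed count lies from the expected count. The sketch below applies the same 2-standard-deviation rule used for the obesity example; the variable names are assumptions made for this illustration.

```python
from math import sqrt

# Home schooling poll: n = 1,000 sampled Americans, p = 0.13,
# and the sample in question contains only 100 who share the opinion.
n, p, observed = 1000, 0.13, 100

mean = n * p                  # expected count
sd = sqrt(n * p * (1 - p))    # binomial standard deviation

# Distance of the observed count from the mean, in standard deviations.
z = (observed - mean) / sd
is_unusual = abs(z) > 2

print(f"mean = {mean:.0f}, sd = {sd:.2f}, z = {z:.2f}")
print("unusual" if is_unusual else "not unusual")
```

A count of 100 falls well over 2 standard deviations below the expected 130, so under this rule the sample would be considered unusual, which corresponds to answer (a).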
Distributions of number of successes
Hollow histograms of samples from the binomial model where p = 0.10 and n = 10, 30, 100, and 300. What happens as n increases?
Note: the scales on the histograms are different!
See this applet with sliders for n and p to see how the shape of the binomial distribution changes as n and p change:
http://www.stat.berkeley.edu/~stark/Java/Html/BinHist.htm
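For readers without access to the histograms, one rough numerical stand-in is the skewness of each distribution, which is 0 for a perfectly symmetric shape. The sketch below computes it from the exact binomial probabilities rather than from simulated samples, so it is a substitute for the plots and not a reproduction of them; the function name binomial_skewness is hypothetical.

```python
from math import comb


def binomial_skewness(n, p):
    """Skewness computed directly from the binomial probabilities
    (hypothetical helper for this sketch)."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    third_moment = sum((k - mean) ** 3 * pk for k, pk in enumerate(pmf))
    return third_moment / variance**1.5


p = 0.10
skews = {n: binomial_skewness(n, p) for n in (10, 30, 100, 300)}
for n, skew in skews.items():
    print(f"n = {n:3d}: skewness = {skew:.3f}")
```

The skewness shrinks toward 0 as n grows, which matches the visual impression that the distribution becomes more symmetric and bell-shaped.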
How large is large enough to use normal approximation?
The sample size is considered large enough if the expected numbers of successes and failures are both at least 10, that is, if
np ≥ 10 and n(1-p) ≥ 10
Observe that when n = 30 and p = 0.10
np = 30 x 0.10 = 3 < 10 (fail!)
n(1-p) = 30 x (1 - 0.1) = 27 ≥ 10
But when n = 100 and p = 0.10
np = 100 x 0.10 = 10 ≥ 10
n(1-p) = 100 x (1 - 0.1) = 90 ≥ 10
This is consistent with our visual judgement of normality.
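The success-failure check above is easy to automate. The following is a minimal sketch; the function name normal_approx_ok and its threshold parameter are inventions of this example, with the default of 10 taken from the rule stated on the slide.

```python
def normal_approx_ok(n, p, threshold=10):
    """Return True when both the expected number of successes (np) and
    the expected number of failures (n(1-p)) are at least `threshold`.
    Hypothetical helper name, used only in this sketch."""
    return n * p >= threshold and n * (1 - p) >= threshold


# The two cases worked through above.
small_sample = normal_approx_ok(30, 0.10)    # np = 3, so the check fails
large_sample = normal_approx_ok(100, 0.10)   # np = 10 and n(1-p) = 90

print(f"n = 30,  p = 0.10: {small_sample}")
print(f"n = 100, p = 0.10: {large_sample}")
```

Values of np that land exactly on the threshold depend on floating-point rounding, so borderline cases are worth checking by hand.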
Practice
Below are four pairs of Binomial distribution parameters. Which distribution(s) can be approximated by the normal distribution?
An analysis of Facebook users
http://www.pewinternet.org/Reports/2012/Facebook-users/Summary.aspx
A recent study found that "Facebook users get more than they give". For example:
Any guesses for how this pattern can be explained?
Power users contribute much more content than the typical user.
Practice
This study also found that approximately 25% of Facebook users are considered power users. The same study found that the average Facebook user has 245 friends. What is the probability that the average Facebook user with 245 friends has 70 or more friends who would be considered power users? Note any assumptions you must make.
We are given that n = 245, p = 0.25, and we are asked for the probability P(K ≥ 70). To proceed, we need independence, which we'll assume but could check if we had access to more Facebook data.
P(K ≥ 70) = P(K = 70 or K = 71 or K = 72 or … or K = 245)
= P(K = 70) + P(K = 71) + P(K = 72) + … + P(K = 245)
This seems like an awful lot of work...
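By hand, adding P(K = 70) through P(K = 245) is tedious, but a computer can do the sum directly. The sketch below is one way to do so, assuming independent friends as stated above; the variable names are chosen only for this illustration.

```python
from math import comb

# Facebook example: n = 245 friends, p = 0.25 chance each is a power user.
n, p = 245, 0.25

# P(K >= 70) = P(K = 70) + P(K = 71) + ... + P(K = 245)
exact = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k)
    for k in range(70, n + 1)
)

print(f"P(K >= 70) = {exact:.4f}")
```

This exact value is a useful benchmark for judging how close the normal approximation comes.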
Normal approximation to the binomial
Practice
What is the probability that the average Facebook user with 245 friends has 70 or more friends who would be considered power users?
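μ = np = 245 x 0.25 = 61.25
σ = √(np(1-p)) = √(245 x 0.25 x 0.75) ≈ 6.78
Z = (70 - 61.25) / 6.78 ≈ 1.29
P(K ≥ 70) ≈ P(Z > 1.29) = 0.0985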
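The normal approximation can be sketched with the standard library alone. This sketch assumes the approach implied by the slide: a normal distribution with the binomial mean and standard deviation, and no continuity correction. The upper tail is computed from math.erfc instead of a Z table, and the variable names are assumptions of this example.

```python
from math import erfc, sqrt

# Facebook example: n = 245 friends, p = 0.25, cutoff of 70 power users.
n, p, cutoff = 245, 0.25, 70

mean = n * p                  # 61.25
sd = sqrt(n * p * (1 - p))    # about 6.78
z = (cutoff - mean) / sd      # about 1.29

# Upper tail of the standard normal: P(Z > z) = 0.5 * erfc(z / sqrt(2))
upper_tail = 0.5 * erfc(z / sqrt(2))

print(f"mean = {mean:.2f}, sd = {sd:.2f}, z = {z:.2f}")
print(f"P(Z > z) = {upper_tail:.4f}")
```

Because z is not rounded to two decimals here, the result can differ slightly from the table value of 0.0985.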