East-west beer price comparison using NHST

Assumptions:
1. Beer prices in the east end and the west end are normally distributed with unknown means mu_E and mu_W.
2. Variance = sigma^2 = 5 (assumed to be known).

Step 1
H_0: same mean, Δ = 0
H_A: means are different, Δ = -2.62
Here Δ = mu_E - mu_W, and Δ = -2.62 is an assumption about the effect size we're interested in detecting.

Step 2
alpha = 0.05
z_\alpha = -1.644853625
beta = 0.2
statistical power = 0.8
z_\beta = 0.8416212327

Scratch cells (standard normal CDF lookups):
z          Phi(z)
-1.64      0.05050258347
-3.3375    0.0004226786094
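
The two quantiles above can be reproduced outside the spreadsheet. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` is an acceptable stand-in for the spreadsheet's inverse normal function (the variable names are illustrative, not from the original sheet):

```python
from statistics import NormalDist

alpha = 0.05  # significance level chosen in Step 2
beta = 0.2    # type II error rate chosen in Step 2 (power = 1 - beta)

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Lower-tail quantile for the significance level (negative number).
z_alpha = std_normal.inv_cdf(alpha)

# Quantile that leaves probability beta in the upper tail.
z_beta = std_normal.inv_cdf(1 - beta)

print(f"z_alpha = {z_alpha:.6f}")
print(f"z_beta  = {z_beta:.6f}")
```

The results should agree with the values listed in Step 2 to about eight decimal places; small differences in the last digits come from the spreadsheet's own numerical routine.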
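
Step 3
We need to choose the sample size n and the cutoff c required to guarantee the chosen error rates \alpha and \beta. With SE(n) = sqrt(sigma^2/n + sigma^2/n), z_\alpha = -1.644853625 and z_\beta = 0.8416212327, the two requirements are:
c = -1.644853625*SE(n)        (alpha requirement)
c = Δ + 0.8416212327*SE(n)    (beta requirement)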
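
The two cutoff requirements can also be checked numerically. The sketch below is only an illustration: it assumes SE(n) = sqrt(sigma^2/n + sigma^2/n) with n observations per group, scans candidate values of n for the row where the two cutoffs agree most closely, and then solves the pair of equations in closed form. The function and variable names are hypothetical.

```python
import math

SIGMA_SQ = 5.0          # known variance (Assumptions)
DELTA = -2.62           # effect size under H_A (Step 1)
Z_ALPHA = -1.644853625  # from Step 2
Z_BETA = 0.8416212327   # from Step 2


def se(n):
    """Standard error of the difference in sample means, n per group."""
    return math.sqrt(SIGMA_SQ / n + SIGMA_SQ / n)


def cutoff_alpha(n):
    """Cutoff c implied by the alpha requirement."""
    return Z_ALPHA * se(n)


def cutoff_beta(n):
    """Cutoff c implied by the beta requirement."""
    return DELTA + Z_BETA * se(n)


# Table-style scan: pick the n where the two cutoffs are closest.
candidates = range(8, 21)
best_n = min(candidates, key=lambda n: abs(cutoff_alpha(n) - cutoff_beta(n)))

# Closed form: equate the two cutoffs, solve for SE, then for n.
se_exact = DELTA / (Z_ALPHA - Z_BETA)
n_exact = 2 * SIGMA_SQ / se_exact ** 2

print("closest integer n:", best_n)
print("cutoff via alpha :", cutoff_alpha(best_n))
print("cutoff via beta  :", cutoff_beta(best_n))
print("exact n          :", n_exact)
```

The exact solution lies slightly above 9, which is why the two cutoffs at n = 9 differ in the third decimal place.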
Solving these two equations simultaneously for n and c, we find:
n = 9
SE(9) = 1.054092553
c = -1.733827958 (via alpha req)
c = -1.732853326 (via beta req)

A. Table solution method

 n   SE(n)          -1.644853625*SE(n)   Δ + 0.8416212327*SE(n)
 8   1.118033989    -1.839002259         -1.679038856
 9   1.054092553    -1.733827958         -1.732853326
10   1              -1.644853625         -1.778378767
11   0.9534625892   -1.568306396         -1.81754564
12   0.9128709292   -1.501539057         -1.851708443
13   0.8770580193   -1.442632062         -1.881849349
14   0.8451542547   -1.39015504          -1.908700234
15   0.8164965809   -1.343017361         -1.932819141
16   0.790569415    -1.300370968         -1.954639994
17   0.7669649888   -1.261545142         -1.974505981
18   0.7453559925   -1.226001506         -1.992692571
19   0.7254762501   -1.19330224          -2.009423784
20   0.7071067812   -1.163087152         -2.024883919

The two cutoff columns agree most closely at n = 9.

B. Formula solution method
Equating the two expressions for c gives n = 2*sigma^2*(z_\alpha - z_\beta)^2 / Δ^2
n = 9.00669719

C. Solution using SymPy method
https://live.sympy.org/?evaluate=from%20sympy%20import%20*%0Afrom%20sympy.stats%20import%20P%2C%20E%2C%20variance%2C%20Die%2C%20Normal%2C%20cdf%2C%20density%2C%20std%0A%23--%0An%2C%20c%20%3D%20symbols(%27n%20c%27)%0Asigmasq%20%3D%205%20%20%23%20%0ASE%20%3D%20sqrt(sigmasq%2Fn%20%2B%20sigmasq%2Fn)%0ASE%0A%23--%0A%23%20We%20solve%20for%20c%20and%20n%20simultaneously%20for%20fixed%20significance%20and%20power%20level%0A%23%20solve%20%20eqn1%20%20%20%20and%20%20%20%20%20%20%20eqn2%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20for%20unknowns%20n%20and%20c%20%0Asolve(%5B%20%20c%2B1.644853625*SE%2C%20c%2B2.62%20-%200.8416212327*SE%5D%2C%20%20%20%20%20%20%20%20%20%20%20%20%20n%2C%20%20%20%20c%20)%0A%23--%0A%23%20So%20we%20need%20sample%20size%20n%3D9%2C%20and%20cutofff%0Ac%20%3D%20-1.644853625*SE.subs(n%2C9).n()%0Ac%0A%23--%0A

Data samples collected from the two populations:

           x_E           x_W
           7.7           11.8
           5.9           10
           7             11
           4.8           8.6
           6.3           8.3
           6.3           9.4
           5.5           8
           5.4           6.8
           6.5           8.5
sum x      55.4          82.4
\bar{x}    6.155555556   9.155555556

d = \bar{x}_E - \bar{x}_W = -3
z = d / SE(9) = -2.846049894
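
The observed difference d and the z statistic follow directly from the two samples. A minimal sketch, assuming the nine east-end and nine west-end prices listed in the sheet and the known variance sigma^2 = 5 (the list and variable names are illustrative):

```python
import math

# Beer prices from the sheet (x_E and x_W columns).
x_east = [7.7, 5.9, 7.0, 4.8, 6.3, 6.3, 5.5, 5.4, 6.5]
x_west = [11.8, 10.0, 11.0, 8.6, 8.3, 9.4, 8.0, 6.8, 8.5]

sigma_sq = 5.0  # known variance (Assumptions)

n = len(x_east)  # same sample size in both groups
mean_east = sum(x_east) / n
mean_west = sum(x_west) / n

# Observed effect: difference of the sample means.
d = mean_east - mean_west

# Standard error of the difference, then the z statistic.
se = math.sqrt(sigma_sq / n + sigma_sq / n)
z = d / se

print(f"mean east = {mean_east:.6f}, mean west = {mean_west:.6f}")
print(f"d = {d:.6f}, SE = {se:.6f}, z = {z:.6f}")
```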

Step 4
Decision = reject H_0, because d < c (equivalently, because z < z_\alpha)

Step 5 -- computing the p-value
p-value = 0.002213262929
Statistically significant, since the p-value is below the chosen alpha level.
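
The p-value is the lower-tail probability of the observed z under H_0. A minimal sketch, assuming the one-sided (lower-tail) test implied by the negative z_\alpha and again using the standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

z = -2.846049894  # observed z statistic from the sheet
alpha = 0.05      # significance level chosen in Step 2

# Lower-tail probability of a standard normal at the observed z.
p_value = NormalDist().cdf(z)

# The result is significant when the p-value falls below alpha.
significant = p_value < alpha

print(f"p-value = {p_value:.9f}")
print("significant at alpha = 0.05:", significant)
```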

Step 5 -- Confidence interval of the effect size
\gamma = 0.1 (i.e. a 90% confidence interval)
1 - \gamma/2 = 0.95
z_{1-\gamma/2} = 1.644853625
variance of CI = 1.111111111 (= sigma^2/n + sigma^2/n with n = 9)
sqrt of variance of CI = 1.054092553
CI centre = -3
CI lower = -4.733827958
CI upper = -1.266172042
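
The interval above is the observed difference plus or minus the critical value times the standard error. A minimal sketch under the same assumptions as the rest of the sheet (known sigma^2 = 5, n = 9 per group); the variable names are illustrative:

```python
import math
from statistics import NormalDist

gamma = 0.1     # 1 - confidence level, i.e. a 90% interval
d = -3.0        # observed difference of sample means
sigma_sq = 5.0  # known variance (Assumptions)
n = 9           # sample size per group

# Two-sided critical value z_{1 - gamma/2}.
z_crit = NormalDist().inv_cdf(1 - gamma / 2)

# Variance and standard error of the difference of sample means.
var_d = sigma_sq / n + sigma_sq / n
se = math.sqrt(var_d)

ci_lower = d - z_crit * se
ci_upper = d + z_crit * se

print(f"z_crit = {z_crit:.6f}, var = {var_d:.6f}, SE = {se:.6f}")
print(f"90% CI = [{ci_lower:.6f}, {ci_upper:.6f}]")
```

Since the interval excludes 0, it is consistent with the decision to reject H_0.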