Chap11_Chie Square & Non Parametrics
8/12/2019 Chap11_Chie Square & Non Parametrics
Statistics for Managers Using Microsoft Excel, 4e 2004 Prentice-Hall, Inc. Chap 11-2
Chapter Goals
After completing this chapter, you should be able to:
Perform a χ² test for the difference between two proportions
Use a χ² test for differences in more than two proportions
Perform a χ² test of independence
Apply and interpret the Wilcoxon rank sum test for the difference between two medians
Perform nonparametric analysis of variance using the Kruskal-Wallis rank test for one-way ANOVA
-
Contingency Tables
Useful in situations involving multiple population proportions
Used to classify sample observations according to two or more characteristics
Also called a cross-classification table
-
Contingency Table Example
Left-Handed vs. Gender
Dominant Hand: Left vs. Right
Gender: Male vs. Female
2 categories for each variable, so called a 2 x 2 table
Suppose we examine a sample of size 300
-
Contingency Table Example (continued)
Sample results organized in a contingency table (sample size = n = 300):

              Hand Preference
Gender      Left     Right     Total
Female        12       108       120
Male          24       156       180
Total         36       264       300

120 females, 12 were left-handed
180 males, 24 were left-handed
-
χ² Test for the Difference Between Two Proportions
H0: p1 = p2 (Proportion of females who are left-handed is equal to the proportion of males who are left-handed)
H1: p1 ≠ p2 (The two proportions are not the same; hand preference is not independent of gender)
If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males
The two proportions above should be the same as the proportion of left-handed people overall
-
The Chi-Square Test Statistic
The Chi-square test statistic is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

where:
f_o = observed frequency in a particular cell
f_e = expected frequency in a particular cell if H0 is true
χ² for the 2 x 2 case has 1 degree of freedom
(Assumed: each cell in the contingency table has expected frequency of at least 5)
-
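Decision Rule
The χ² test statistic approximately follows a chi-squared distribution with one degree of freedom
Decision Rule: If χ² > χ²_U, reject H0; otherwise, do not reject H0
[Figure: chi-squared distribution starting at 0; "Do not reject H0" region to the left of χ²_U, "Reject H0" region to the right]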
-
Computing the Average Proportion
The average proportion is:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{X}{n}$$

Here: 120 females, 12 were left-handed; 180 males, 24 were left-handed

$$\bar{p} = \frac{12 + 24}{120 + 180} = \frac{36}{300} = 0.12$$

i.e., the proportion of left-handers overall is 12%
-
Finding Expected Frequencies
To obtain the expected frequency for left-handed females, multiply the average proportion left-handed (p̄) by the total number of females
To obtain the expected frequency for left-handed males, multiply the average proportion left-handed (p̄) by the total number of males
If the two proportions are equal, then
P(Left Handed | Female) = P(Left Handed | Male) = .12
i.e., we would expect
(.12)(120) = 14.4 females to be left-handed
(.12)(180) = 21.6 males to be left-handed
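
The expected-frequency step can be sketched in a few lines of Python. This is only an illustration using the counts from the left-handedness example; the variable names (`left_handed`, `group_size`, `p_bar`) are assumptions introduced here, not part of the original slides.

```python
# Counts from the left-handedness example.
left_handed = {"Female": 12, "Male": 24}
group_size = {"Female": 120, "Male": 180}

# Average (pooled) proportion of left-handers: (X1 + X2) / (n1 + n2).
p_bar = sum(left_handed.values()) / sum(group_size.values())

# Expected counts if H0 is true: the same proportion applies to each group.
expected_left = {g: p_bar * n for g, n in group_size.items()}
expected_right = {g: (1 - p_bar) * n for g, n in group_size.items()}

print(f"average proportion = {p_bar:.2f}")
for g in group_size:
    print(f"{g}: expected left = {expected_left[g]:.1f}, "
          f"expected right = {expected_right[g]:.1f}")
```

The right-handed expected counts follow from the complement proportion 1 - 0.12 = 0.88, so each row of expected counts still adds up to the group size.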
-
Observed vs. Expected Frequencies

                       Hand Preference
Gender      Left                 Right                 Total
Female      Observed = 12        Observed = 108          120
            Expected = 14.4      Expected = 105.6
Male        Observed = 24        Observed = 156          180
            Expected = 21.6      Expected = 158.4
Total       36                   264                     300
-
The Chi-Square Test Statistic
Using the observed and expected frequencies for the four cells of the 2 x 2 table, the test statistic is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576$$

-
Decision Rule
The test statistic is χ² = 0.7576; χ²_U with 1 d.f. = 3.841
Decision Rule: If χ² > 3.841, reject H0; otherwise, do not reject H0
Here, χ² = 0.7576 < χ²_U = 3.841, so we do not reject H0 and conclude that there is not sufficient evidence that the two proportions are different at α = .05
[Figure: chi-squared distribution starting at 0; "Do not reject H0" region to the left of χ²_U = 3.841, "Reject H0" region to the right]
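
As a rough check of the 2 x 2 calculation, the sketch below sums the four cell contributions in plain Python. The function name `chi_square_statistic` is an assumption made for illustration; the critical value 3.841 is taken from the slides rather than computed. Recomputing from the observed and expected counts gives approximately 0.7576, well below the critical value.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all cells."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Cell order: female-left, female-right, male-left, male-right.
observed = [12, 108, 24, 156]
expected = [14.4, 105.6, 21.6, 158.4]

chi2 = chi_square_statistic(observed, expected)

# Upper-tail critical value for 1 d.f. at alpha = .05, as quoted in the slides.
critical_value = 3.841
reject_h0 = chi2 > critical_value

print(f"chi-square = {chi2:.4f}, reject H0: {reject_h0}")
```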
-
χ² Test for the Differences in More Than Two Proportions
Extend the χ² test to the case with more than two independent populations:
H0: p1 = p2 = … = pc
H1: Not all of the pj are equal (j = 1, 2, …, c)
-
The Chi-Square Test Statistic
The Chi-square test statistic is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

where:
f_o = observed frequency in a particular cell of the 2 x c table
f_e = expected frequency in a particular cell if H0 is true
χ² for the 2 x c case has (2 - 1)(c - 1) = c - 1 degrees of freedom
(Assumed: each cell in the contingency table has expected frequency of at least 1)
-
Computing the Overall Proportion
The overall proportion is:

$$\bar{p} = \frac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \frac{X}{n}$$

Expected cell frequencies for the c categories are calculated as in the 2 x 2 case, and the decision rule is the same:
Decision Rule: If χ² > χ²_U, reject H0; otherwise, do not reject H0
where χ²_U is from the chi-squared distribution with c - 1 degrees of freedom
-
χ² Test of Independence
Similar to the χ² test for equality of more than two proportions, but extends the concept to contingency tables with r rows and c columns
H0: The two categorical variables are independent (i.e., there is no relationship between them)
H1: The two categorical variables are dependent (i.e., there is a relationship between them)
-
χ² Test of Independence (continued)
The Chi-square test statistic is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

where:
f_o = observed frequency in a particular cell of the r x c table
f_e = expected frequency in a particular cell if H0 is true
χ² for the r x c case has (r - 1)(c - 1) degrees of freedom
(Assumed: each cell in the contingency table has expected frequency of at least 1)
-
Expected Cell Frequencies
Expected cell frequencies:

$$f_e = \frac{\text{row total} \times \text{column total}}{n}$$

where:
row total = sum of all frequencies in the row
column total = sum of all frequencies in the column
n = overall sample size
-
Decision Rule
The decision rule is:
If χ² > χ²_U, reject H0; otherwise, do not reject H0
where χ²_U is from the chi-squared distribution with (r - 1)(c - 1) degrees of freedom
-
Example
The meal plan selected by 200 students is shown below:

Class        Number of meals per week
Standing     20/week   10/week   none     Total
Fresh.          24        32      14        70
Soph.           22        26      12        60
Junior          10        14       6        30
Senior          14        16      10        40
Total           70        88      42       200
-
Example (continued)
The hypothesis to be tested is:
H0: Meal plan and class standing are independent
(i.e., there is no relationship between them)
H1: Meal plan and class standing are dependent
(i.e., there is a relationship between them)
-
Example: Expected Cell Frequencies (continued)

Observed:

Class        Number of meals per week
Standing     20/wk     10/wk     none     Total
Fresh.          24        32      14        70
Soph.           22        26      12        60
Junior          10        14       6        30
Senior          14        16      10        40
Total           70        88      42       200

Expected cell frequencies if H0 is true:

Class        Number of meals per week
Standing     20/wk     10/wk     none     Total
Fresh.        24.5      30.8     14.7       70
Soph.         21.0      26.4     12.6       60
Junior        10.5      13.2      6.3       30
Senior        14.0      17.6      8.4       40
Total           70        88       42      200

Example for one cell (Junior, 20/wk):

$$f_e = \frac{\text{row total} \times \text{column total}}{n} = \frac{30 \times 70}{200} = 10.5$$
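
The expected table can be reproduced with a short Python sketch that applies row total × column total / n to every cell. The helper name `expected_frequencies` is an assumption introduced here for illustration; the data are the observed meal-plan counts from the example.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: Fresh., Soph., Junior, Senior; columns: 20/week, 10/week, none.
observed = [
    [24, 32, 14],
    [22, 26, 12],
    [10, 14, 6],
    [14, 16, 10],
]

expected = expected_frequencies(observed)
for row in expected:
    print([round(fe, 1) for fe in row])
```

Because each expected cell is built from the same marginal totals, the rows and columns of the expected table add up to the observed row and column totals.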
-
Example: The Test Statistic (continued)
The test statistic value is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(24 - 24.5)^2}{24.5} + \frac{(32 - 30.8)^2}{30.8} + \cdots + \frac{(10 - 8.4)^2}{8.4} = 0.709$$

χ²_U = 12.592 for α = .05 from the chi-squared distribution with (4 - 1)(3 - 1) = 6 degrees of freedom
-
Example: Decision and Interpretation (continued)
The test statistic is χ² = 0.709; χ²_U with 6 d.f. = 12.592
Decision Rule: If χ² > 12.592, reject H0; otherwise, do not reject H0
Here, χ² = 0.709 < χ²_U = 12.592, so do not reject H0
Conclusion: there is not sufficient evidence that meal plan and class standing are related at α = .05
[Figure: chi-squared distribution starting at 0; "Do not reject H0" region to the left of χ²_U = 12.592, "Reject H0" region to the right]
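
The whole test of independence for the meal-plan example can be sketched end to end as follows. This is a minimal plain-Python illustration, not a reference implementation; the critical value 12.592 is copied from the slides, and all variable names are assumptions made here.

```python
# Rows: Fresh., Soph., Junior, Senior; columns: 20/week, 10/week, none.
observed = [
    [24, 32, 14],
    [22, 26, 12],
    [10, 14, 6],
    [14, 16, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Accumulate (f_o - f_e)^2 / f_e over all r x c cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        chi2 += (fo - fe) ** 2 / fe

# Degrees of freedom: (r - 1)(c - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

critical_value = 12.592  # 6 d.f., alpha = .05, as quoted in the slides
reject_h0 = chi2 > critical_value

print(f"chi-square = {chi2:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```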
-
Wilcoxon Rank-Sum Test for Differences in 2 Medians
Test two independent population medians
Populations need not be normally distributed
Distribution-free procedure
Used when only rank data are available
Must use normal approximation if either of the sample sizes is larger than 10
-
Wilcoxon Rank-Sum Test: Small Samples
Can use when both n1, n2 ≤ 10
Assign ranks to the combined n1 + n2 sample observations
If unequal sample sizes, let n1 refer to the smaller-sized sample
Smallest value rank = 1, largest value rank = n1 + n2
Assign average rank for ties
Sum the ranks for each sample: T1 and T2
Obtain test statistic, T1 (from smaller sample)
-
Checking the Rankings
The sum of the rankings must satisfy the formula below
Can use this to verify the sums T1 and T2

$$T_1 + T_2 = \frac{n(n + 1)}{2}$$

where n = n1 + n2
-
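Wilcoxon Rank-Sum Test: Hypothesis and Decision Rule
Two-tail test: H0: M1 = M2, H1: M1 ≠ M2. Reject H0 if T1 < T1L or if T1 > T1U
Upper-tail test: H0: M1 ≤ M2, H1: M1 > M2. Reject H0 if T1 > T1U
Lower-tail test: H0: M1 ≥ M2, H1: M1 < M2. Reject H0 if T1 < T1L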
-
Wilcoxon Rank-Sum Test: Small Sample Example
Sample data are collected on the capacity rates (% of capacity) for two factories.
Are the median operating rates for the two factories the same?
For factory A, the rates are 71, 82, 77, 94, 88
For factory B, the rates are 85, 82, 92, 97
Test for equality of the population medians at the 0.05 significance level
-
Wilcoxon Rank-Sum Test: Small Sample Example (continued)
Ranked capacity values (tie in 3rd and 4th places):

        Capacity                    Rank
Factory A   Factory B       Factory A   Factory B
   71                           1
   77                           2
   82                           3.5
               82                           3.5
               85                           5
   88                           6
               92                           7
   94                           8
               97                           9
Rank Sums:                     20.5        24.5
-
Wilcoxon Rank-Sum Test: Small Sample Example (continued)
Factory B has the smaller sample size, so the test statistic is the sum of the Factory B ranks:
T1 = 24.5
The sample sizes are:
n1 = 4 (factory B)
n2 = 5 (factory A)
The level of significance is α = .05
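
The ranking step, including the average rank for ties and the check T1 + T2 = n(n + 1)/2, can be sketched in Python as below. The helper `average_ranks` is a hypothetical name introduced for this illustration; it is a simple approach suitable for small samples, not an optimized routine.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1          # first 1-based position of v
        last = first + ordered.count(v) - 1   # last 1-based position of v
        rank_of[v] = (first + last) / 2
    return [rank_of[v] for v in values]

factory_a = [71, 82, 77, 94, 88]
factory_b = [85, 82, 92, 97]

# Rank the combined sample, then split the ranks back by factory.
ranks = average_ranks(factory_a + factory_b)
rank_sum_a = sum(ranks[:len(factory_a)])
rank_sum_b = sum(ranks[len(factory_a):])

# Check: the two rank sums must add up to n(n + 1)/2.
n = len(factory_a) + len(factory_b)
assert rank_sum_a + rank_sum_b == n * (n + 1) / 2

# The smaller sample (factory B) supplies the test statistic T1.
t1 = rank_sum_b
print(f"rank sum A = {rank_sum_a}, rank sum B = {rank_sum_b}, T1 = {t1}")
```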
-
Wilcoxon Rank-Sum Test: Small Sample Example (continued)
Lower and upper critical values for T1 from Appendix table E.8 (rows shown for n2 = 5):

          α                        n1
One-Tailed   Two-Tailed       4           5
   .05          .10         12, 28      19, 36
   .025         .05         11, 29      17, 38
   .01          .02         10, 30      16, 39
   .005         .01         --, --      15, 40

T1L = 11 and T1U = 29
-
Wilcoxon Rank-Sum Test: Small Sample Solution (continued)
Two-Tail Test
H0: M1 = M2
H1: M1 ≠ M2
α = .05, n1 = 4, n2 = 5
Reject H0 if T1 < T1L = 11 or if T1 > T1U = 29
[Figure: number line with "Reject" regions below T1L = 11 and above T1U = 29, and a "Do Not Reject" region in between]
Test Statistic (sum of ranks from smaller sample): T1 = 24.5
Decision: Do not reject H0 at α = 0.05
Conclusion: There is not enough evidence to prove that the medians are not equal.
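
The small-sample decision rule can be written as a tiny Python function. The function name `wilcoxon_decision` and the `alternative` labels are assumptions made for this sketch; the critical values must still be looked up in the table for the chosen tail and significance level.

```python
def wilcoxon_decision(t1, lower, upper, alternative="two-sided"):
    """Apply the rank-sum decision rule given table critical values.

    alternative: "two-sided" (H1: M1 != M2), "greater" (H1: M1 > M2),
    or "less" (H1: M1 < M2). These labels are illustrative only.
    """
    if alternative == "two-sided":
        reject = t1 < lower or t1 > upper
    elif alternative == "greater":
        reject = t1 > upper
    elif alternative == "less":
        reject = t1 < lower
    else:
        raise ValueError(f"unknown alternative: {alternative}")
    return "reject H0" if reject else "do not reject H0"

# Factory example: T1 = 24.5 with two-tailed critical values 11 and 29.
decision = wilcoxon_decision(24.5, lower=11, upper=29)
print(decision)
```

For a one-tailed test, the limits passed in should come from the one-tailed column of the table rather than the two-tailed values used here.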
-
Wilcoxon Rank-Sum Test (Large Sample)
For large samples, the test statistic T1 is approximately normal with mean μ_T1 and standard deviation σ_T1:

$$\mu_{T_1} = \frac{n_1(n + 1)}{2}$$

$$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}}$$

where n = n1 + n2
Must use the normal approximation if either n1 or n2 > 10
Assign n1 to be the smaller of the two sample sizes
Can use the normal approximation for small samples
-
Wilcoxon Rank-Sum Test (Large Sample) (continued)
The Z test statistic is

$$Z = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}}$$

where Z approximately follows a standardized normal distribution
-
Wilcoxon Rank-Sum Test: Normal Approximation Example
Use the setting of the prior example:
The sample sizes were:
n1 = 4 (factory B)
n2 = 5 (factory A)
The level of significance was α = .05
The test statistic was T1 = 24.5
-
Wilcoxon Rank-Sum Test: Normal Approximation Example (continued)
The test statistic is

$$\mu_{T_1} = \frac{n_1(n + 1)}{2} = \frac{4(9 + 1)}{2} = 20$$

$$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}} = \sqrt{\frac{4(5)(9 + 1)}{12}} = 4.082$$

$$Z = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}} = \frac{24.5 - 20}{4.082} = 1.10$$

Z = 1.10 is not greater than the critical Z value of 1.96 (for α = .05), so we do not reject H0; there is not sufficient evidence that the medians are not equal
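
The normal approximation can be checked numerically with the short Python sketch below, which plugs the factory example into the formulas for the mean and standard deviation of T1. Variable names are assumptions introduced here; the critical value 1.96 is the two-tailed value quoted in the slides. Evaluating the formulas directly gives a standard deviation of about 4.08 and Z of about 1.10.

```python
import math

n1, n2 = 4, 5          # n1 is the smaller sample (factory B)
n = n1 + n2
t1 = 24.5              # rank sum of the smaller sample

# Mean and standard deviation of T1 under H0.
mu_t1 = n1 * (n + 1) / 2
sigma_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)

z = (t1 - mu_t1) / sigma_t1

critical_z = 1.96      # two-tailed critical value at alpha = .05
reject_h0 = abs(z) > critical_z

print(f"mean = {mu_t1}, std dev = {sigma_t1:.3f}, Z = {z:.2f}, reject H0: {reject_h0}")
```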
-
Kruskal-Wallis Rank Test
Tests the equality of more than 2 population medians
Use when the normality assumption for one-way ANOVA is violated
Assumptions:
The samples are random and independent
Variables have a continuous distribution
The data can be ranked
Populations have the same variability
Populations have the same shape
-
Kruskal-Wallis Test Procedure
Obtain relative rankings for each value
In event of tie, each of the tied values gets the average rank
Sum the rankings for data from each of the c groups
Compute the H test statistic
-
Kruskal-Wallis Test Procedure (continued)
The Kruskal-Wallis H test statistic (with c - 1 degrees of freedom):

$$H = \left[\frac{12}{n(n + 1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j}\right] - 3(n + 1)$$

where:
n = sum of sample sizes in all samples
c = number of samples
T_j = sum of ranks in the jth sample
n_j = size of the jth sample
-
Kruskal-Wallis Test Procedure (continued)
Complete the test by comparing the calculated H value to a critical χ² value from the chi-square distribution with c - 1 degrees of freedom
Decision rule:
Reject H0 if test statistic H > χ²_U
Otherwise do not reject H0
[Figure: chi-squared distribution starting at 0; "Do not reject H0" region to the left of χ²_U, "Reject H0" region to the right]
-
Kruskal-Wallis Example
Do different departments have different class sizes?

Class size     Class size      Class size
(Math, M)      (English, E)    (Biology, B)
    23             55              30
    41             60              40
    54             72              18
    78             45              34
    66             70              44
-
Kruskal-Wallis Example (continued)
Do different departments have different class sizes?

Class size              Class size               Class size
(Math, M)    Ranking    (English, E)   Ranking   (Biology, B)   Ranking
    23          2           55           10          30            3
    41          6           60           11          40            5
    54          9           72           14          18            1
    78         15           45            8          34            4
    66         12           70           13          44            7
Rank sum:    Σ = 44                    Σ = 56                   Σ = 20
-
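Kruskal-Wallis Example (continued)
H0: Median_M = Median_E = Median_B
HA: Not all population medians are equal
The H statistic is

$$H = \left[\frac{12}{n(n + 1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j}\right] - 3(n + 1) = \left[\frac{12}{15(15 + 1)}\left(\frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5}\right)\right] - 3(15 + 1) = 6.72$$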
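
The H statistic for the class-size example can be reproduced with the Python sketch below. It assumes the Math sample is 23, 41, 54, 78, 66 (the values shown alongside the rankings) and uses a simple ranking that is valid only because this data set has no ties; the names `samples` and `rank_sums` are introduced here for illustration.

```python
samples = {
    "Math": [23, 41, 54, 78, 66],
    "English": [55, 60, 72, 45, 70],
    "Biology": [30, 40, 18, 34, 44],
}

# Rank all observations together (1 = smallest). No ties in this data set,
# so each value maps to a unique rank.
combined = sorted(v for values in samples.values() for v in values)
rank = {v: i + 1 for i, v in enumerate(combined)}

# Sum of ranks T_j within each sample.
rank_sums = {name: sum(rank[v] for v in values) for name, values in samples.items()}

n = len(combined)
c = len(samples)

# H = [12 / (n(n + 1)) * sum(T_j^2 / n_j)] - 3(n + 1)
h = 12 / (n * (n + 1)) * sum(
    rank_sums[name] ** 2 / len(values) for name, values in samples.items()
) - 3 * (n + 1)

df = c - 1
print(rank_sums, f"H = {h:.2f}, d.f. = {df}")
```

With tied values, the ranking step would need to assign average ranks instead of the one-to-one mapping used here.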
-
Since H = 6.72
-
Chapter Summary
Developed and applied the χ² test for the difference between two proportions
Developed and applied the χ² test for differences in more than two proportions
Examined the χ² test for independence
Used the Wilcoxon rank sum test for two population medians (small samples; large sample Z approximation)
Applied the Kruskal-Wallis H-test for multiple population medians