Activity 23-4: Students' Credit (cont.)
(a)-(c) Answers will vary.
(d) Should agree
(e) Observational study
Activity 23-5: Age and Political Ideology (cont.)
(a) Let q <30 = proportion of all people under 30 who consider themselves liberal, and let q >50 = proportion of all people over 50 who consider themselves liberal.
H0: q <30 = q >50 (no differences in the two age groups)
Ha: q <30 ≠ q >50 (the proportion of liberals differs in the two age groups)
z = (.2804 - .1502)/sqrt(.1939(1-.1939)(1/296+1/586)) = 4.62
p-value = 2 Pr(Z>|4.62|) = essentially zero.
This small p-value provides strong evidence that the proportion of liberals differs in the two age groups.
(c) We have evidence of a significant difference in the proportion who consider themselves liberal in the two age groups. In fact, we are 95% confident that q <30 - q >50 is between .07 and .19, meaning the percentage who consider themselves liberal is between 7 and 19 percentage points higher in the under 30 group.
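The test in (a) and the interval in (c) can be reproduced with a short Python sketch. The counts 83 of 296 and 88 of 586 are an assumption, back-calculated from the reported proportions .2804 and .1502, and the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: q1 = q2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # combined sample proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def two_prop_ci(x1, n1, x2, n2, level=0.95):
    """Unpooled confidence interval for q1 - q2."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - margin, p1 - p2 + margin

# Assumed counts (back-calculated): 83/296 under 30, 88/586 over 50.
z = two_prop_z(83, 296, 88, 586)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided
low, high = two_prop_ci(83, 296, 88, 586)
print(f"z = {z:.2f}, p-value = {p_value:.2g}, 95% CI = ({low:.3f}, {high:.3f})")
```

The test uses the combined proportion in the standard error because H0 assumes the two proportions are equal, while the interval uses the two separate sample proportions.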
(a) This is an observational study since the experimenter did not assign the patients to the BAP group and the control group. It is a case-control study as we examine those with and without the disease and then look back into their histories.
(b) (The sample sizes are large enough for the following to be valid,
and we are assuming independently selected samples, e.g. male and non-male)
| Characteristic | BAP | Control | Combined | z | p-value (2-sided) |
| Male | .875 | .8936 | .8873 | -.332 | .74 |
| White | .8958 | .9468 | .9296 | -1.123 | .2614 |
| Non-Hispanic | .7919 | .7979 | .7958 | -.087 | .9309 |
| AIDS | .5000 | .4681 | .4789 | .360 | .7188 |
| Own Cat | .6667 | .3936 | .4859 | 3.080 | .0021 |
| Cat Scratch | .6250 | .3085 | .4155 | 3.620 | .0003 |
| Cat Bite | .4375 | .1489 | .2465 | 3.774 | .0002 |
(c) No, since this was not an experiment. Also need to be cautious when running multiple significance tests on the same data set...
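One common way to act on the caution in (c) is a Bonferroni adjustment. This technique is not part of the original solution; it is offered only as a sketch of how seven tests on the same data set might be handled, with the overall level of .05 chosen as an assumption.

```python
# Two-sided p-values copied from the table in (b).
p_values = {
    "Male": 0.74, "White": 0.2614, "Non-Hispanic": 0.9309, "AIDS": 0.7188,
    "Own Cat": 0.0021, "Cat Scratch": 0.0003, "Cat Bite": 0.0002,
}
alpha = 0.05                        # assumed overall significance level
threshold = alpha / len(p_values)   # Bonferroni: split alpha across the 7 tests
still_significant = [name for name, p in p_values.items() if p < threshold]
print(f"per-test threshold = {threshold:.4f}")
print("significant after adjustment:", still_significant)
```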
Activity 23-7: Baldness and Heart Disease (cont.)
(a) Heart Disease: Some or more = 247/663 = .3726
Control Group: 220/772 = .2850
(b) Let q heart = proportion of all heart attack patients with at least some baldness. Let q control = proportion of non-heart attack people with at least some baldness.
H0: q heart = q control (these proportions are the same)
Ha: q heart ≠ q control (these proportions differ)
z = (.3726 - .2850)/sqrt(.3254(1-.3254)(1/663+1/772)) = 3.53
p-value = 2Pr(Z>|3.53|) < 2(.0002) = .0004
(c) 97.5% c.i. for q heart - q control: .3726 - .2850 ± 2.24 sqrt(.3726(1-.3726)/663 + .2850(1-.2850)/772) = .0876 ± .0556 = (.032,.1432). We are 97.5% confident that a higher percentage of men (3% to 14% more) who have had heart attacks consider themselves as having at least some baldness.
(d) There appears to be an association, but we can not say the heart attacks caused the baldness since this was an observational study and not an experiment.
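The multiplier 2.24 in (c) is the standard normal value that leaves 1.25% in each tail. A minimal Python sketch, using the counts from (a), shows where it comes from and rebuilds the interval; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

level = 0.975
z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)   # upper 1.25% point

x1, n1 = 247, 663    # heart attack patients with at least some baldness
x2, n2 = 220, 772    # control group with at least some baldness
p1, p2 = x1 / n1, x2 / n2
margin = z_star * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
low, high = p1 - p2 - margin, p1 - p2 + margin
print(f"z* = {z_star:.2f}, interval = ({low:.3f}, {high:.3f})")
```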
Activity 23-8: Sex on Television
(a) 1981: 6/47 = .1277; 1991: 44/615=.0715
(b) Let q 1981 = proportion of all 1981 sexual references which describe married sex. Let q 1991 = proportion of all 1991 sexual references which described married sex.
H0: q 1981 = q 1991 (proportion unchanged)
Ha: q 1981 > q 1991 (proportion decreased in 1991)
z = (.1277 - .0715)/sqrt(.0755(1-.0755)(1/47+1/615)) = 1.41
p-value = Pr(Z>1.41) = .0793
(c) 99% c.i. for q 1981 - q 1991: .1277 - .0715 ± 2.576 sqrt(.1277(1-.1277)/47 + .0715(1-.0715)/615) = .0562 ± .1282 = (-.072 , .1844).
(d) 95% c.i. for q : .0715 ± 1.96 sqrt(.0715(1-.0715)/615) = .0715 ± .0203 = (.0511, .0919)
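The one-sided test in (b) and the one-sample interval in (d) can be checked with the sketch below. It carries z at full precision, so the p-value may differ slightly in the third decimal place from a table lookup at z = 1.41.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 6, 47      # 1981 references describing married sex
x2, n2 = 44, 615    # 1991 references describing married sex
p1, p2 = x1 / n1, x2 / n2

# (b) One-sided test of Ha: q1981 > q1991, using the pooled proportion.
pooled = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
p_value = 1 - NormalDist().cdf(z)          # upper tail only

# (d) One-sample 95% interval for the 1991 proportion.
z_star = NormalDist().inv_cdf(0.975)
margin = z_star * sqrt(p2 * (1 - p2) / n2)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print(f"95% CI for 1991: ({p2 - margin:.4f}, {p2 + margin:.4f})")
```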
Activity 23-9: Heart By-Pass Surgery
(a) Central: 125/3676 = .034; Southeast: 288/6313 = .0456; Western: 167/4906=.034.
The Southeast region had the highest death rate. The central and western regions had lower death rates.
(b) (The sample sizes are large enough for the following to be valid,
and we are assuming independently selected samples.)
| Central vs. SE: | z=-2.812 | p-value=.0049 |
| Central vs. Western: | z=-.009 | p-value=.9928 |
| SE vs. Western: | z=3.084 | p-value=.002 |
(c) Condition of patients prior to operation.
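The three pairwise comparisons in (b) follow the same recipe, so they can be run in a loop. This is a sketch using the counts from (a); the dictionary layout is just one convenient choice.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist

# (deaths, operations) by region, from part (a)
regions = {"Central": (125, 3676), "Southeast": (288, 6313), "Western": (167, 4906)}

results = {}
for (name1, (x1, n1)), (name2, (x2, n2)) in combinations(regions.items(), 2):
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided
    results[(name1, name2)] = (z, p_value)
    print(f"{name1} vs. {name2}: z = {z:.3f}, p-value = {p_value:.4f}")
```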
Activity 23-10: Employment Discrimination
(a) Blacks: 26/48 = .5417; Whites: 206/256=.8047
(b) Let q Black = proportion of all black applicants who would pass the test. Let q White = proportion of all white applicants who would pass the test.
H0: q Black = q White (Blacks and Whites pass at the same rate)
HA: q Black < q White (The proportion of Blacks who pass the test is smaller than the proportion of whites)
(The sample sizes are large enough for the following to be valid, and we are assuming independently selected samples.)
Combined sample proportion = (26+206)/(48+256) = .7632
z = (.5417-.8047)/sqrt(.7632(1-.7632)(1/48+1/256)) = -3.93
p-value = Pr(Z<-3.93) < .0002
We have strong evidence that the proportion of black applicants passing the tests is significantly lower than the proportion of white applicants passing the test (at the .05 and .01 level). That is, if they were passing at the same rate, we would see a difference this big by chance alone in less than .02% of samples from this population.
Activity 23-11: Campus Alcohol Habits (cont.)
1982: Fights after drinking: 502/4324 = .1161; Trouble with the law due to drinking: 190/4324 = .0439
1991: Fights after drinking: 657/3820 = .1720; Trouble with the law due to drinking: 290/3820 = .0759
These sample sizes are large enough to apply the test of two proportions, and we're assuming the 1982 and 1991 samples were independently selected.
Let q 82F = proportion of all 1982 college students who got into a fight after drinking. Let q 91F = proportion of all 1991 college students who got into a fight after drinking.
H0: q 82F = q 91F (no difference in the proportion getting into a fight in the two years)
HA: q 82F ≠ q 91F (there is a difference in the proportion getting into a fight in the two years)
Combined sample proportion = (502+657)/(4324+3820) = .1423
z = (.1161 - .1720) /sqrt(.1423(1-.1423)(1/4324+1/3820)) = -7.21
p-value = 2 Pr(Z>|-7.21|) = essentially zero.
95% c.i. for q 82F - q 91F: (-.071, -.041)
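H0: q 82L = q 91L (no difference in the proportion getting into trouble with the law in the two years)
HA: q 82L ≠ q 91L (there is a difference in the proportion getting into trouble with the law in the two years)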
Combined sample proportion = (190+290)/(4324+3820) = .0589
z = (.0439 - .0759) / sqrt(.0589(1-.0589)(1/4324+1/3820)) = -6.12
p-value = 2 Pr(Z>|-6.12|) = essentially zero.
95% c.i. for q 82L - q 91L: (-.042, -.022)
(a) Sample size
(b) n1=50, n2 =50 (4% of 50 = 2 daughters, 26% of 50 = 13 daughters)
z = (.04 - .26)/sqrt(.15(.85)(2/50)) = -3.08
p-value = 2Pr(Z>|-3.08|) = 2(.001) = .002
z = (.04-.26)/sqrt(.084(1-.084)(1/200+1/50)) = -5.02
p-value essentially zero.
z = (.04-.26)/sqrt(.15(1-.15)(1/200+1/200)) = -6.16
p-value essentially zero.
(f) No. Since this was not an experiment, we can't conclude causation; there are many other potential confounding factors.
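To see how sample size alone changes the test statistic in this activity, the sketch below holds the sample proportions at .04 and .26 and varies only n1 and n2. The three size pairs are the ones that appear in the z calculations above; the helper name is illustrative.

```python
from math import sqrt

def pooled_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic from sample proportions and sizes."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    return (p1 - p2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Same sample proportions (.04 and .26), different sample sizes.
z_by_size = {(n1, n2): pooled_z(0.04, n1, 0.26, n2)
             for n1, n2 in [(50, 50), (200, 50), (200, 200)]}
for (n1, n2), z in z_by_size.items():
    print(f"n1 = {n1}, n2 = {n2}: z = {z:.2f}")
```

Note that the combined proportion changes when the two groups have unequal sizes, because the larger group carries more weight.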
Activity 23-13: Kids' Smoking (cont.)
(a) n1=60, n2=60 (9 sons of non smokers, 12 sons of smokers)
z = (.15-.20)/sqrt(.175(1-.175)(2/60)) = -.72
p-value = 2Pr(Z>|-.72|) = 2(.2358) = .4716
z = (.15-.20)/sqrt(.175(1-.175)(2/200)) = -1.32
p-value = 2Pr(Z>|-1.32|) = 2(.0934) = .1868
z = (.15-.20)/sqrt(.175(1-.175)(2/500)) = -2.08
p-value = 2Pr(Z>|-2.08|) = 2(.0188) = .0376
1.96 sqrt(.175(1-.175)2)/.05 = sqrt(n)
n = 443.7 so need 444 in each group (concurs with (b) and (c), needed to be between 200 and 500)
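The sample size calculation can be written out as a short sketch: set the z statistic equal to the two-sided .05 critical value and solve for n, assuming equal group sizes and the same sample proportions of .15 and .20.

```python
from math import ceil
from statistics import NormalDist

p_pooled = 0.175          # combined proportion when the two groups are equal in size
diff = 0.20 - 0.15        # size of the observed difference
z_star = NormalDist().inv_cdf(0.975)   # about 1.96 for a two-sided .05 test

# Solve z_star = diff / sqrt(p(1-p) * 2/n) for n.
n_exact = 2 * p_pooled * (1 - p_pooled) * (z_star / diff) ** 2
n_needed = ceil(n_exact)  # round up to a whole number of subjects
print(f"n = {n_exact:.1f}, so use {n_needed} per group")
```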
(a) Let q new = proportion of all patients with the new treatment who recover. Let q Old = proportion of all patients with the old treatment who recover.
HA: q new > q Old (a higher proportion of patients with the new treatment recover)
Combined sample proportion = (.867(50000) + .873(50000))/(50000+50000) = .87
z = (.867 - .873)/sqrt(.87(.13)(2/50000)) = -2.82
p-value = .0024
(b) 99% c.i. for q new - q Old: .867-.873 ± 2.576 sqrt(.867(1-.867)/50000 + .873(1-.873)/50000) = -.006 ± .0055 = (-.0115, -.0005)
(c) While we have a statistically significant result (at the .01 level) indicating that a higher proportion of patients recover with the new treatment, the confidence interval tells us that the difference is only between .05% and 1.15% of patients, not a very significant result in a practical sense.
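The contrast between statistical and practical significance can be seen by rerunning the arithmetic. The sketch below takes the two sample proportions in the order they are subtracted in (a) and (b), with 50,000 patients per group; the names p_a and p_b are placeholders, not labels from the original problem.

```python
from math import sqrt
from statistics import NormalDist

n = 50000
p_a, p_b = 0.867, 0.873     # recovery proportions, in the order they are subtracted

pooled = (p_a * n + p_b * n) / (2 * n)
z = (p_a - p_b) / sqrt(pooled * (1 - pooled) * (2 / n))
tail = NormalDist().cdf(-abs(z))             # one-tail area beyond |z|

z_star = NormalDist().inv_cdf(0.995)         # 99% confidence
margin = z_star * sqrt(p_a * (1 - p_a) / n + p_b * (1 - p_b) / n)
low, high = p_a - p_b - margin, p_a - p_b + margin
print(f"z = {z:.2f}, one-tail area = {tail:.4f}")
print(f"99% CI: ({low:.4f}, {high:.4f})")
```

With samples this large, even a difference of about half a percentage point produces a small tail area, which is why the interval is the more informative summary.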
Activity 23-15: Comparing Proportions of Personal Interest
Answers will vary.