9.1 Comparison of Two Population Means: Large, Independent Samples

Learning Objectives

  1. To understand the logical framework for estimating the difference between the means of two distinct populations and performing tests of hypotheses concerning those means.
  2. To learn how to construct a confidence interval for the difference in the means of two distinct populations using large, independent samples.
  3. To learn how to perform a test of hypotheses concerning the difference between the means of two distinct populations using large, independent samples.

Suppose we wish to compare the means of two distinct populations. Figure 9.1 "Independent Sampling from Two Populations" illustrates the conceptual framework of our investigation in this and the next section. Each population has a mean and a standard deviation. We arbitrarily label one population as Population 1 and the other as Population 2, and subscript the parameters with the numbers 1 and 2 to tell them apart. We draw a random sample from Population 1 and label the sample statistics it yields with the subscript 1. Without reference to the first sample we draw a sample from Population 2 and label its sample statistics with the subscript 2.

Figure 9.1 Independent Sampling from Two Populations

Definition

Samples from two distinct populations are independent if each one is drawn without reference to the other, and has no connection with the other.

Our goal is to use the information in the samples to estimate the difference $\mu_1-\mu_2$ in the means of the two populations and to make statistically valid inferences about it.

Confidence Intervals

Since the mean $\bar{x}_1$ of the sample drawn from Population 1 is a good estimator of $\mu_1$ and the mean $\bar{x}_2$ of the sample drawn from Population 2 is a good estimator of $\mu_2$, a reasonable point estimate of the difference $\mu_1-\mu_2$ is $\bar{x}_1-\bar{x}_2$. In order to widen this point estimate into a confidence interval, we first suppose that both samples are large, that is, that both $n_1 \ge 30$ and $n_2 \ge 30$. If so, then the following formula for a confidence interval for $\mu_1-\mu_2$ is valid. The symbols $s_1^2$ and $s_2^2$ denote the squares of $s_1$ and $s_2$. (In the relatively rare case that both population standard deviations $\sigma_1$ and $\sigma_2$ are known, they would be used instead of the sample standard deviations.)

$100(1-\alpha)\%$ Confidence Interval for the Difference Between Two Population Means: Large, Independent Samples

$$(\bar{x}_1-\bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$$

The samples must be independent, and each sample must be large: $n_1 \ge 30$ and $n_2 \ge 30$.
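
The interval formula above can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function name `two_mean_ci` and its parameter names are hypothetical, and the critical value $z_{\alpha/2}$ is computed with the standard library's `statistics.NormalDist` instead of being read from a table. The sample call uses the summary statistics from the first part of Basic Exercise 1.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_ci(xbar1, s1, n1, xbar2, s2, n2, confidence):
    """Large-sample confidence interval for mu1 - mu2 (independent samples).

    Hypothetical helper written for this illustration.
    """
    if n1 < 30 or n2 < 30:
        raise ValueError("both samples must be large: n1 >= 30 and n2 >= 30")
    alpha = 1 - confidence
    # z_{alpha/2}: the value that cuts off a right tail of area alpha/2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Estimated standard error of xbar1 - xbar2
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    margin = z * se
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Summary statistics from Basic Exercise 1, part 1 (90% confidence)
low, high = two_mean_ci(27, 2, 45, 22, 3, 60, confidence=0.90)
print(f"({low:.2f}, {high:.2f})")  # prints (4.20, 5.80)
```

The function refuses small samples because the formula is justified only when both samples have at least 30 observations.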

Example 1

To compare customer satisfaction levels of two competing cable television companies, 174 customers of Company 1 and 355 customers of Company 2 were randomly selected and were asked to rate their cable companies on a five-point scale, with 1 being least satisfied and 5 most satisfied. The survey results are summarized in the following table:

Company 1              Company 2
$n_1=174$              $n_2=355$
$\bar{x}_1=3.51$       $\bar{x}_2=3.24$
$s_1=0.51$             $s_2=0.52$

Construct a point estimate and a 99% confidence interval for $\mu_1-\mu_2$, the difference in average satisfaction levels of customers of the two companies as measured on this five-point scale.

Solution:

The point estimate of $\mu_1-\mu_2$ is

$$\bar{x}_1-\bar{x}_2 = 3.51-3.24 = 0.27.$$

In words, we estimate that the average customer satisfaction level for Company 1 is 0.27 points higher on this five-point scale than it is for Company 2.

To apply the formula for the confidence interval, proceed exactly as was done in Chapter 7 "Estimation". The 99% confidence level means that $\alpha = 1-0.99 = 0.01$, so that $z_{\alpha/2} = z_{0.005}$. From Figure 12.3 "Critical Values of " we read directly that $z_{0.005} = 2.576$. Thus

$$(\bar{x}_1-\bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}} = 0.27 \pm 2.576\sqrt{\frac{0.51^2}{174}+\frac{0.52^2}{355}} = 0.27 \pm 0.12$$

We are 99% confident that the difference in the population means lies in the interval $[0.15, 0.39]$, in the sense that in repeated sampling 99% of all intervals constructed from the sample data in this manner will contain $\mu_1-\mu_2$. In the context of the problem we say we are 99% confident that the average level of customer satisfaction for Company 1 is between 0.15 and 0.39 points higher, on this five-point scale, than that for Company 2.
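
The phrase "in repeated sampling 99% of all intervals ... will contain" the true difference can be checked by simulation. The sketch below is an illustration only: the two normal populations, their means and standard deviations, the sample sizes, and the number of repetitions are all hypothetical values chosen for the demonstration, not data from the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical populations, chosen only for this illustration
MU1, SIGMA1, N1 = 3.5, 0.5, 40
MU2, SIGMA2, N2 = 3.2, 0.5, 50
TRUE_DIFF = MU1 - MU2

Z_CRIT = NormalDist().inv_cdf(0.995)  # z_{0.005}, for 99% confidence


def interval_covers_true_difference():
    """Draw two independent samples; report whether the 99% interval contains mu1 - mu2."""
    sample1 = [random.gauss(MU1, SIGMA1) for _ in range(N1)]
    sample2 = [random.gauss(MU2, SIGMA2) for _ in range(N2)]
    diff = mean(sample1) - mean(sample2)
    # variance() returns the sample variance s^2
    margin = Z_CRIT * sqrt(variance(sample1) / N1 + variance(sample2) / N2)
    return diff - margin <= TRUE_DIFF <= diff + margin


REPS = 2000
coverage = sum(interval_covers_true_difference() for _ in range(REPS)) / REPS
print(f"fraction of intervals containing mu1 - mu2: {coverage:.3f}")
```

The printed fraction should be close to 0.99. Any single interval either contains $\mu_1-\mu_2$ or does not; the confidence level describes the long-run behavior of the procedure.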

Hypothesis Testing

Hypotheses concerning the relative sizes of the means of two populations are tested using the same critical value and p-value procedures that were used in the case of a single population. All that is needed is to know how to express the null and alternative hypotheses and to know the formula for the standardized test statistic and the distribution that it follows.

The null and alternative hypotheses will always be expressed in terms of the difference of the two population means. Thus the null hypothesis will always be written

$$H_0: \mu_1-\mu_2 = D_0$$

where $D_0$ is a number that is deduced from the statement of the situation. As was the case with a single population, the alternative hypothesis can take one of three forms, with the same terminology:

Form of $H_a$                  Terminology
$H_a: \mu_1-\mu_2 < D_0$       Left-tailed
$H_a: \mu_1-\mu_2 > D_0$       Right-tailed
$H_a: \mu_1-\mu_2 \ne D_0$     Two-tailed

As long as the samples are independent and both are large, the following formula for the standardized test statistic is valid, and the statistic has the standard normal distribution. (In the relatively rare case that both population standard deviations $\sigma_1$ and $\sigma_2$ are known, they would be used instead of the sample standard deviations.)

Standardized Test Statistic for Hypothesis Tests Concerning the Difference Between Two Population Means: Large, Independent Samples

$$Z = \frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$$

The test statistic has the standard normal distribution.

The samples must be independent, and each sample must be large: $n_1 \ge 30$ and $n_2 \ge 30$.
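
As a minimal sketch of the test statistic, the function below evaluates the formula directly from summary statistics. The name `z_statistic` and its parameters are hypothetical, introduced only for this illustration; the sample call uses the numbers from the first part of Basic Exercise 5, where the hypothesized difference is $D_0 = 3$.

```python
from math import sqrt


def z_statistic(xbar1, s1, n1, xbar2, s2, n2, d0=0.0):
    """Standardized test statistic for H0: mu1 - mu2 = d0.

    Hypothetical helper; valid for large, independent samples.
    """
    se = sqrt(s1**2 / n1 + s2**2 / n2)  # estimated standard error
    return ((xbar1 - xbar2) - d0) / se


# Basic Exercise 5, part 1: H0: mu1 - mu2 = 3
z = z_statistic(25, 1, 35, 19, 2, 45, d0=3)
print(round(z, 3))  # prints 8.753
```

Note that $D_0$ is subtracted from the difference of the sample means before dividing, so a negative $D_0$ increases the numerator.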

Example 2

Refer to Note 9.4 "Example 1" concerning the mean satisfaction levels of customers of two competing cable television companies. Test at the 1% level of significance whether the data provide sufficient evidence to conclude that Company 1 has a higher mean satisfaction rating than does Company 2. Use the critical value approach.

Solution:

  • Step 1. If the mean satisfaction levels $\mu_1$ and $\mu_2$ are the same then $\mu_1=\mu_2$, but we always express the null hypothesis in terms of the difference between $\mu_1$ and $\mu_2$, hence $H_0$ is $\mu_1-\mu_2=0$. To say that the mean customer satisfaction for Company 1 is higher than that for Company 2 means that $\mu_1>\mu_2$, which in terms of their difference is $\mu_1-\mu_2>0$. The test is therefore

    $$H_0: \mu_1-\mu_2=0 \quad \text{vs.} \quad H_a: \mu_1-\mu_2>0 \quad @\ \alpha=0.01$$
  • Step 2. Since the samples are independent and both are large, the test statistic is

    $$Z = \frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$$
  • Step 3. Inserting the data into the formula for the test statistic gives

    $$Z = \frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}} = \frac{(3.51-3.24)-0}{\sqrt{\frac{0.51^2}{174}+\frac{0.52^2}{355}}} = 5.684$$
  • Step 4. Since the symbol in $H_a$ is “>” this is a right-tailed test, so there is a single critical value, $z_\alpha = z_{0.01}$, which from the last line in Figure 12.3 "Critical Values of " we read off as 2.326. The rejection region is $[2.326, \infty)$.

    Figure 9.2 Rejection Region and Test Statistic for Note 9.6 "Example 2"

  • Step 5. As shown in Figure 9.2, the test statistic falls in the rejection region. The decision is to reject $H_0$. In the context of the problem our conclusion is:

    The data provide sufficient evidence, at the 1% level of significance, to conclude that the mean customer satisfaction for Company 1 is higher than that for Company 2.
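
The critical value approach of Example 2 can be sketched as follows. This is an illustration under stated assumptions: the function `critical_value_test` and its `tail` argument are hypothetical names, and the critical values come from `statistics.NormalDist` and not from Figure 12.3.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_test(xbar1, s1, n1, xbar2, s2, n2, d0, alpha, tail):
    """Return (Z, reject) for H0: mu1 - mu2 = d0.

    Hypothetical helper; tail is 'left', 'right', or 'two'.
    """
    z = ((xbar1 - xbar2) - d0) / sqrt(s1**2 / n1 + s2**2 / n2)
    std_normal = NormalDist()
    if tail == "right":
        # rejection region [z_alpha, infinity)
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        # rejection region (-infinity, -z_alpha]
        reject = z <= std_normal.inv_cdf(alpha)
    elif tail == "two":
        # rejection region (-infinity, -z_{alpha/2}] together with [z_{alpha/2}, infinity)
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, reject


# Example 2: right-tailed test at alpha = 0.01 with the cable company data
z, reject = critical_value_test(3.51, 0.51, 174, 3.24, 0.52, 355,
                                d0=0, alpha=0.01, tail="right")
print(f"Z = {z:.3f}, reject H0: {reject}")  # prints Z = 5.684, reject H0: True
```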

Example 3

Perform the test of Note 9.6 "Example 2" using the p-value approach.

Solution:

The first three steps are identical to those in Note 9.6 "Example 2".

  • Step 4. The observed significance or p-value of the test is the area of the right tail of the standard normal distribution that is cut off by the test statistic $Z = 5.684$. The number 5.684 is too large to appear in Figure 12.2 "Cumulative Normal Probability", which means that the area of the left tail that it cuts off is 1.0000 to four decimal places. The area that we seek, the area of the right tail, is therefore $1-1.0000=0.0000$ to four decimal places. See Figure 9.3. That is, p-value $= 0.0000$ to four decimal places. (The actual value is approximately 0.000000007.)

Figure 9.3 P-Value for Note 9.7 "Example 3"

  • Step 5. Since $0.0000 < 0.01$, p-value $< \alpha$, so the decision is to reject the null hypothesis:

    The data provide sufficient evidence, at the 1% level of significance, to conclude that the mean customer satisfaction for Company 1 is higher than that for Company 2.
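
The p-value computation can also be done in software, which avoids the four-decimal limit of the printed table. The sketch below is an illustration only; `p_value` and its `tail` argument are hypothetical names, and the tail areas are computed with `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def p_value(z, tail):
    """Observed significance of a standardized statistic z.

    Hypothetical helper; tail is 'left', 'right', or 'two'.
    """
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        # right-tail area, obtained from the left tail by symmetry
        return std_normal.cdf(-z)
    if tail == "two":
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Example 3: right-tailed test with the cable company data
z = (3.51 - 3.24) / sqrt(0.51**2 / 174 + 0.52**2 / 355)
p = p_value(z, "right")
print(f"Z = {z:.3f}, p-value = {p:.1e}")
```

The right-tail area is evaluated as the left-tail area at $-Z$, not as one minus a number that is nearly 1, so that very small p-values keep their precision.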

Key Takeaways

  • A point estimate for the difference in two population means is simply the difference in the corresponding sample means.
  • In the context of estimating or testing hypotheses concerning two population means, “large” samples means that both samples are large.
  • A confidence interval for the difference in two population means is computed using a formula in the same fashion as was done for a single population mean.
  • The same five-step procedure used to test hypotheses concerning a single population mean is used to test hypotheses concerning the difference between two population means. The only difference is in the formula for the standardized test statistic.

Exercises

    Basic

  1. Construct the confidence interval for $\mu_1-\mu_2$ for the level of confidence and the data from independent samples given.

    1. 90% confidence,

      $n_1=45$, $\bar{x}_1=27$, $s_1=2$

      $n_2=60$, $\bar{x}_2=22$, $s_2=3$

    2. 99% confidence,

      $n_1=30$, $\bar{x}_1=112$, $s_1=9$

      $n_2=40$, $\bar{x}_2=98$, $s_2=4$

  2. Construct the confidence interval for $\mu_1-\mu_2$ for the level of confidence and the data from independent samples given.

    1. 95% confidence,

      $n_1=110$, $\bar{x}_1=77$, $s_1=15$

      $n_2=85$, $\bar{x}_2=79$, $s_2=21$

    2. 90% confidence,

      $n_1=65$, $\bar{x}_1=83$, $s_1=12$

      $n_2=65$, $\bar{x}_2=74$, $s_2=8$

  3. Construct the confidence interval for $\mu_1-\mu_2$ for the level of confidence and the data from independent samples given.

    1. 99.5% confidence,

      $n_1=130$, $\bar{x}_1=27.2$, $s_1=2.5$

      $n_2=155$, $\bar{x}_2=38.8$, $s_2=4.6$

    2. 95% confidence,

      $n_1=68$, $\bar{x}_1=215.5$, $s_1=12.3$

      $n_2=84$, $\bar{x}_2=287.8$, $s_2=14.1$

  4. Construct the confidence interval for $\mu_1-\mu_2$ for the level of confidence and the data from independent samples given.

    1. 99.9% confidence,

      $n_1=275$, $\bar{x}_1=70.2$, $s_1=1.5$

      $n_2=325$, $\bar{x}_2=63.4$, $s_2=1.1$

    2. 90% confidence,

      $n_1=120$, $\bar{x}_1=35.5$, $s_1=0.75$

      $n_2=146$, $\bar{x}_2=29.6$, $s_2=0.80$

  5. Perform the test of hypotheses indicated, using the data from independent samples given. Use the critical value approach. Compute the p-value of the test as well.

    1. Test $H_0: \mu_1-\mu_2=3$ vs. $H_a: \mu_1-\mu_2 \ne 3$ @ $\alpha=0.05$,

      $n_1=35$, $\bar{x}_1=25$, $s_1=1$

      $n_2=45$, $\bar{x}_2=19$, $s_2=2$

    2. Test $H_0: \mu_1-\mu_2=-25$ vs. $H_a: \mu_1-\mu_2<-25$ @ $\alpha=0.10$,

      $n_1=85$, $\bar{x}_1=188$, $s_1=15$

      $n_2=62$, $\bar{x}_2=215$, $s_2=19$

  6. Perform the test of hypotheses indicated, using the data from independent samples given. Use the critical value approach. Compute the p-value of the test as well.

    1. Test $H_0: \mu_1-\mu_2=45$ vs. $H_a: \mu_1-\mu_2>45$ @ $\alpha=0.001$,

      $n_1=200$, $\bar{x}_1=1312$, $s_1=35$

      $n_2=225$, $\bar{x}_2=1256$, $s_2=28$

    2. Test $H_0: \mu_1-\mu_2=-12$ vs. $H_a: \mu_1-\mu_2 \ne -12$ @ $\alpha=0.10$,

      $n_1=35$, $\bar{x}_1=121$, $s_1=6$

      $n_2=40$, $\bar{x}_2=135$, $s_2=7$

  7. Perform the test of hypotheses indicated, using the data from independent samples given. Use the critical value approach. Compute the p-value of the test as well.

    1. Test $H_0: \mu_1-\mu_2=0$ vs. $H_a: \mu_1-\mu_2 \ne 0$ @ $\alpha=0.01$,

      $n_1=125$, $\bar{x}_1=46$, $s_1=10$

      $n_2=90$, $\bar{x}_2=50$, $s_2=13$

    2. Test $H_0: \mu_1-\mu_2=20$ vs. $H_a: \mu_1-\mu_2>20$ @ $\alpha=0.05$,

      $n_1=40$, $\bar{x}_1=142$, $s_1=11$

      $n_2=40$, $\bar{x}_2=118$, $s_2=10$

  8. Perform the test of hypotheses indicated, using the data from independent samples given. Use the critical value approach. Compute the p-value of the test as well.

    1. Test $H_0: \mu_1-\mu_2=13$ vs. $H_a: \mu_1-\mu_2<13$ @ $\alpha=0.01$,

      $n_1=35$, $\bar{x}_1=100$, $s_1=2$

      $n_2=35$, $\bar{x}_2=88$, $s_2=2$

    2. Test $H_0: \mu_1-\mu_2=-10$ vs. $H_a: \mu_1-\mu_2 \ne -10$ @ $\alpha=0.10$,

      $n_1=146$, $\bar{x}_1=62$, $s_1=4$

      $n_2=120$, $\bar{x}_2=73$, $s_2=7$

  9. Perform the test of hypotheses indicated, using the data from independent samples given. Use the p-value approach.

    1. Test $H_0: \mu_1-\mu_2=57$ vs. $H_a: \mu_1-\mu_2<57$ @ $\alpha=0.10$,

      $n_1=117$, $\bar{x}_1=1309$, $s_1=42$

      $n_2=133$, $\bar{x}_2=1258$, $s_2=37$

    2. Test $H_0: \mu_1-\mu_2=-1.5$ vs. $H_a: \mu_1-\mu_2 \ne -1.5$ @ $\alpha=0.20$,

      $n_1=65$, $\bar{x}_1=16.9$, $s_1=1.3$

      $n_2=57$, $\bar{x}_2=18.6$, $s_2=1.1$

  10. Perform the test of hypotheses indicated, using the data from independent samples given. Use the p-value approach.

    1. Test $H_0: \mu_1-\mu_2=-10.5$ vs. $H_a: \mu_1-\mu_2>-10.5$ @ $\alpha=0.01$,

      $n_1=64$, $\bar{x}_1=85.6$, $s_1=2.4$

      $n_2=50$, $\bar{x}_2=95.3$, $s_2=3.1$

    2. Test $H_0: \mu_1-\mu_2=110$ vs. $H_a: \mu_1-\mu_2 \ne 110$ @ $\alpha=0.02$,

      $n_1=176$, $\bar{x}_1=1918$, $s_1=68$

      $n_2=241$, $\bar{x}_2=1782$, $s_2=146$

  11. Perform the test of hypotheses indicated, using the data from independent samples given. Use the p-value approach.

    1. Test $H_0: \mu_1-\mu_2=50$ vs. $H_a: \mu_1-\mu_2>50$ @ $\alpha=0.005$,

      $n_1=72$, $\bar{x}_1=272$, $s_1=26$

      $n_2=103$, $\bar{x}_2=213$, $s_2=14$

    2. Test $H_0: \mu_1-\mu_2=7.5$ vs. $H_a: \mu_1-\mu_2 \ne 7.5$ @ $\alpha=0.10$,

      $n_1=52$, $\bar{x}_1=94.3$, $s_1=2.6$

      $n_2=38$, $\bar{x}_2=88.6$, $s_2=8.0$

  12. Perform the test of hypotheses indicated, using the data from independent samples given. Use the p-value approach.

    1. Test $H_0: \mu_1-\mu_2=23$ vs. $H_a: \mu_1-\mu_2<23$ @ $\alpha=0.20$,

      $n_1=314$, $\bar{x}_1=198$, $s_1=12.2$

      $n_2=220$, $\bar{x}_2=176$, $s_2=11.5$

    2. Test $H_0: \mu_1-\mu_2=4.4$ vs. $H_a: \mu_1-\mu_2 \ne 4.4$ @ $\alpha=0.05$,

      $n_1=32$, $\bar{x}_1=40.3$, $s_1=0.5$

      $n_2=30$, $\bar{x}_2=35.5$, $s_2=0.7$

    Applications

  1. In order to investigate the relationship between mean job tenure in years among workers who have a bachelor’s degree or higher and those who do not, random samples of each type of worker were taken, with the following results.

                                   $n$    $\bar{x}$    $s$
    Bachelor’s degree or higher    155    5.2          1.3
    No degree                      210    5.0          1.5
    1. Construct the 99% confidence interval for the difference in the population means based on these data.
    2. Test, at the 1% level of significance, the claim that mean job tenure among those with higher education is greater than among those without, against the default that there is no difference in the means.
    3. Compute the observed significance of the test.
  2. Records of 40 used passenger cars and 40 used pickup trucks (none used commercially) were randomly selected to investigate whether there was any difference in the mean time in years that they were kept by the original owner before being sold. For cars the mean was 5.3 years with standard deviation 2.2 years. For pickup trucks the mean was 7.1 years with standard deviation 3.0 years.

    1. Construct the 95% confidence interval for the difference in the means based on these data.
    2. Test the hypothesis that there is a difference in the means against the null hypothesis that there is no difference. Use the 1% level of significance.
    3. Compute the observed significance of the test in part (b).
  3. In previous years the average number of patients per hour at a hospital emergency room on weekends exceeded the average on weekdays by 6.3 visits per hour. A hospital administrator believes that the current weekend mean exceeds the weekday mean by fewer than 6.3 visits per hour.

    1. Construct the 99% confidence interval for the difference in the population means based on the following data, derived from a study in which 30 weekend and 30 weekday one-hour periods were randomly selected and the number of new patients in each recorded.

                  $n$    $\bar{x}$    $s$
      Weekends    30     13.8         3.1
      Weekdays    30     8.6          2.7
    2. Test at the 5% level of significance whether the current weekend mean exceeds the weekday mean by fewer than 6.3 patients per hour.
    3. Compute the observed significance of the test.
  4. A sociologist surveys 50 randomly selected citizens in each of two countries to compare the mean number of hours of volunteer work done by adults in each. Among the 50 inhabitants of Lilliput, the mean hours of volunteer work per year was 52, with standard deviation 11.8. Among the 50 inhabitants of Blefuscu, the mean number of hours of volunteer work per year was 37, with standard deviation 7.2.

    1. Construct the 99% confidence interval for the difference in mean number of hours volunteered by all residents of Lilliput and the mean number of hours volunteered by all residents of Blefuscu.
    2. Test, at the 1% level of significance, the claim that the mean number of hours volunteered by all residents of Lilliput is more than ten hours greater than the mean number of hours volunteered by all residents of Blefuscu.
    3. Compute the observed significance of the test in part (b).
  5. A university administrator asserted that upperclassmen spend more time studying than underclassmen.

    1. Test this claim against the default that the average number of hours of study per week by the two groups is the same, using the following information based on random samples from each group of students. Test at the 1% level of significance.

                       $n$    $\bar{x}$    $s$
      Upperclassmen    35     15.6         2.9
      Underclassmen    35     12.3         4.1
    2. Compute the observed significance of the test.
  6. A kinesiologist claims that the resting heart rate of men aged 18 to 25 who exercise regularly is more than five beats per minute less than that of men who do not exercise regularly. Men in each category were selected at random and their resting heart rates were measured, with the results shown.

                           $n$    $\bar{x}$    $s$
    Regular exercise       40     63           1.0
    No regular exercise    30     71           1.2
    1. Perform the relevant test of hypotheses at the 1% level of significance.
    2. Compute the observed significance of the test.
  7. Children in two elementary school classrooms were given two versions of the same test, but with the order of questions arranged from easier to more difficult in Version A and in reverse order in Version B. Randomly selected students from each class were given Version A and the rest Version B. The results are shown in the table.

                 $n$    $\bar{x}$    $s$
    Version A    31     83           4.6
    Version B    32     78           4.3
    1. Construct the 90% confidence interval for the difference in the means of the populations of all children taking Version A of such a test and of all children taking Version B of such a test.
    2. Test at the 1% level of significance the hypothesis that the A version of the test is easier than the B version (even though the questions are the same).
    3. Compute the observed significance of the test.
  8. The Municipal Transit Authority wants to know if, on weekdays, more passengers ride the northbound blue line train towards the city center that departs at 8:15 a.m. or the one that departs at 8:30 a.m. The following sample statistics are assembled by the Transit Authority.

                       $n$    $\bar{x}$    $s$
    8:15 a.m. train    30     323          41
    8:30 a.m. train    45     356          45
    1. Construct the 90% confidence interval for the difference in the mean number of daily travellers on the 8:15 train and the mean number of daily travellers on the 8:30 train.
    2. Test at the 5% level of significance whether the data provide sufficient evidence to conclude that more passengers ride the 8:30 train.
    3. Compute the observed significance of the test.
  9. In comparing the academic performance of college students who are affiliated with fraternities and those male students who are unaffiliated, a random sample of students was drawn from each of the two populations on a university campus. Summary statistics on the student GPAs are given below.

                    $n$    $\bar{x}$    $s$
    Fraternity      645    2.90         0.47
    Unaffiliated    450    2.88         0.42

    Test, at the 5% level of significance, whether the data provide sufficient evidence to conclude that there is a difference in average GPA between the population of fraternity students and the population of unaffiliated male students on this university campus.

  10. In comparing the academic performance of college students who are affiliated with sororities and those female students who are unaffiliated, a random sample of students was drawn from each of the two populations on a university campus. Summary statistics on the student GPAs are given below.

                    $n$    $\bar{x}$    $s$
    Sorority        330    3.18         0.37
    Unaffiliated    550    3.12         0.41

    Test, at the 5% level of significance, whether the data provide sufficient evidence to conclude that there is a difference in average GPA between the population of sorority students and the population of unaffiliated female students on this university campus.

  11. The owner of a professional football team believes that the league has become more offense oriented since five years ago. To check his belief, 32 randomly selected games from one year’s schedule were compared to 32 randomly selected games from the schedule five years later. Since more offense produces more points per game, the owner analyzed the following information on points per game (ppg).

                      $n$    $\bar{x}$    $s$
    ppg previously    32     20.62        4.17
    ppg recently      32     22.05        4.01

    Test, at the 10% level of significance, whether the data on points per game provide sufficient evidence to conclude that the game has become more offense oriented.

  12. The owner of a professional football team believes that the league has become more offense oriented since five years ago. To check his belief, 32 randomly selected games from one year’s schedule were compared to 32 randomly selected games from the schedule five years later. Since more offense produces more offensive yards per game, the owner analyzed the following information on offensive yards per game (oypg).

                       $n$    $\bar{x}$    $s$
    oypg previously    32     316          40
    oypg recently      32     336          35

    Test, at the 10% level of significance, whether the data on offensive yards per game provide sufficient evidence to conclude that the game has become more offense oriented.

    Large Data Set Exercises

  1. Large Data Sets 1A and 1B list the SAT scores for 1,000 randomly selected students. Denote the population of all male students as Population 1 and the population of all female students as Population 2.

    http://www.gone.2012books.lardbucket.org/sites/all/files/data1A.xls

    http://www.gone.2012books.lardbucket.org/sites/all/files/data1B.xls

    1. Restricting attention to just the males, find $n_1$, $\bar{x}_1$, and $s_1$. Restricting attention to just the females, find $n_2$, $\bar{x}_2$, and $s_2$.
    2. Let $\mu_1$ denote the mean SAT score for all males and $\mu_2$ the mean SAT score for all females. Use the results of part (a) to construct a 90% confidence interval for the difference $\mu_1-\mu_2$.
    3. Test, at the 5% level of significance, the hypothesis that the mean SAT score among males exceeds that among females.
  2. Large Data Sets 1A and 1B list the GPAs for 1,000 randomly selected students. Denote the population of all male students as Population 1 and the population of all female students as Population 2.

    http://www.gone.2012books.lardbucket.org/sites/all/files/data1A.xls

    http://www.gone.2012books.lardbucket.org/sites/all/files/data1B.xls

    1. Restricting attention to just the males, find $n_1$, $\bar{x}_1$, and $s_1$. Restricting attention to just the females, find $n_2$, $\bar{x}_2$, and $s_2$.
    2. Let $\mu_1$ denote the mean GPA for all males and $\mu_2$ the mean GPA for all females. Use the results of part (a) to construct a 95% confidence interval for the difference $\mu_1-\mu_2$.
    3. Test, at the 10% level of significance, the hypothesis that the mean GPAs among males and females differ.
  3. Large Data Sets 7A and 7B list the survival times for 65 male and 75 female laboratory mice with thymic leukemia. Denote the population of all such male mice as Population 1 and the population of all such female mice as Population 2.

    http://www.gone.2012books.lardbucket.org/sites/all/files/data7A.xls

    http://www.gone.2012books.lardbucket.org/sites/all/files/data7B.xls

    1. Restricting attention to just the males, find $n_1$, $\bar{x}_1$, and $s_1$. Restricting attention to just the females, find $n_2$, $\bar{x}_2$, and $s_2$.
    2. Let $\mu_1$ denote the mean survival time for all males and $\mu_2$ the mean survival time for all females. Use the results of part (a) to construct a 99% confidence interval for the difference $\mu_1-\mu_2$.
    3. Test, at the 1% level of significance, the hypothesis that the mean survival time for males exceeds that for females by more than 182 days (half a year).
    4. Compute the observed significance of the test in part (c).

Answers

    Basic

  1.
    1. $(4.20, 5.80)$
    2. $(9.46, 18.54)$
  3.
    1. $(-12.81, -10.39)$
    2. $(-76.50, -68.10)$
  5.
    1. $Z = 8.753$, $\pm z_{0.025} = \pm 1.960$, reject $H_0$, p-value = 0.0000
    2. $Z = -0.687$, $-z_{0.10} = -1.282$, do not reject $H_0$, p-value = 0.2451
  7.
    1. $Z = -2.444$, $\pm z_{0.005} = \pm 2.576$, do not reject $H_0$, p-value = 0.0146
    2. $Z = 1.702$, $z_{0.05} = 1.645$, reject $H_0$, p-value = 0.0446
  9.
    1. $Z = -1.19$, p-value = 0.1170, do not reject $H_0$
    2. $Z = -0.92$, p-value = 0.3576, do not reject $H_0$
  11.
    1. $Z = 2.68$, p-value = 0.0037, reject $H_0$
    2. $Z = -1.34$, p-value = 0.1802, do not reject $H_0$

    Applications

  1.
    1. $0.2 \pm 0.4$
    2. $Z = 1.360$, $z_{0.01} = 2.326$, do not reject $H_0$ (not greater)
    3. p-value = 0.0869
  3.
    1. $5.2 \pm 1.9$
    2. $Z = -1.466$, $-z_{0.05} = -1.645$, do not reject $H_0$ (exceeds by 6.3 or more)
    3. p-value = 0.0708
  5.
    1. $Z = 3.888$, $z_{0.01} = 2.326$, reject $H_0$ (upperclassmen study more)
    2. p-value = 0.0001
  7.
    1. $5 \pm 1.8$
    2. $Z = 4.454$, $z_{0.01} = 2.326$, reject $H_0$ (Version A is easier)
    3. p-value = 0.0000
  9. $Z = 0.738$, $\pm z_{0.025} = \pm 1.960$, do not reject $H_0$ (no difference)

  11. $Z = -1.398$, $-z_{0.10} = -1.282$, reject $H_0$ (more offense oriented)

    Large Data Set Exercises

  1.
    1. $n_1=419$, $\bar{x}_1=1540.33$, $s_1=205.40$, $n_2=581$, $\bar{x}_2=1520.38$, and $s_2=217.34$.
    2. $(-2.24, 42.15)$
    3. $H_0: \mu_1-\mu_2=0$ vs. $H_a: \mu_1-\mu_2>0$. Test Statistic: $Z = 1.48$. Rejection Region: $[1.645, \infty)$. Decision: Fail to reject $H_0$.
  3.
    1. $n_1=65$, $\bar{x}_1=665.97$, $s_1=41.60$, $n_2=75$, $\bar{x}_2=455.89$, and $s_2=63.22$.
    2. $(187.06, 233.09)$
    3. $H_0: \mu_1-\mu_2=182$ vs. $H_a: \mu_1-\mu_2>182$. Test Statistic: $Z = 3.14$. Rejection Region: $[2.33, \infty)$. Decision: Reject $H_0$.
    4. p-value = 0.0008