Okay, we finally tackle the probability distribution (also known as the "sampling distribution") of the sample mean when \(X_1, X_2, \ldots, X_n\) are a random sample from a normal population with mean \(\mu\) and variance \(\sigma^2\). The word "tackle" is probably not the right choice of word, because the result follows quite easily from the previous theorem, as stated in the following corollary.
If \(X_1, X_2, \ldots, X_n\) are observations of a random sample of size \(n\) from a \(N(\mu, \sigma^2)\) population, then the sample mean:

\(\bar{X}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i\)

is normally distributed with mean \(\mu\) and variance \(\dfrac{\sigma^2}{n}\). That is, the probability distribution of the sample mean is:

\(\bar{X}\sim N\left(\mu, \dfrac{\sigma^2}{n}\right)\)
The result follows directly from the previous theorem. All we need to do is recognize that the sample mean:

\(\bar{X}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i=\sum\limits_{i=1}^n \dfrac{1}{n} X_i\)

is a linear combination of the independent normal random variables \(X_1, X_2, \ldots, X_n\) with \(c_i=\dfrac{1}{n}\), mean \(\mu_i=\mu\), and variance \(\sigma^2_i=\sigma^2\). The moment generating function of the sample mean is then:
\(M_{\bar{X}}(t)=\exp\left[t\left(\sum\limits_{i=1}^n c_i \mu_i\right)+\dfrac{t^2}{2}\left(\sum\limits_{i=1}^n c^2_i \sigma^2_i\right)\right]=\exp\left[t\left(\sum\limits_{i=1}^n \dfrac{1}{n}\mu\right)+\dfrac{t^2}{2}\left(\sum\limits_{i=1}^n \left(\dfrac{1}{n}\right)^2\sigma^2\right)\right]\)
The first equality comes from the theorem on the previous page, about the distribution of a linear combination of independent normal random variables. The second equality comes from simply replacing \(c_i\) with \(\frac{1}{n}\), the mean \(\mu_i\) with \(\mu\), and the variance \(\sigma^2_i\) with \(\sigma^2\). Now, working on the summations, the moment generating function of the sample mean reduces to:
\(M_{\bar{X}}(t)=\exp\left[t\left(\dfrac{1}{n} \sum\limits_{i=1}^n \mu\right)+\dfrac{t^2}{2}\left(\dfrac{1}{n^2}\sum\limits_{i=1}^n \sigma^2\right)\right]=\exp\left[t\left(\dfrac{1}{n}(n\mu)\right)+\dfrac{t^2}{2}\left(\dfrac{1}{n^2}(n\sigma^2)\right)\right]=\exp\left[\mu t +\dfrac{t^2}{2} \left(\dfrac{\sigma^2}{n}\right)\right]\)
The first equality comes from pulling the constants depending on \(n\) through the summation signs. The second equality comes from adding \(\mu\) up \(n\) times to get \(n\mu\), and adding \(\sigma^2\) up \(n\) times to get \(n\sigma^2\). The last equality comes from simplifying a bit more. In summary, we have shown that the moment generating function of the sample mean of \(n\) independent normal random variables with mean \(\mu\) and variance \(\sigma^2\) is:

\(M_{\bar{X}}(t)=\exp\left[\mu t +\dfrac{t^2}{2} \left(\dfrac{\sigma^2}{n}\right)\right]\)
That is the same as the moment generating function of a normal random variable with mean \(\mu\) and variance \(\frac{\sigma^2}{n}\). Therefore, the uniqueness property of moment-generating functions tells us that the sample mean must be normally distributed with mean \(\mu\) and variance \(\frac{\sigma^2}{n}\). Our proof is complete.
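As a quick numeric sanity check of the algebra in the proof (a sketch of my own, not part of the original argument), we can evaluate the linear-combination MGF with \(c_i=\frac{1}{n}\) and compare it against the MGF of a \(N(\mu, \frac{\sigma^2}{n})\) random variable at a few values of \(t\):

```python
import math

# Spot-check of the MGF algebra above: after substituting c_i = 1/n,
# mu_i = mu, and sigma_i^2 = sigma^2, the n identical summands collapse,
# and the result should equal the MGF of a N(mu, sigma^2/n) variable.
def mgf_linear_combination(t, mu, sigma2, n):
    # exp[t * sum(c_i * mu_i) + (t^2/2) * sum(c_i^2 * sigma_i^2)], c_i = 1/n
    return math.exp(t * (n * (mu / n)) + (t**2 / 2) * (n * (sigma2 / n**2)))

def mgf_normal(t, mean, var):
    # MGF of N(mean, var): exp[mean*t + var*t^2/2]
    return math.exp(mean * t + (var * t**2) / 2)

# Using the IQ example's numbers (mu = 100, sigma^2 = 256, n = 8):
for t in (0.001, 0.005, 0.01):
    a = mgf_linear_combination(t, mu=100, sigma2=256, n=8)
    b = mgf_normal(t, mean=100, var=256 / 8)
    assert math.isclose(a, b, rel_tol=1e-12)
```

The two expressions agree at every \(t\), which is exactly what the uniqueness argument requires.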
Let \(X_i\) denote the Stanford-Binet Intelligence Quotient (IQ) of a randomly selected individual, \(i=1, \ldots, 4\) (one sample). Let \(Y_i\) denote the IQ of a randomly selected individual, \(i=1, \ldots, 8\) (a second sample). Recalling that IQs are normally distributed with mean \(\mu=100\) and variance \(\sigma^2=16^2\), what is the distribution of \(\bar{X}\)? And, what is the distribution of \(\bar{Y}\)?
In general, the variance of the sample mean is:

\(Var(\bar{X})=\dfrac{\sigma^2}{n}\)

Therefore, the variance of the sample mean of the first sample is:

\(Var(\bar{X}_4)=\dfrac{16^2}{4}=64\)

(The subscript 4 is there just to remind us that the sample mean is based on a sample of size 4.) And, the variance of the sample mean of the second sample is:

\(Var(\bar{Y}_8)=\dfrac{16^2}{8}=32\)

(The subscript 8 is there just to remind us that the sample mean is based on a sample of size 8.) Now, the corollary therefore tells us that the sample mean of the first sample is normally distributed with mean 100 and variance 64. That is:

\(\bar{X}_4\sim N(100, 64)\)

And, the sample mean of the second sample is normally distributed with mean 100 and variance 32. That is:

\(\bar{Y}_8\sim N(100, 32)\)
So, we have two, no actually, three normal random variables with the same mean, but different variances:

\(X_i\sim N(100, 256)\), \(\bar{X}_4\sim N(100, 64)\), and \(\bar{Y}_8\sim N(100, 32)\)
It is quite informative to graph these three distributions on the same plot. Doing so, we get:
As the plot suggests, an individual \(X_i\), the sample mean \(\bar{X}_4\) based on four individuals, and the sample mean \(\bar{Y}_8\) based on eight individuals all provide valid, "unbiased" estimates of the population mean \(\mu\). But our intuition coincides with reality: the sample mean \(\bar{Y}_8\), with the smallest variance, will be the most precise estimate of \(\mu\).
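The shape of that plot is easy to reproduce numerically. The following sketch (my own illustration, not the original plotting code) evaluates the three normal densities at their common mean, where the smaller-variance curves peak higher and are therefore more concentrated around \(\mu=100\):

```python
import math

# Density of N(mean, var) at x, for comparing the three distributions:
# X_i ~ N(100, 256), X-bar_4 ~ N(100, 64), Y-bar_8 ~ N(100, 32).
def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

variances = {"individual X_i": 256, "mean of 4": 64, "mean of 8": 32}

# The peak height at x = 100 grows as the variance shrinks, which is why
# the sample mean of eight gives the most precise estimate of mu = 100.
for label, var in variances.items():
    print(f"{label}: variance={var}, sd={math.sqrt(var):.2f}, "
          f"peak density={normal_pdf(100, 100, var):.4f}")
```

Halving the variance raises the peak by a factor of \(\sqrt{2}\), so each curve is visibly narrower and taller than the last.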
All the work that we have done so far concerning this example has been theoretical in nature. That is, what we have learned is based on probability theory. Would we see the same kind of result if we were to take a large number of samples, say 1000, of sizes 4 and 8, and calculate the sample mean of each sample? That is, would the distribution of the 1000 sample means based on a sample of size 4 look like a normal distribution with mean 100 and variance 64? And would the distribution of the 1000 sample means based on a sample of size 8 look like a normal distribution with mean 100 and variance 32? Well, the only way to answer these questions is to try it out!
I did just that for us. I used Minitab to generate 1000 samples of eight random numbers from a normal distribution with mean 100 and variance 256. Here's a subset of the resulting random numbers:
ROW | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | Mean4 | Mean8 |
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
1 | 87 | 68 | 98 | 114 | 59 | 111 | 114 | 86 | 91.75 | 92.125 |
2 | 102 | 81 | 74 | 110 | 112 | 106 | 105 | 99 | 91.75 | 98.625 |
3 | 96 | 87 | 50 | 88 | 69 | 107 | 94 | 83 | 80.25 | 84.250 |
4 | 83 | 134 | 122 | 80 | 117 | 110 | 115 | 158 | 104.75 | 114.875 |
5 | 92 | 87 | 120 | 93 | 90 | 111 | 95 | 92 | 98.00 | 97.500 |
6 | 139 | 102 | 100 | 103 | 111 | 62 | 78 | 73 | 111.00 | 96.000 |
7 | 134 | 121 | 99 | 118 | 108 | 106 | 103 | 91 | 118.00 | 110.000 |
8 | 126 | 92 | 148 | 131 | 99 | 106 | 143 | 128 | 124.25 | 121.625 |
9 | 98 | 109 | 119 | 110 | 124 | 99 | 119 | 82 | 109.00 | 107.500 |
10 | 85 | 93 | 82 | 106 | 93 | 109 | 100 | 95 | 91.50 | 95.375 |
11 | 121 | 103 | 108 | 96 | 112 | 117 | 93 | 112 | 107.00 | 107.750 |
12 | 118 | 91 | 106 | 108 | 128 | 96 | 65 | 85 | 105.75 | 99.625 |
13 | 92 | 87 | 96 | 81 | 86 | 105 | 91 | 104 | 89.00 | 92.750 |
14 | 94 | 115 | 59 | 105 | 101 | 122 | 97 | 103 | 93.25 | 99.500 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
975 | 108 | 139 | 130 | 97 | 138 | 88 | 104 | 87 | 118.50 | 111.375 |
976 | 99 | 122 | 93 | 107 | 98 | 62 | 102 | 115 | 105.25 | 99.750 |
977 | 99 | 127 | 91 | 101 | 127 | 79 | 81 | 121 | 104.50 | 103.250 |
978 | 120 | 108 | 101 | 104 | 90 | 90 | 91 | 104 | 108.25 | 101.000 |
979 | 101 | 93 | 106 | 113 | 115 | 82 | 96 | 97 | 103.25 | 100.375 |
980 | 118 | 86 | 74 | 95 | 109 | 111 | 90 | 83 | 93.25 | 95.750 |
981 | 118 | 95 | 121 | 124 | 111 | 90 | 105 | 112 | 114.50 | 109.500 |
982 | 110 | 121 | 85 | 117 | 91 | 84 | 84 | 108 | 108.25 | 100.000 |
983 | 95 | 109 | 118 | 112 | 121 | 105 | 84 | 115 | 108.50 | 107.375 |
984 | 102 | 105 | 127 | 104 | 95 | 101 | 106 | 103 | 109.50 | 105.375 |
985 | 116 | 93 | 112 | 102 | 67 | 92 | 103 | 114 | 105.75 | 99.875 |
986 | 106 | 97 | 114 | 82 | 82 | 108 | 113 | 81 | 99.75 | 97.875 |
987 | 107 | 93 | 78 | 91 | 83 | 81 | 115 | 102 | 92.25 | 93.750 |
988 | 106 | 115 | 105 | 74 | 86 | 124 | 97 | 116 | 100.00 | 102.875 |
989 | 117 | 84 | 131 | 102 | 92 | 118 | 90 | 90 | 108.50 | 103.000 |
990 | 100 | 69 | 108 | 128 | 111 | 110 | 94 | 95 | 101.25 | 101.875 |
991 | 86 | 85 | 123 | 94 | 104 | 89 | 76 | 97 | 97.00 | 94.250 |
992 | 94 | 90 | 72 | 121 | 105 | 150 | 72 | 88 | 94.25 | 99.000 |
993 | 70 | 109 | 104 | 114 | 93 | 103 | 126 | 99 | 99.25 | 102.250 |
994 | 102 | 110 | 98 | 93 | 64 | 131 | 91 | 95 | 100.75 | 98.000 |
995 | 80 | 135 | 120 | 92 | 118 | 119 | 66 | 117 | 106.75 | 105.875 |
996 | 81 | 102 | 88 | 98 | 113 | 81 | 95 | 110 | 92.25 | 96.000 |
997 | 85 | 146 | 73 | 133 | 111 | 88 | 92 | 74 | 109.25 | 100.250 |
998 | 94 | 109 | 110 | 115 | 95 | 93 | 90 | 103 | 107.00 | 101.125 |
999 | 84 | 84 | 97 | 125 | 92 | 89 | 95 | 124 | 97.50 | 98.750 |
1000 | 77 | 60 | 113 | 106 | 107 | 109 | 110 | 103 | 89.00 | 98.125 |
As you can see, the second-to-last column, titled Mean4, is the average of the first four columns X1, X2, X3, and X4. The last column, titled Mean8, is the average of all eight columns X1, X2, X3, X4, X5, X6, X7, and X8. Now, all we have to do is create a histogram of the sample means appearing in the Mean4 column:
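The same experiment can be reproduced without Minitab. Here is a sketch in Python (the seed and variable names are my own choices, not from the original); it generates 1000 rows of eight \(N(100, 256)\) draws, computes Mean4 and Mean8 for each row, and prints a crude text histogram of the Mean4 column:

```python
import random
import statistics

random.seed(414)  # arbitrary seed, chosen only for reproducibility

# 1000 samples of eight draws from N(100, 256), i.e. standard deviation 16.
rows = [[random.gauss(100, 16) for _ in range(8)] for _ in range(1000)]
mean4 = [sum(row[:4]) / 4 for row in rows]   # average of X1..X4
mean8 = [sum(row) / 8 for row in rows]       # average of X1..X8

# Theory predicts Mean4 ~ N(100, 64) and Mean8 ~ N(100, 32), so the
# empirical means should land near 100 and the variances near 64 and 32.
print(statistics.mean(mean4), statistics.variance(mean4))
print(statistics.mean(mean8), statistics.variance(mean8))

# A crude text histogram of the Mean4 column, using bins of width 4.
counts = {}
for m in mean4:
    b = int(m // 4) * 4
    counts[b] = counts.get(b, 0) + 1
for b in sorted(counts):
    print(f"{b:>3}-{b + 4:<3} {'*' * (counts[b] // 5)}")
```

The histogram comes out bell-shaped and centered near 100, and the Mean8 column is noticeably tighter than the Mean4 column, just as the corollary predicts.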