Estimation methods considered in section 2.2 give us a point estimate of a parameter, say $\mu$, and that is the best bet, given the data and the estimation method, of what $\mu$ might be. But it is always good policy to give the client an interval, rather than a point estimate, where with some degree of confidence, usually 95% confidence, we expect $\mu$ to lie. We have seen in Figure 2.5 that for a $N(0,1)$ random variable $z$, we have
$$\Pr[-z_{\alpha/2} < z < z_{\alpha/2}] = 1 - \alpha$$
and for $\alpha = 5\%$, this probability is 0.95, giving the required 95% confidence. In fact, $z_{\alpha/2} = 1.96$ and
$$\Pr[-1.96 < z < 1.96] = 0.95$$
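The critical value and the probability above can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative and not part of the text.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Upper alpha/2 critical value: the point with probability 1 - alpha/2 to its left.
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Probability mass of N(0,1) between -z_crit and +z_crit.
coverage = std_normal.cdf(z_crit) - std_normal.cdf(-z_crit)

print(f"z_(alpha/2) = {z_crit:.2f}")        # 1.96
print(f"Pr[-z < Z < z] = {coverage:.2f}")   # 0.95
```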
This says that if we draw 100 random numbers from a $N(0,1)$ density (using a normal random number generator), we expect 95 out of these 100 numbers to lie in the $[-1.96, 1.96]$ interval. Now, let us get back to the problem of estimating $\mu$ from a random sample $X_1,\ldots,X_n$ drawn from a $N(\mu,\sigma^2)$ distribution. We found out that $\hat{\mu}_{MLE} = \bar{X}$ and $\bar{X} \sim N(\mu, \sigma^2/n)$. Hence, $z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ is $N(0,1)$. The point estimate for $\mu$ is $\bar{X}$ observed from the sample, and the 95% confidence interval for $\mu$ is obtained by replacing $z$ by its value in the above probability statement:
$$\Pr\left[-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right] = 1 - \alpha$$
Assuming $\sigma$ is known for the moment, one can rewrite this probability statement after some simple algebraic manipulations as
$$\Pr[\bar{X} - z_{\alpha/2}(\sigma/\sqrt{n}) < \mu < \bar{X} + z_{\alpha/2}(\sigma/\sqrt{n})] = 1 - \alpha$$
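As a concrete illustration of the known-variance interval, the sketch below computes $\bar{X} \pm z_{\alpha/2}(\sigma/\sqrt{n})$. The function name `z_interval` and the input numbers are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, alpha=0.05):
    """Return (lower, upper) for the mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_crit * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical inputs: sample mean 10, known sigma 2, sample size 25.
lower, upper = z_interval(xbar=10.0, sigma=2.0, n=25)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")  # (9.216, 10.784)
```

The half-width shrinks at rate $1/\sqrt{n}$, so quadrupling the sample size halves the interval.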
Note that this probability statement has random variables on both ends, and the probability that these random variables sandwich the unknown parameter $\mu$ is $1 - \alpha$. With the same confidence with which we expect 95 out of 100 random $N(0,1)$ draws to fall in the $(-1.96, 1.96)$ range, we are confident that if we drew 100 samples and computed 100 $\bar{X}$'s and 100 intervals $(\bar{X} \pm 1.96\,\sigma/\sqrt{n})$, $\mu$ would lie in these intervals in about 95 out of 100 cases.
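The repeated-sampling interpretation can be illustrated by simulation. The sketch below draws many samples from a normal population with a known mean, builds the known-$\sigma$ interval for each, and records how often the interval contains the true mean. The function name `coverage_rate`, the population values, the seed, and the number of replications are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, replications, alpha=0.05, seed=0):
    """Share of known-sigma intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_crit * sigma / sqrt(n)
    hits = 0
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The interval endpoints are random; mu is fixed.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / replications


rate = coverage_rate(mu=2.0, sigma=1.0, n=30, replications=10_000)
print(f"share of intervals containing mu: {rate:.3f}")
```

With many replications the reported share should settle close to 0.95, although any single run will differ slightly because of sampling noise.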
If $\sigma$ is not known, and is replaced by $s$, then problem 12 shows that this is equivalent to dividing a $N(0,1)$ random variable by the square root of an independent $\chi^2_{n-1}$ random variable divided by its degrees of freedom, leading to a t-distribution with $(n-1)$ degrees of freedom. Hence, using the t-tables for $(n-1)$ degrees of freedom
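$$\Pr[-t_{\alpha/2;n-1} < t_{n-1} < t_{\alpha/2;n-1}] = 1 - \alpha$$
and replacing $t_{n-1}$ by $(\bar{X} - \mu)/(s/\sqrt{n})$ one gets
$$\Pr[\bar{X} - t_{\alpha/2;n-1}(s/\sqrt{n}) < \mu < \bar{X} + t_{\alpha/2;n-1}(s/\sqrt{n})] = 1 - \alpha$$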
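The t-based interval can be sketched in the same way. Python's standard library has no t quantile function, so the critical value is passed in from a t-table; here the tabulated value 2.093 for 19 degrees of freedom and $\alpha = 0.05$ is used. The function name `t_interval` and the 20 observations are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_crit):
    """Return (lower, upper) for the mean when sigma is estimated by s."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # sample standard deviation, divisor n - 1
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical sample of n = 20 observations.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 4.9, 5.2, 4.7,
          5.1, 5.5, 4.6, 5.0, 4.3, 5.8, 4.9, 5.2, 5.4, 4.5]

# 2.093 is the table value of t_(0.025; 19).
lower, upper = t_interval(sample, t_crit=2.093)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```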
Note that the degrees of freedom $(n-1)$ for the t-distribution come from $s$, and the corresponding critical value $t_{\alpha/2;n-1}$ therefore depends on the sample size, unlike the corresponding case for the normal density where $z_{\alpha/2}$ does not depend on $n$. For small $n$, the $t_{\alpha/2}$ values differ drastically from $z_{\alpha/2}$, emphasizing the importance of using the t-density in small samples. When $n$ is large the difference between $z_{\alpha/2}$ and $t_{\alpha/2}$ diminishes as the t-density becomes more like a normal density. For $n = 20$ and $\alpha = 0.05$, $t_{\alpha/2;n-1} = 2.093$ as compared with $z_{\alpha/2} = 1.96$. Therefore,
$$\Pr[-2.093 < t_{n-1} < 2.093] = 0.95$$
and $\mu$ lies in $\bar{X} \pm 2.093(s/\sqrt{n})$ with 95% confidence.
Table 2.1 Descriptive Statistics for the Earnings Data (Sample: 1 595)
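The gap between the t and normal critical values can be tabulated numerically. The sketch below uses only the Python standard library: it integrates the Student t density with Simpson's rule and inverts the resulting distribution function by bisection. The helper names `t_pdf`, `t_cdf` and `t_critical`, the step count, and the bracketing interval $[0, 50]$ are assumptions made for illustration; in practice a statistics library's t quantile function would normally be used instead.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Student t density with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * (log_df_pi(df))
    return exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def log_df_pi(df):
    """log(df * pi), kept separate so t_pdf stays readable."""
    from math import log
    return log(df * pi)


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Upper alpha/2 critical value, found by bisection on [0, 50]."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z_crit = NormalDist().inv_cdf(0.975)
for n in (5, 20, 100, 1000):
    print(f"n = {n:4d}: t = {t_critical(0.05, n - 1):.3f}, z = {z_crit:.3f}")
```

For $n = 20$ the computed value should agree with the tabulated 2.093, and the t values should approach 1.96 from above as $n$ grows.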
More examples of confidence intervals can be constructed, but the idea should be clear. Note that these confidence intervals are the other side of the coin for tests of hypotheses. For example, in testing $H_0: \mu = 2$ versus $H_1: \mu \neq 2$ for a known $\sigma$, we discovered that the Likelihood Ratio test is based on the same probability statement that generated the confidence interval for $\mu$. In classical tests of hypothesis, we choose the level of significance $\alpha = 5\%$ and compute $z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$. This can be done since $\sigma$ is known and $\mu = 2$ under the null hypothesis $H_0$. Next, we do not reject $H_0$ if $z$ lies in the $(-z_{\alpha/2}, z_{\alpha/2})$ interval and reject $H_0$ otherwise. For confidence intervals, on the other hand, we do not know $\mu$, and armed with a level of confidence $100(1 - \alpha)\%$ we construct the interval that should contain $\mu$ with that level of confidence. Having done that, if $\mu = 2$ lies in that 95% confidence interval, then we cannot reject $H_0: \mu = 2$ at the 5% level. Otherwise, we reject $H_0$. This highlights the fact that any value of $\mu$ that lies in this 95% confidence interval (assuming it was our null hypothesis) cannot be rejected at the 5% level by this sample. This is why we do not say "accept $H_0$", but rather we say "do not reject $H_0$".
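The equivalence between the two-sided test and the confidence interval can be checked directly. The sketch below, for the known-$\sigma$ case, computes both the test decision and the interval and confirms that $H_0$ is rejected exactly when the hypothesized value falls outside the interval. The function name `z_test_and_interval` and the numerical inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_and_interval(xbar, sigma, n, mu0, alpha=0.05):
    """Return (reject_h0, mu0_in_interval, interval) for H0: mu = mu0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    z_stat = (xbar - mu0) / se
    # Do not reject when z lies inside (-z_crit, z_crit); reject otherwise.
    reject = not (-z_crit < z_stat < z_crit)
    interval = (xbar - z_crit * se, xbar + z_crit * se)
    inside = interval[0] < mu0 < interval[1]
    return reject, inside, interval


# Hypothetical sample means, with sigma = 1, n = 25 and H0: mu = 2.
for xbar in (2.3, 2.5):
    reject, inside, (lo, hi) = z_test_and_interval(xbar, 1.0, 25, mu0=2.0)
    print(f"xbar = {xbar}: interval = ({lo:.3f}, {hi:.3f}), "
          f"mu0 inside = {inside}, reject H0 = {reject}")
```

In the first case $z = 1.5$, the interval contains 2, and $H_0$ is not rejected; in the second $z = 2.5$, the interval excludes 2, and $H_0$ is rejected.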