# Confidence Intervals

Estimation methods considered in section 2.2 give us a point estimate of a parameter, say $\mu$, and that is the best bet, given the data and the estimation method, of what $\mu$ might be. But it is always good policy to give the client an interval, rather than a point estimate, where with some degree of confidence, usually 95% confidence, we expect $\mu$ to lie. We have seen in Figure 2.5 that for a $N(0,1)$ random variable $z$, we have

$$\Pr[-z_{\alpha/2} < z < z_{\alpha/2}] = 1 - \alpha$$

and for $\alpha = 5\%$, this probability is 0.95, giving the required 95% confidence. In fact, $z_{\alpha/2} = 1.96$ and

$$\Pr[-1.96 < z < 1.96] = 0.95$$
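This probability statement is easy to check by simulation. A minimal sketch (the sample size of 100,000 draws and the seed are illustrative choices, not from the text):

```python
import numpy as np

# Draw standard normal numbers and check that about 95% of them
# fall inside the interval [-1.96, 1.96].
rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)
coverage = np.mean((z > -1.96) & (z < 1.96))
print(round(coverage, 3))
```

With this many draws, the observed proportion should be very close to 0.95.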

This says that if we draw 100 random numbers from a $N(0,1)$ density (using a normal random number generator), we expect 95 out of these 100 numbers to lie in the $[-1.96, 1.96]$ interval. Now, let us get back to the problem of estimating $\mu$ from a random sample $x_1, \ldots, x_n$ drawn from a $N(\mu, \sigma^2)$ distribution. We found out that $\hat{\mu}_{MLE} = \bar{X}$ and $\bar{X} \sim N(\mu, \sigma^2/n)$. Hence, $z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ is $N(0,1)$. The point estimate for $\mu$ is $\bar{X}$ observed from the sample, and the 95% confidence interval for $\mu$ is obtained by replacing $z$ by its value in the above probability statement:

$$\Pr\left[-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right] = 1 - \alpha$$

Assuming $\sigma$ is known for the moment, one can rewrite this probability statement after some simple algebraic manipulations as

$$\Pr[\bar{X} - z_{\alpha/2}(\sigma/\sqrt{n}) < \mu < \bar{X} + z_{\alpha/2}(\sigma/\sqrt{n})] = 1 - \alpha$$

Note that this probability statement has random variables on both ends, and the probability that these random variables sandwich the unknown parameter $\mu$ is $1 - \alpha$. With the same confidence that we have of drawing 100 random $N(0,1)$ numbers and finding 95 of them falling in the $(-1.96, 1.96)$ range, we are confident that if we drew 100 samples, computed 100 $\bar{X}$'s and 100 intervals $\bar{X} \pm 1.96(\sigma/\sqrt{n})$, then $\mu$ would lie in these intervals in 95 out of 100 times.
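The repeated-sampling interpretation above can be sketched as a small Monte Carlo experiment. The values of $\mu$, $\sigma$, $n$, and the number of replications below are illustrative assumptions, not from the text:

```python
import numpy as np

# Repeat the thought experiment: draw many samples from N(mu, sigma^2)
# with sigma known, form X_bar +/- 1.96*sigma/sqrt(n), and count how
# often the interval covers the true mu.
rng = np.random.default_rng(42)
mu, sigma, n, reps = 2.0, 3.0, 25, 10_000
covered = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)
    xbar = x.mean()
    half = 1.96 * sigma / np.sqrt(n)   # half-width of the 95% interval
    if xbar - half < mu < xbar + half:
        covered += 1
print(covered / reps)
```

The empirical coverage rate should settle near 0.95, matching the nominal confidence level.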

If $\sigma$ is not known and is replaced by $s$, then problem 12 shows that this is equivalent to dividing a $N(0,1)$ random variable by the square root of an independent $\chi^2_{n-1}$ random variable divided by its degrees of freedom, leading to a $t$-distribution with $(n-1)$ degrees of freedom. Hence, using the $t$-tables for $(n-1)$ degrees of freedom,
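This construction can be checked by simulation: dividing standard normal draws by the square root of independent $\chi^2_{n-1}/(n-1)$ draws should reproduce the $t_{n-1}$ distribution. A sketch, with $n = 10$ and the simulation size chosen for illustration:

```python
import numpy as np
from scipy import stats

# Build t-distributed draws from their definition and compare the
# simulated 97.5% quantile with the exact t critical value.
rng = np.random.default_rng(3)
n, reps = 10, 200_000
z = rng.standard_normal(reps)
chi2 = rng.chisquare(df=n - 1, size=reps)
t_sim = z / np.sqrt(chi2 / (n - 1))       # should be t with n-1 df
q_sim = np.quantile(t_sim, 0.975)
q_exact = stats.t.ppf(0.975, df=n - 1)
print(round(q_sim, 2), round(q_exact, 2))
```

The simulated and exact quantiles should agree closely, confirming the distributional result.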

$$\Pr[-t_{\alpha/2;n-1} < t_{n-1} < t_{\alpha/2;n-1}] = 1 - \alpha$$

and replacing $t_{n-1}$ by $(\bar{X} - \mu)/(s/\sqrt{n})$ one gets

$$\Pr[\bar{X} - t_{\alpha/2;n-1}(s/\sqrt{n}) < \mu < \bar{X} + t_{\alpha/2;n-1}(s/\sqrt{n})] = 1 - \alpha$$

Table 2.1 Descriptive Statistics for the Earnings Data

Sample: 1 595

|              | LWAGE   | WKS     | ED      | EX      | MS      | FEM    | BLK     | UNION  |
|--------------|---------|---------|---------|---------|---------|--------|---------|--------|
| Mean         | 6.9507  | 46.4520 | 12.8450 | 22.8540 | 0.8050  | 0.1126 | 0.0723  | 0.3664 |
| Median       | 6.9847  | 48.0000 | 12.0000 | 21.0000 | 1.0000  | 0.0000 | 0.0000  | 0.0000 |
| Maximum      | 8.5370  | 52.0000 | 17.0000 | 51.0000 | 1.0000  | 1.0000 | 1.0000  | 1.0000 |
| Minimum      | 5.6768  | 5.0000  | 4.0000  | 7.0000  | 0.0000  | 0.0000 | 0.0000  | 0.0000 |
| Std. Dev.    | 0.4384  | 5.1850  | 2.7900  | 10.7900 | 0.3965  | 0.3164 | 0.2592  | 0.4822 |
| Skewness     | -0.1140 | -2.7309 | -0.2581 | 0.4208  | -1.5400 | 2.4510 | 3.3038  | 0.5546 |
| Kurtosis     | 3.3937  | 13.7780 | 2.7127  | 2.0086  | 3.3715  | 7.0075 | 11.9150 | 1.3076 |
| Jarque-Bera  | 5.13    | 3619.40 | 8.65    | 41.93   | 238.59  | 993.90 | 3052.80 | 101.51 |
| Probability  | 0.0769  | 0.0000  | 0.0132  | 0.0000  | 0.0000  | 0.0000 | 0.0000  | 0.0000 |
| Observations | 595     | 595     | 595     | 595     | 595     | 595    | 595     | 595    |

Note that the degrees of freedom $(n-1)$ for the $t$-distribution come from $s$, and the corresponding critical value $t_{\alpha/2;n-1}$ is therefore sample specific, unlike the corresponding case for the normal density where $z_{\alpha/2}$ does not depend on $n$. For small $n$, the $t_{\alpha/2}$ values differ drastically from $z_{\alpha/2}$, emphasizing the importance of using the $t$-density in small samples. When $n$ is large the difference between $z_{\alpha/2}$ and $t_{\alpha/2}$ diminishes as the $t$-density becomes more like a normal density. For $n = 20$ and $\alpha = 0.05$, $t_{\alpha/2;n-1} = 2.093$ as compared with $z_{\alpha/2} = 1.96$. Therefore,

$$\Pr[-2.093 < t_{n-1} < 2.093] = 0.95$$

and $\mu$ lies in $\bar{X} \pm 2.093(s/\sqrt{n})$ with 95% confidence.
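Computing this $t$-based interval is straightforward. A sketch in which only $n = 20$ and the critical value $t_{0.025;19} = 2.093$ come from the text; the simulated sample and its mean and standard deviation are illustrative assumptions:

```python
import numpy as np
from scipy import stats

# A 95% confidence interval for mu when sigma is unknown,
# using the sample standard deviation s and the t critical value.
rng = np.random.default_rng(7)
n = 20
x = rng.normal(loc=10.0, scale=2.0, size=n)   # illustrative sample
xbar, s = x.mean(), x.std(ddof=1)             # ddof=1 gives the sample s
t_crit = stats.t.ppf(0.975, df=n - 1)         # two-sided 5% critical value
lower = xbar - t_crit * s / np.sqrt(n)
upper = xbar + t_crit * s / np.sqrt(n)
print(round(t_crit, 3), round(lower, 2), round(upper, 2))
```

`stats.t.ppf(0.975, df=19)` returns the 2.093 cited in the text, and the interval is $\bar{X} \pm 2.093(s/\sqrt{n})$ as derived above.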

More examples of confidence intervals can be constructed, but the idea should be clear. Note that these confidence intervals are the other side of the coin for tests of hypotheses. For example, in testing $H_0: \mu = 2$ versus $H_1: \mu \neq 2$ for a known $\sigma$, we discovered that the Likelihood Ratio test is based on the same probability statement that generated the confidence interval for $\mu$. In classical tests of hypotheses, we choose the significance level $\alpha = 5\%$ and compute $z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$. This can be done since $\sigma$ is known and $\mu = 2$ under the null hypothesis $H_0$. Next, we do not reject $H_0$ if $z$ lies in the $(-z_{\alpha/2}, z_{\alpha/2})$ interval and reject $H_0$ otherwise. For confidence intervals, on the other hand, we do not know $\mu$, and armed with a level of confidence $100(1 - \alpha)\%$ we construct the interval that should contain $\mu$ with that level of confidence. Having done that, if $\mu = 2$ lies in that 95% confidence interval, then we cannot reject $H_0: \mu = 2$ at the 5% level. Otherwise, we reject $H_0$. This highlights the fact that any value of $\mu$ that lies in this 95% confidence interval (assuming it was our null hypothesis) cannot be rejected at the 5% level by this sample. This is why we do not say "accept $H_0$", but rather we say "do not reject $H_0$".
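This duality can be verified numerically: with $\sigma$ known, "the hypothesized value lies inside the 95% interval" is the same event as "$|z| < 1.96$". A sketch in which $\mu_0 = 2$ follows the text's example, while $\sigma$, $n$, and the simulated sample are assumptions for illustration:

```python
import numpy as np

# Duality of the two-sided z-test and the 95% confidence interval:
# the test fails to reject H0: mu = mu0 exactly when mu0 lies
# inside X_bar +/- 1.96*sigma/sqrt(n).
rng = np.random.default_rng(1)
mu0, sigma, n = 2.0, 1.0, 50
x = rng.normal(2.0, sigma, size=n)
xbar = x.mean()
z = (xbar - mu0) / (sigma / np.sqrt(n))
half = 1.96 * sigma / np.sqrt(n)
in_interval = xbar - half < mu0 < xbar + half
reject = abs(z) >= 1.96
print(in_interval, reject)
```

The two decisions always agree: `in_interval` is `True` exactly when `reject` is `False`, since both reduce to the same inequality $|\bar{X} - \mu_0| < 1.96\,\sigma/\sqrt{n}$.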