Confidence Intervals

Because estimators are approximations of unknown parameters, the question arises of how close they are to those parameters. I will answer this question for the sample mean and the sample variance in the case of a random sample $X_1, X_2, \ldots, X_n$ from the $N(\mu, \sigma^2)$ distribution.

It is almost trivial that $\bar{X} \sim N(\mu, \sigma^2/n)$; hence,

$$\sqrt{n}(\bar{X} - \mu)/\sigma \sim N(0, 1). \tag{5.19}$$

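Therefore, for given $\alpha \in (0, 1)$ there exists a $\beta > 0$ such that

$$P\left[|\bar{X} - \mu| \le \beta\sigma/\sqrt{n}\right] = P\left[\left|\sqrt{n}(\bar{X} - \mu)/\sigma\right| \le \beta\right] = \int_{-\beta}^{\beta} \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\,du = 1 - \alpha. \tag{5.20}$$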

For example, if we choose $\alpha = 0.05$, then $\beta = 1.96$ (see Appendix IV, Table IV.3), and thus in this case

$$P\left[\bar{X} - 1.96\sigma/\sqrt{n} \le \mu \le \bar{X} + 1.96\sigma/\sqrt{n}\right] = 0.95.$$

The interval $[\bar{X} - 1.96\sigma/\sqrt{n},\ \bar{X} + 1.96\sigma/\sqrt{n}]$ is called the 95% confidence interval of $\mu$. If $\sigma$ is known, then this interval can be computed and will tell us how close $\bar{X}$ and $\mu$ are with a margin of error of 5%. But in general $\sigma$ is not known, so how do we proceed then?
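For the known-variance case, the computation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `normal_ci_known_sigma`, the data values, and the value of `sigma` are all hypothetical, and the critical value is taken from the standard library's `NormalDist` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist, fmean


def normal_ci_known_sigma(sample, sigma, alpha=0.05):
    """(1 - alpha) x 100% confidence interval for the mean when sigma is known."""
    n = len(sample)
    # beta solves P[|Z| <= beta] = 1 - alpha for Z ~ N(0, 1); about 1.96 for alpha = 0.05
    beta = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = beta * sigma / sqrt(n)
    xbar = fmean(sample)
    return xbar - half_width, xbar + half_width


# Hypothetical data and a hypothetical known standard deviation
sample = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2]
lower, upper = normal_ci_known_sigma(sample, sigma=0.3)
print(lower, upper)
```

Note that the function needs `sigma` as an input, which is exactly the quantity that is usually unavailable.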

To solve this problem, we need the following corollary of Theorem 5.7:

Theorem 5.14: Let $X_1, X_2, \ldots, X_n$ be a random sample from the $N(\mu, \sigma^2)$ distribution. Then the sample mean $\bar{X}$ and the sample variance $S^2$ are independent.

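Proof: Observe, for instance, that $X_* = ((X_1 - \mu)/\sigma, (X_2 - \mu)/\sigma, \ldots, (X_n - \mu)/\sigma)^{T} \sim N_n(0, I_n)$, $\bar{X} = \mu + (\sigma/n, \ldots, \sigma/n)X_* = b + BX_*$, and

$$\begin{pmatrix} (X_1 - \bar{X})/\sigma \\ \vdots \\ (X_n - \bar{X})/\sigma \end{pmatrix} = \left(I_n - \frac{1}{n}\begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}(1, \ldots, 1)\right)X_* = CX_*.$$

The latter expression implies that $(n - 1)S^2/\sigma^2 = X_*^{T}C^{T}CX_* = X_*^{T}C^2X_* = X_*^{T}CX_*$ because $C$ is symmetric and idempotent with $\operatorname{rank}(C) = \operatorname{trace}(C) = n - 1$. Therefore, by Theorem 5.7, the sample mean and the sample variance are independent if $BC = 0$, which in the present case is equivalent to the condition $CB^{T} = 0$. The latter is easily verified as follows:

$$CB^{T} = \frac{\sigma}{n}\left(I_n - \frac{1}{n}\begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}(1, \ldots, 1)\right)\begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \frac{\sigma}{n}\left(\begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} - \frac{1}{n}\begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} n\right) = 0.$$

Q.E.D.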
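The matrix facts used in the proof can be checked numerically for a small case. The sketch below builds $C = I_n - \frac{1}{n}\iota\iota^{T}$ (with $\iota$ a column of ones) and the row vector $B = (\sigma/n, \ldots, \sigma/n)$ in plain Python; the values `n = 5` and `sigma = 2.0` are arbitrary choices for illustration, and a numerical check for one $n$ is of course not a substitute for the algebra.

```python
n = 5
sigma = 2.0  # arbitrary illustrative value

# C = I_n - (1/n) * ones * ones^T, stored as a list of rows
C = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
# B = (sigma/n, ..., sigma/n), a 1 x n row vector stored as a flat list
B = [sigma / n] * n

# C squared, to check idempotency (C is symmetric by construction)
C2 = [[sum(C[i][k] * C[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
max_idempotent_error = max(abs(C2[i][j] - C[i][j]) for i in range(n) for j in range(n))

# trace(C) should equal n - 1
trace_C = sum(C[i][i] for i in range(n))

# C B^T should be the zero vector
CBt = [sum(C[i][k] * B[k] for k in range(n)) for i in range(n)]
max_CBt = max(abs(v) for v in CBt)

print(max_idempotent_error, trace_C, max_CBt)
```

Up to floating-point rounding, the first and last printed values are zero and the trace equals $n - 1$.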

It follows now from (5.19), Theorems 5.13 and 5.14, and the definition of the Student’s t distribution that

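Theorem 5.15: Under the conditions of Theorem 5.14, $\sqrt{n}(\bar{X} - \mu)/S \sim t_{n-1}$.

Recall from Chapter 4 that the $t_{n-1}$ distribution has density

$$h_{n-1}(x) = \frac{\Gamma(n/2)}{\sqrt{(n-1)\pi}\,\Gamma((n-1)/2)\,(1 + x^2/(n-1))^{n/2}},$$

where $\Gamma(y) = \int_0^{\infty} x^{y-1}\exp(-x)\,dx$, $y > 0$. Thus, as in (5.20), for each $\alpha \in (0, 1)$ and sample size $n$ there exists a $\beta_n > 0$ such that

$$P\left[|\bar{X} - \mu| \le \beta_n S/\sqrt{n}\right] = \int_{-\beta_n}^{\beta_n} h_{n-1}(u)\,du = 1 - \alpha; \tag{5.22}$$

hence, $[\bar{X} - \beta_n S/\sqrt{n},\ \bar{X} + \beta_n S/\sqrt{n}]$ is now the $(1 - \alpha) \times 100\%$ confidence interval of $\mu$.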
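To make the definition of $\beta_n$ concrete, the following sketch finds it by integrating the $t_{n-1}$ density numerically and solving for the bound by bisection, then forms the interval from a sample. Everything here is an illustrative assumption rather than a recommended method: the function names, the Simpson step count, the bisection depth, and the data are hypothetical, and in practice one would read $\beta_n$ from a table or a statistics library.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import fmean, stdev


def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def coverage(beta, df, steps=2000):
    """Integral of the t density over [-beta, beta] by Simpson's rule."""
    h = beta / steps
    total = t_density(0.0, df) + t_density(beta, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 2 * total * h / 3  # factor 2: the density is symmetric around 0


def t_beta(n, alpha=0.05):
    """Solve coverage(beta, n - 1) = 1 - alpha for beta by bisection."""
    df = n - 1
    lo, hi = 0.0, 1.0
    while coverage(hi, df) < 1 - alpha:  # widen the bracket until it contains beta
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if coverage(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(sample, alpha=0.05):
    """(1 - alpha) x 100% confidence interval for the mean, sigma unknown."""
    n = len(sample)
    # stdev uses the n - 1 divisor, matching the sample variance S^2
    half_width = t_beta(n, alpha) * stdev(sample) / sqrt(n)
    xbar = fmean(sample)
    return xbar - half_width, xbar + half_width


sample = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2]  # hypothetical data
lower, upper = t_confidence_interval(sample)
print(t_beta(len(sample)), lower, upper)
```

For small $n$ the resulting $\beta_n$ is noticeably larger than the normal value 1.96, which reflects the extra uncertainty from estimating $\sigma$ by $S$.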

Similarly, on the basis of Theorem 5.13 we can construct confidence intervals of $\sigma^2$. Recall from Chapter 4 that the $\chi^2_{n-1}$ distribution has density

$$g_{n-1}(x) = \frac{x^{(n-1)/2 - 1}\exp(-x/2)}{\Gamma((n-1)/2)\,2^{(n-1)/2}}.$$

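For a given $\alpha \in (0, 1)$ and sample size $n$ we can choose $\beta_{1,n} < \beta_{2,n}$ such that

$$P\left[(n-1)S^2/\beta_{2,n} \le \sigma^2 \le (n-1)S^2/\beta_{1,n}\right] = P\left[\beta_{1,n} \le (n-1)S^2/\sigma^2 \le \beta_{2,n}\right] = \int_{\beta_{1,n}}^{\beta_{2,n}} g_{n-1}(u)\,du = 1 - \alpha. \tag{5.23}$$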

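There are different ways to choose $\beta_{1,n}$ and $\beta_{2,n}$ such that the last equality in (5.23) holds. Clearly, the optimal choice is such that $\beta_{1,n}^{-1} - \beta_{2,n}^{-1}$ is minimal because it will yield the smallest confidence interval, but that is computationally complicated. Therefore, in practice $\beta_{1,n}$ and $\beta_{2,n}$ are often chosen such that

$$\int_0^{\beta_{1,n}} g_{n-1}(u)\,du = \alpha/2, \qquad \int_{\beta_{2,n}}^{\infty} g_{n-1}(u)\,du = \alpha/2.$$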
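A confidence interval for the variance can be sketched along the same lines. The code below assumes the common equal-tailed convention, which puts probability $\alpha/2$ in each tail of the $\chi^2_{n-1}$ distribution. It evaluates the integral of the density through the standard power series for the regularized incomplete gamma function and inverts it by bisection. The function names, tolerances, and data are hypothetical, and the series is only suitable for moderate arguments.

```python
from math import exp, lgamma, log
from statistics import variance


def chi2_cdf(x, df):
    """Integral of the chi-square(df) density over [0, x], via a power series."""
    if x <= 0:
        return 0.0
    a, z = df / 2, x / 2
    # Series: P(a, z) = z^a e^(-z) * sum_{k>=0} z^k / Gamma(a + k + 1)
    term = exp(a * log(z) - z - lgamma(a + 1))  # k = 0 term
    total = term
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total


def chi2_quantile(p, df):
    """Solve chi2_cdf(x, df) = p for x by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:  # widen the bracket until it contains the solution
        hi *= 2
    for _ in range(80):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def variance_confidence_interval(sample, alpha=0.05):
    """Equal-tailed (1 - alpha) x 100% confidence interval for sigma^2."""
    n = len(sample)
    s2 = variance(sample)  # sample variance with the n - 1 divisor
    beta1 = chi2_quantile(alpha / 2, n - 1)      # lower-tail bound
    beta2 = chi2_quantile(1 - alpha / 2, n - 1)  # upper-tail bound
    return (n - 1) * s2 / beta2, (n - 1) * s2 / beta1


sample = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2]  # hypothetical data
lower_var, upper_var = variance_confidence_interval(sample)
print(lower_var, upper_var)
```

Because the $\chi^2_{n-1}$ density is not symmetric, the resulting interval is not centered at $S^2$; the upper end lies farther from $S^2$ than the lower end.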

Appendix IV contains tables from which you can look up the values of the $\beta$'s in (5.20) and (5.22) for $\alpha \times 100\% = 5\%$ and $\alpha \times 100\% = 10\%$. These percentages are called significance levels (see the next section).