# APPENDIX: DISTRIBUTION THEORY

DEFINITION 1 (Chi-square Distribution) Let $\{Z_i\}$, $i = 1, 2, \ldots, n$, be i.i.d. as $N(0, 1)$. Then the distribution of $\sum_{i=1}^n Z_i^2$ is called the chi-square distribution with $n$ degrees of freedom and is denoted by $\chi^2_n$.

THEOREM 1 If $X \sim \chi^2_n$ and $Y \sim \chi^2_m$ and if $X$ and $Y$ are independent, then $X + Y \sim \chi^2_{n+m}$.

THEOREM 2 If $X \sim \chi^2_n$, then $EX = n$ and $VX = 2n$.

THEOREM 3 Let $\{X_i\}$ be i.i.d. as $N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. Define $\bar{X}_n = n^{-1}\sum_{i=1}^n X_i$. Then

$$\frac{\sum_{i=1}^n (X_i - \bar{X}_n)^2}{\sigma^2} \sim \chi^2_{n-1}.$$
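Definition 1 and Theorem 2 can be checked by a quick Monte Carlo sketch using NumPy (the degrees of freedom, replication count, and seed below are arbitrary illustrative choices):

```python
import numpy as np

# Sum of n squared i.i.d. N(0,1) draws is chi-square with n degrees of
# freedom (Definition 1); by Theorem 2 its mean is n and variance is 2n.
rng = np.random.default_rng(0)
n, reps = 5, 200_000
chi2_draws = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)

print(chi2_draws.mean())  # close to n = 5
print(chi2_draws.var())   # close to 2n = 10
```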

Proof. Define $Z_i = (X_i - \mu)/\sigma$. Then $Z_i \sim N(0, 1)$ and

$$(1)\quad \frac{\sum_{i=1}^n (X_i - \bar{X}_n)^2}{\sigma^2} = \sum_{i=1}^n (Z_i - \bar{Z}_n)^2.$$

We prove the theorem by induction on $n$. First, for $n = 2$,

$$(2)\quad \sum_{i=1}^2 (Z_i - \bar{Z}_2)^2 = \left[\frac{Z_1 - Z_2}{\sqrt{2}}\right]^2.$$

But since $(Z_1 - Z_2)/\sqrt{2} \sim N(0, 1)$, the right-hand side of (2) is $\chi^2_1$ by Definition 1. Therefore, the theorem is true for $n = 2$. Second, assume it is true for $n$ and consider $n + 1$. We have

$$(3)\quad \sum_{i=1}^{n+1} (Z_i - \bar{Z}_{n+1})^2 = \sum_{i=1}^{n} (Z_i - \bar{Z}_n)^2 + \frac{n}{n+1}\,(Z_{n+1} - \bar{Z}_n)^2.$$

But the first and second terms of the right-hand side above are independent because

$$(4)\quad E(Z_i - \bar{Z}_n)(Z_{n+1} - \bar{Z}_n) = 0.$$

Moreover, we can easily verify

$$(5)\quad \sqrt{\frac{n}{n+1}}\,(Z_{n+1} - \bar{Z}_n) \sim N(0, 1),$$

which implies by Definition 1 that the square of (5) is $\chi^2_1$. Therefore, by Theorem 1, the left-hand side of (3) is $\chi^2_n$. ∎
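Theorem 3 lends itself to the same kind of Monte Carlo check ($\mu$, $\sigma$, $n$, and the replication count are arbitrary illustrative choices):

```python
import numpy as np

# By Theorem 3, sum (X_i - Xbar)^2 / sigma^2 is chi-square with n-1
# degrees of freedom; by Theorem 2 its mean is n-1 and variance 2(n-1).
rng = np.random.default_rng(1)
mu, sigma, n, reps = 2.0, 3.0, 4, 200_000
x = rng.normal(mu, sigma, size=(reps, n))
stat = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) / sigma**2

print(stat.mean())  # close to n - 1 = 3
print(stat.var())   # close to 2(n - 1) = 6
```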

THEOREM 4 Let $\{X_i\}$ and $\bar{X}_n$ be as defined in Theorem 3. Then $\bar{X}_n$ and $\sum_{i=1}^n (X_i - \bar{X}_n)^2$ are independent.

DEFINITION 2 (Student's t Distribution) Let $Y$ be $N(0, 1)$ and independent of a chi-square variable $\chi^2_n$. Then the distribution of $Y\big/\sqrt{\chi^2_n/n}$ is called the Student's t distribution with $n$ degrees of freedom. We shall denote this distribution by $t_n$.
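Definition 2 gives a direct recipe for generating $t_n$ variables, which we can sketch with NumPy; a $t_n$ variable has mean 0 and variance $n/(n-2)$ for $n > 2$ (the choices of $n$, replication count, and seed are illustrative):

```python
import numpy as np

# Y ~ N(0,1) independent of W ~ chi^2_n; then Y / sqrt(W/n) is t_n.
rng = np.random.default_rng(2)
n, reps = 10, 400_000
y = rng.standard_normal(reps)
w = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)  # chi^2_n draws
t = y / np.sqrt(w / n)

print(t.mean())  # close to 0
print(t.var())   # close to n/(n-2) = 1.25
```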

THEOREM 5 Let $\{X_i\}$ be i.i.d. as $N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. Define $\bar{X} = n^{-1}\sum_{i=1}^n X_i$ and $S^2 = n^{-1}\sum_{i=1}^n (X_i - \bar{X})^2$. Then

$$\frac{(\bar{X} - \mu)\sqrt{n-1}}{S} \sim t_{n-1}.$$

Proof. Since $\bar{X} \sim N(\mu, n^{-1}\sigma^2)$, we have

$$(6)\quad Y = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma} \sim N(0, 1).$$

Also, we have

$$(7)\quad \mathrm{Cov}[(X_i - \bar{X}), \bar{X}] = E(X_i - \bar{X})\bar{X} = EX_i\bar{X} - E\bar{X}^2 = 0.$$

Since $X_i - \bar{X}$ and $\bar{X}$ are jointly normal, (7) implies that these two terms are independent by virtue of Theorem 5.3.4. Therefore, by Theorem 3.5.1,

(8) $S^2$ and $Y$ are independent.

But we have by Theorem 3

$$(9)\quad \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma^2} = \frac{nS^2}{\sigma^2} \sim \chi^2_{n-1}.$$

Therefore, the theorem follows from (6), (8), and (9) because of Definition 2; indeed, $(\bar{X} - \mu)\sqrt{n-1}/S = Y\big/\sqrt{(nS^2/\sigma^2)/(n-1)}$. □
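Theorem 5 can be illustrated numerically; note that $S^2$ below uses the divisor-$n$ definition from the theorem, and that a $t_{n-1}$ variable has variance $(n-1)/(n-3)$ for $n > 4$ ($\mu$, $\sigma$, $n$ are arbitrary illustrative choices):

```python
import numpy as np

# Monte Carlo sketch of Theorem 5: (Xbar - mu) * sqrt(n-1) / S is t_{n-1}.
rng = np.random.default_rng(3)
mu, sigma, n, reps = 1.0, 2.0, 8, 400_000
x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = np.sqrt(((x - xbar[:, None]) ** 2).mean(axis=1))  # divisor-n S
t = (xbar - mu) * np.sqrt(n - 1) / s

print(t.mean())  # close to 0
print(t.var())   # close to (n-1)/(n-3) = 1.4
```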

THEOREM 6 Let $\{X_i\}$ be i.i.d. as $N(\mu_X, \sigma_X^2)$, $i = 1, 2, \ldots, n_X$, and let $\{Y_i\}$ be i.i.d. as $N(\mu_Y, \sigma_Y^2)$, $i = 1, 2, \ldots, n_Y$. Assume that $\{X_i\}$ are independent of $\{Y_i\}$. Let $\bar{X}$ and $\bar{Y}$ be the sample means and $S_X^2$ and $S_Y^2$ be the sample variances. Then, if $\sigma_X^2 = \sigma_Y^2$,

$$\frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\sqrt{\dfrac{n_X S_X^2 + n_Y S_Y^2}{n_X + n_Y - 2}\left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right)}} \sim t_{n_X + n_Y - 2}.$$

Proof. We have

$$(10)\quad \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\left(\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}\right)^{1/2}} \sim N(0, 1)$$

and

$$(11)\quad \frac{\sum_{i=1}^{n_X} (X_i - \bar{X})^2}{\sigma_X^2} + \frac{\sum_{i=1}^{n_Y} (Y_i - \bar{Y})^2}{\sigma_Y^2} \sim \chi^2_{n_X + n_Y - 2},$$

where (11) follows from Theorems 1 and 3. Moreover, (10) and (11) are independent for the same reason that (8) holds. Therefore, by Definition 2,

$$(12)\quad \frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\left(\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}\right)^{1/2}} \Bigg/ \sqrt{\frac{1}{n_X + n_Y - 2}\left(\frac{n_X S_X^2}{\sigma_X^2} + \frac{n_Y S_Y^2}{\sigma_Y^2}\right)} \sim t_{n_X + n_Y - 2}.$$

Finally, the theorem follows from inserting $\sigma_X = \sigma_Y$ into (12). □
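The two-sample statistic of Theorem 6 can also be checked by simulation, with both sample variances using the divisor-$n$ convention of the text; the resulting $t$ variable with $df = n_X + n_Y - 2$ degrees of freedom has variance $df/(df-2)$ (all parameter values below are illustrative):

```python
import numpy as np

# Monte Carlo sketch of Theorem 6 with sigma_X = sigma_Y.
rng = np.random.default_rng(7)
mux, muy, sigma = 1.0, -0.5, 2.0
nx, ny, reps = 12, 8, 300_000
x = rng.normal(mux, sigma, size=(reps, nx))
y = rng.normal(muy, sigma, size=(reps, ny))
sx2 = x.var(axis=1)  # divisor-n sample variance
sy2 = y.var(axis=1)
pooled = (nx * sx2 + ny * sy2) / (nx + ny - 2)
t = ((x.mean(axis=1) - y.mean(axis=1)) - (mux - muy)) / np.sqrt(
    pooled * (1 / nx + 1 / ny)
)

df = nx + ny - 2
print(t.mean())  # close to 0
print(t.var())   # close to df/(df-2) = 1.125
```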

DEFINITION 3 (F Distribution) If $X \sim \chi^2_n$ and $Y \sim \chi^2_m$ and if $X$ and $Y$ are independent, then $(X/n)/(Y/m)$ is distributed as F with $n$ and $m$ degrees of freedom, denoted $F(n, m)$. This is known as the F distribution. Here $n$ is called the numerator degrees of freedom, and $m$ the denominator degrees of freedom.
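Definition 3, like Definition 2, is a recipe for generating the distribution; an $F(n, m)$ variable has mean $m/(m-2)$ for $m > 2$, which the following NumPy sketch checks ($n$, $m$, and the replication count are arbitrary choices):

```python
import numpy as np

# (X/n)/(Y/m) with independent X ~ chi^2_n and Y ~ chi^2_m is F(n, m).
rng = np.random.default_rng(4)
n, m, reps = 5, 10, 400_000
x = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)
y = (rng.standard_normal((reps, m)) ** 2).sum(axis=1)
f = (x / n) / (y / m)

print(f.mean())  # close to m/(m-2) = 1.25
```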

[1] We have

$$(4.2.3)\quad EX = \frac{1}{21}(1 + 4 + 9 + 16 + 25 + 36) = \frac{13}{3}.$$

Next, using Theorem 4.1.1,

$$(4.2.4)\quad EX^2 = \frac{1}{21}(1 + 8 + 27 + 64 + 125 + 216) = 21.$$

Therefore, by Definition 4.2.1,

$$VX = EX^2 - (EX)^2 = 21 - \frac{169}{9} = \frac{20}{9}.$$

[2]

$$\therefore\quad \mathrm{Cov}(X, Y) = EXY - EX \cdot EY = \frac{1}{3} - \frac{49}{144} = -\frac{1}{144} \quad \text{by Definition 4.3.1.}$$

[5]

$$(8.2.23)\quad \frac{nS^2}{k_2} < \sigma^2 < \frac{nS^2}{k_1},$$

where $k_1$ and $k_2$ are chosen so as to satisfy

$$(8.2.24)\quad P(k_1 < \chi^2_{n-1} < k_2) = \gamma.$$

This example differs from the examples we have considered so far in that (8.2.24) does not determine $k_1$ and $k_2$ uniquely. In practice it is customary to determine these two values so as to satisfy

$$P(\chi^2_{n-1} < k_1) = P(\chi^2_{n-1} > k_2) = \frac{1 - \gamma}{2}.$$
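The equal-tails construction behind (8.2.23) and (8.2.24) can be sketched numerically; the NumPy code below approximates $k_1$ and $k_2$ by empirical quantiles of simulated $\chi^2_{n-1}$ draws rather than tables, then checks the coverage of the interval ($n$, $\gamma$, $\mu$, $\sigma$ are illustrative choices):

```python
import numpy as np

# k1, k2 are the (1-gamma)/2 and (1+gamma)/2 quantiles of chi^2_{n-1};
# then n*S^2/k2 < sigma^2 < n*S^2/k1 covers sigma^2 with probability gamma.
rng = np.random.default_rng(5)
n, gamma = 10, 0.95
chi2_draws = (rng.standard_normal((500_000, n - 1)) ** 2).sum(axis=1)
k1, k2 = np.quantile(chi2_draws, [(1 - gamma) / 2, (1 + gamma) / 2])

# Check coverage on simulated N(mu, sigma^2) samples.
mu, sigma, reps = 0.0, 2.0, 20_000
x = rng.normal(mu, sigma, size=(reps, n))
ns2 = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)  # n * S^2
covered = (ns2 / k2 < sigma**2) & (sigma**2 < ns2 / k1)
print(covered.mean())  # close to gamma = 0.95
```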

[7]

$$(8.3.20)\quad -\frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \hat{\mu})^2 - \frac{n}{2\sigma^2}(\hat{\mu} - \mu_0)^2$$

Note that (9.7.5) is reduced to (9.7.2) if $\sigma_{12} = 0$ and, further, to (9.7.1) if $\sigma_1^2 = \sigma_2^2$.

An additional justification of the test (9.7.4) is provided by the fact that it is a likelihood ratio test. To see this, note that by (5.4.1) and (9.4.5),

[9] (Section 9.3)

An estimator $T$ of a parameter $\mu$ is distributed as $N(\mu, 4)$, and we want to test $H_0$: $\mu = 25$ against $H_1$: $\mu = 30$. Assuming that the prior probabilities of $H_0$ and $H_1$ are equal and the costs of the Type I and Type II errors are equal, find the Bayesian optimal critical region.

[10] (Section 9.3)

Let $X \sim N(\mu, 16)$. We want to test $H_0$: $\mu = 2$ against $H_1$: $\mu = 3$ on the basis of four independent observations on $X$. Suppose the loss matrix is given by

[11] Since $\{u_t^2\}$ are i.i.d. with mean $\sigma^2$, we have by the law of large numbers (Theorem 6.2.1)

$$(10.2.67)\quad \mathrm{plim}_{T \to \infty}\ \frac{1}{T}\sum_{t=1}^T u_t^2 = \sigma^2.$$

Equation (10.2.43) and Chebyshev's inequality (6.1.2) imply

$$(10.2.68)\quad \mathrm{plim}_{T \to \infty}\ \frac{1}{T}\sum_{t=1}^T (\hat{u}_t - u_t)^2 = 0.$$

Therefore the consistency of $\hat{\sigma}^2$ follows from (10.2.66), (10.2.67), and (10.2.68) because of Theorem 6.1.3.
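The law-of-large-numbers step (10.2.67) is easy to visualize by simulation: the sample mean of $u_t^2$ settles on $\sigma^2$ as $T$ grows ($\sigma$ and the values of $T$ below are arbitrary illustrative choices):

```python
import numpy as np

# Since {u_t^2} are i.i.d. with mean sigma^2, T^{-1} * sum u_t^2
# converges in probability to sigma^2 as T -> infinity.
rng = np.random.default_rng(6)
sigma = 1.5
for T in (100, 10_000, 1_000_000):
    u = rng.normal(0.0, sigma, size=T)
    print(T, (u**2).mean())  # approaches sigma^2 = 2.25 as T grows
```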

We shall prove the asymptotic normality of $\hat{\alpha}$ and $\hat{\beta}$. From (10.2.19) and (10.2.21) we note that both $\hat{\beta} - \beta$ and $\hat{\alpha} - \alpha$ can be written in expressions of the form

$$(10.2.69)\quad \frac{\sum_{t=1}^T z_t u_t}{\sum_{t=1}^T z_t^2},$$

where $\{z_t\}$ is a certain sequence of constants. Since the variance of (10.2.69) goes to zero if $\sum z_t^2$ goes to infinity, we transform (10.2.69) so that

[19] (Section 11.2)

Prove Theorem 11.2.1.

[20] (Section 11.3)

Using Theorem 11.3.3, prove its corollary obtained by deleting the word “adjacent” from the theorem.


[22] (Section 11.3)

Verify $|AB| = |A|\,|B|$, where

[23] (Section 11.5)

Find the characteristic vectors and roots of

[24] $t = 1, 2, \ldots, T$, are known constants; $\{u_t\}$ are unobservable random variables which are i.i.d. with $Eu_t = 0$ and $Vu_t = \sigma^2$; and $\beta_1, \beta_2, \ldots, \beta_K$ and $\sigma^2$ are unknown parameters that we wish to estimate. We shall state the assumption on $\{x_{ti}\}$ later, after we rewrite (12.1.1) in matrix notation. The linear regression model with these assumptions is called the classical regression model.