# APPENDIX: DISTRIBUTION THEORY

DEFINITION 1 (Chi-square Distribution) Let $\{Z_i\}$, $i = 1, 2, \ldots, n$, be i.i.d. as $N(0, 1)$. Then the distribution of $\sum_{i=1}^{n} Z_i^2$ is called the chi-square distribution with $n$ degrees of freedom and denoted by $\chi^2_n$.

THEOREM 1 If $X \sim \chi^2_n$ and $Y \sim \chi^2_m$ and if $X$ and $Y$ are independent, then $X + Y \sim \chi^2_{n+m}$.

THEOREM 2 If $X \sim \chi^2_n$, then $EX = n$ and $VX = 2n$.

THEOREM 3 Let $\{X_i\}$ be i.i.d. as $N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. Define $\bar{X}_n = n^{-1}\sum_{i=1}^{n} X_i$. Then

$\dfrac{\sum_{i=1}^{n}(X_i - \bar{X}_n)^2}{\sigma^2} \sim \chi^2_{n-1}$.
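Theorems 1 and 2 lend themselves to a quick Monte Carlo check. The sketch below builds chi-square draws directly from Definition 1 as sums of squared standard normal variables. The function names, the seeds, and the number of draws are illustrative choices, not part of the text.

```python
import random
import statistics


def chi_square_draw(n, rng):
    """One draw of Z_1^2 + ... + Z_n^2 with Z_i i.i.d. N(0, 1) (Definition 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


def simulate_moments(n, draws=20000, seed=0):
    """Sample mean and variance of simulated chi-square(n) draws."""
    rng = random.Random(seed)
    sample = [chi_square_draw(n, rng) for _ in range(draws)]
    return statistics.fmean(sample), statistics.pvariance(sample)


# Theorem 2 with n = 5: the mean should be close to 5 and the variance close to 10.
mean_5, var_5 = simulate_moments(5)

# Theorem 1: independent chi-square(2) + chi-square(3) should behave like chi-square(5).
rng = random.Random(1)
sums = [chi_square_draw(2, rng) + chi_square_draw(3, rng) for _ in range(20000)]
mean_sum = statistics.fmean(sums)
```

The simulated moments only approximate $n$ and $2n$. The agreement tightens as the number of draws grows.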

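Proof. Define $Z_i = (X_i - \mu)/\sigma$. Then $Z_i \sim N(0, 1)$ and

(1) $\dfrac{\sum_{i=1}^{n}(X_i - \bar{X}_n)^2}{\sigma^2} = \sum_{i=1}^{n}(Z_i - \bar{Z}_n)^2$,

so it suffices to show that the right-hand side of (1) is $\chi^2_{n-1}$. We prove this by induction on $n$. First, for $n = 2$,

(2) $\sum_{i=1}^{2}(Z_i - \bar{Z}_2)^2 = \left(\dfrac{Z_1 - Z_2}{\sqrt{2}}\right)^2$.

But since $(Z_1 - Z_2)/\sqrt{2} \sim N(0, 1)$, the right-hand side of (2) is $\chi^2_1$ by Definition 1. Therefore, the theorem is true for $n = 2$. Second, assume it is true for $n$ and consider $n + 1$. We have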

(3) $\sum_{i=1}^{n+1}(Z_i - \bar{Z}_{n+1})^2 = \sum_{i=1}^{n}(Z_i - \bar{Z}_n)^2 + \dfrac{n}{n+1}(Z_{n+1} - \bar{Z}_n)^2$.
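The decomposition in (3) is an algebraic identity, so it can be checked on any list of numbers. The sketch below assumes the second term on the right-hand side is $\frac{n}{n+1}(Z_{n+1} - \bar{Z}_n)^2$, which is the square of the quantity in (5). The helper names are hypothetical.

```python
import random


def centered_sum_of_squares(zs):
    """Sum of squared deviations of zs from their own mean."""
    mean = sum(zs) / len(zs)
    return sum((z - mean) ** 2 for z in zs)


def decomposition_gap(zs):
    """Left-hand side minus right-hand side of (3) for zs = [Z_1, ..., Z_{n+1}]."""
    head, last = zs[:-1], zs[-1]
    n = len(head)
    head_mean = sum(head) / n
    lhs = centered_sum_of_squares(zs)
    rhs = centered_sum_of_squares(head) + n / (n + 1) * (last - head_mean) ** 2
    return lhs - rhs


rng = random.Random(0)
zs = [rng.gauss(0.0, 1.0) for _ in range(8)]
gap = decomposition_gap(zs)  # zero up to floating-point rounding
```

For example, with `[1.0, 2.0, 6.0]` both sides equal 14: the left-hand side is 4 + 1 + 9, and the right-hand side is 0.5 + (2/3)(4.5)^2.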

But the first and second terms of the right-hand side above are independent because

(4) $E(Z_i - \bar{Z}_n)(Z_{n+1} - \bar{Z}_n) = 0$.

Moreover, we can easily verify

(5) $\sqrt{\dfrac{n}{n+1}}\,(Z_{n+1} - \bar{Z}_n) \sim N(0, 1)$,

which implies by Definition 1 that the square of (5) is $\chi^2_1$. Therefore, by Theorem 1, the left-hand side of (3) is $\chi^2_n$. □

THEOREM 4 Let $\{X_i\}$ and $\bar{X}_n$ be as defined in Theorem 3. Then

DEFINITION 2 (Student's t Distribution) Let $Y$ be $N(0, 1)$ and independent of a chi-square variable $\chi^2_n$. Then the distribution of $\sqrt{n}\,Y/\sqrt{\chi^2_n}$ is called the Student's t distribution with $n$ degrees of freedom. We shall denote this distribution by $t_n$.

THEOREM 5 Let $\{X_i\}$ be i.i.d. as $N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. Define $\bar{X} = n^{-1}\sum_{i=1}^{n} X_i$ and $S^2 = n^{-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$. Then

$\dfrac{(\bar{X} - \mu)\sqrt{n-1}}{S} \sim t_{n-1}$.
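Because $S^2$ in Theorem 5 uses the divisor $n$, the statistic carries the factor $\sqrt{n-1}$ rather than $\sqrt{n}$. The sketch below computes it both ways to show that it coincides with the more familiar form based on the variance with divisor $n - 1$. The data and the hypothesized mean are made up for illustration.

```python
import math
import statistics


def t_statistic(xs, mu):
    """(X-bar - mu) * sqrt(n - 1) / S, with S^2 using the divisor n as in Theorem 5."""
    n = len(xs)
    x_bar = statistics.fmean(xs)
    s = statistics.pstdev(xs)  # square root of (1/n) * sum (x - x_bar)^2
    return (x_bar - mu) * math.sqrt(n - 1) / s


def t_statistic_unbiased(xs, mu):
    """Same statistic written with the divisor n - 1 in the sample variance."""
    n = len(xs)
    return (statistics.fmean(xs) - mu) * math.sqrt(n) / statistics.stdev(xs)


data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7]  # hypothetical observations
t_a = t_statistic(data, mu=5.0)
t_b = t_statistic_unbiased(data, mu=5.0)  # equal to t_a up to rounding
```

Under the assumptions of the theorem, either form would be compared with the $t_{n-1}$ distribution, here with 5 degrees of freedom.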

Proof. Since $\bar{X} \sim N(\mu, n^{-1}\sigma^2)$, we have

(6) $\dfrac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma} \sim N(0, 1)$.

Also, we have

(7) $\mathrm{Cov}[(X_i - \bar{X}), \bar{X}] = E(X_i - \bar{X})\bar{X} = EX_i\bar{X} - E\bar{X}^2 = 0$.

Since $X_i - \bar{X}$ and $\bar{X}$ are jointly normal, (7) implies that these two terms are independent by virtue of Theorem 5.3.4. Therefore, by Theorem 3.5.1,

(8) $S^2$ and $\bar{X}$ are independent.

But we have by Theorem 3

(9) $\dfrac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\sigma^2} \sim \chi^2_{n-1}$.

Therefore, the theorem follows from (6), (8), and (9) because of Definition 2. □

THEOREM 6 Let $\{X_i\}$ be i.i.d. as $N(\mu_X, \sigma_X^2)$, $i = 1, 2, \ldots, n_X$, and let $\{Y_i\}$ be i.i.d. as $N(\mu_Y, \sigma_Y^2)$, $i = 1, 2, \ldots, n_Y$. Assume that $\{X_i\}$ are independent of $\{Y_i\}$. Let $\bar{X}$ and $\bar{Y}$ be the sample means and $S_X^2$ and $S_Y^2$ be the sample variances. Then if $\sigma_X^2 = \sigma_Y^2$,

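Proof. We have

(10) $\dfrac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\left(\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}\right)^{1/2}} \sim N(0, 1)$

and

(11) $\dfrac{\sum_{i=1}^{n_X}(X_i - \bar{X})^2}{\sigma_X^2} + \dfrac{\sum_{i=1}^{n_Y}(Y_i - \bar{Y})^2}{\sigma_Y^2} \sim \chi^2_{n_X + n_Y - 2}$,

where (11) follows from Theorems 1 and 3. Moreover, (10) and (11) are independent for the same reason that (8) holds. Therefore, by Definition 2,

(12) $\sqrt{n_X + n_Y - 2}\;\dfrac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\left(\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}\right)^{1/2}\left[\dfrac{\sum_{i=1}^{n_X}(X_i - \bar{X})^2}{\sigma_X^2} + \dfrac{\sum_{i=1}^{n_Y}(Y_i - \bar{Y})^2}{\sigma_Y^2}\right]^{1/2}} \sim t_{n_X + n_Y - 2}$.

Finally, the theorem follows from inserting $\sigma_X^2 = \sigma_Y^2$ into (12). □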

DEFINITION 3 (F Distribution) If $X \sim \chi^2_n$ and $Y \sim \chi^2_m$ and if $X$ and $Y$ are independent, then $(X/n)/(Y/m)$ is distributed as $F$ with $n$ and $m$ degrees of freedom and denoted $F(n, m)$. This is known as the F distribution. Here $n$ is called the numerator degrees of freedom, and $m$ the denominator degrees of freedom.
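Definition 3 can be turned into a simulation directly. The sketch below draws the ratio $(X/n)/(Y/m)$ from independent chi-square variables and compares the sample mean with $m/(m-2)$. That value is the standard mean of $F(n, m)$ for $m > 2$; it is not stated in the text above and is used here only as a reference point. The degrees of freedom, seed, and sample size are arbitrary.

```python
import random
import statistics


def chi_square_draw(df, rng):
    """Sum of df squared independent N(0, 1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


def f_draw(n, m, rng):
    """One draw of (X/n)/(Y/m) with independent X ~ chi-square(n), Y ~ chi-square(m)."""
    return (chi_square_draw(n, rng) / n) / (chi_square_draw(m, rng) / m)


rng = random.Random(0)
n, m = 4, 10
sample = [f_draw(n, m, rng) for _ in range(20000)]
sample_mean = statistics.fmean(sample)
reference_mean = m / (m - 2)  # 1.25 for m = 10; a known property, assumed here
```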

(4.2.3) $EX = \dfrac{1}{21}(1 + 4 + 9 + 16 + 25 + 36) = \dfrac{13}{3}$.

Next, using Theorem 4.1.1,

(4.2.4) $EX^2 = \dfrac{1}{21}(1 + 8 + 27 + 64 + 125 + 216) = 21$.
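The two sums can be reproduced exactly with rational arithmetic. The sketch assumes the probability function $P(X = i) = i/21$ for $i = 1, \ldots, 6$, which is what the terms in (4.2.3) and (4.2.4) suggest but which is not shown in this excerpt. The variance line uses $EX^2 - (EX)^2$, presumably the quantity the next step computes.

```python
from fractions import Fraction

# Assumed probability function: P(X = i) = i/21 for i = 1, ..., 6.
pmf = {i: Fraction(i, 21) for i in range(1, 7)}


def expectation(g, pmf):
    """E g(X) for a discrete random variable with probability function pmf."""
    return sum(g(x) * p for x, p in pmf.items())


ex = expectation(lambda x: x, pmf)        # first moment, as in (4.2.3)
ex2 = expectation(lambda x: x * x, pmf)   # second moment, as in (4.2.4)
variance = ex2 - ex ** 2                  # EX^2 - (EX)^2
```

Using `Fraction` keeps every intermediate value exact, so the results can be compared with the fractions in the text without rounding concerns.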

Therefore, by Definition 4.2.1,

$\therefore\ \mathrm{Cov}(X, Y) = \dfrac{1}{3} - \dfrac{49}{144} = -\dfrac{1}{144}$ by Definition 4.3.1.
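The covariance above has the form $EXY - EX \cdot EY$. As a check on the arithmetic, the sketch below evaluates the moments exactly for the joint density $f(x, y) = x + y$ on the unit square. This density is an assumption: it reproduces $EX \cdot EY = 49/144$ and the covariance $-1/144$, but the excerpt does not show which density the example uses.

```python
from fractions import Fraction


def moment(a, b):
    """E[X^a Y^b] under the assumed density f(x, y) = x + y on [0, 1] x [0, 1].

    The integral splits into the integral of x^(a+1) y^b plus that of x^a y^(b+1).
    """
    return Fraction(1, (a + 2) * (b + 1)) + Fraction(1, (a + 1) * (b + 2))


total_mass = moment(0, 0)          # should be 1 for a valid density
e_x, e_y = moment(1, 0), moment(0, 1)
e_xy = moment(1, 1)
cov = e_xy - e_x * e_y             # covariance as EXY - EX * EY
```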

(8.2.23) $\dfrac{nS^2}{k_2} < \sigma^2 < \dfrac{nS^2}{k_1}$,

where $k_1$ and $k_2$ are chosen so as to satisfy

(8.2.24) $P(k_1 < \chi^2_{n-1} < k_2) = \gamma$.
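An interval of the form (8.2.23) can be computed with the standard library alone. The sketch below assumes the equal-tail choice, in which each tail of the $\chi^2_{n-1}$ distribution receives probability $(1 - \gamma)/2$. This is one common way to pick $k_1$ and $k_2$ and is an assumption here. The chi-square distribution function is evaluated with the usual series for the regularized incomplete gamma function, and the quantiles are found by bisection. All function names and the sample data are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """P(chi-square_df <= x), via the series for the regularized incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= h / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(h) - h - math.lgamma(a))


def chi2_quantile(p, df):
    """Value q with chi2_cdf(q, df) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def variance_interval(xs, gamma):
    """Interval (nS^2/k2, nS^2/k1) with equal-tail k1, k2 (an assumed choice)."""
    n = len(xs)
    mean = sum(xs) / n
    n_s2 = sum((x - mean) ** 2 for x in xs)  # n * S^2, with S^2 using the divisor n
    k1 = chi2_quantile((1.0 - gamma) / 2.0, n - 1)
    k2 = chi2_quantile((1.0 + gamma) / 2.0, n - 1)
    return n_s2 / k2, n_s2 / k1


data = [2.1, 3.4, 1.9, 4.2, 3.3, 2.8, 3.9, 2.5, 3.1, 3.6]  # hypothetical sample
lower, upper = variance_interval(data, gamma=0.95)
```

With ten observations the quantiles are taken from $\chi^2_9$, whose 0.025 and 0.975 points are roughly 2.70 and 19.02.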

This example differs from the examples we have considered so far in that (8.2.24) does not determine $k_1$ and $k_2$ uniquely. In practice it is customary to determine these two values so as to satisfy

(8.3.20) $-\dfrac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 - \dfrac{1}{2\lambda^2}(\mu - \mu_0)^2$

Note that (9.7.5) is reduced to (9.7.2) if $\sigma_{12} = 0$ and, further, to (9.7.1) if $\sigma_1^2 = \sigma_2^2$.

An additional justification of the test (9.7.4) is provided by the fact that it is a likelihood ratio test. To see this, note that by (5.4.1) and (9.4.5),

 (Section 9.3)

An estimator $T$ of a parameter $\mu$ is distributed as $N(\mu, 4)$, and we want to test $H_0$: $\mu = 25$ against $H_1$: $\mu = 30$. Assuming that the prior probabilities of $H_0$ and $H_1$ are equal and the costs of the Type I and II errors are equal, find the Bayesian optimal critical region.

 (Section 9.3)

Let $X \sim N(\mu, 16)$. We want to test $H_0$: $\mu = 2$ against $H_1$: $\mu = 3$ on the basis of four independent observations on $X$. Suppose the loss matrix is given by

Since $\{u_t^2\}$ are i.i.d. with mean $\sigma^2$, we have by the law of large numbers (Theorem 6.2.1)

(10.2.67) $\operatorname{plim} \dfrac{\sum u_t^2}{T} = \sigma^2$.

Equation (10.2.43) and Chebyshev's inequality (6.1.2) imply

(10.2.68) $\operatorname{plim} \dfrac{\sum (\hat{u}_t - u_t)^2}{T} = 0$.

Therefore the consistency of $\hat{\sigma}^2$ follows from (10.2.66), (10.2.67), and (10.2.68) because of Theorem 6.1.3.
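The consistency argument can be illustrated by simulation. The sketch below assumes the bivariate model $y_t = \alpha + \beta x_t + u_t$ with normal errors, fits it by least squares, and compares $\hat{\sigma}^2 = T^{-1}\sum \hat{u}_t^2$ with $T^{-1}\sum u_t^2$ and with the true $\sigma^2$. The parameter values, the way the regressors are generated, and the function names are illustrative assumptions.

```python
import random


def ols_sigma2_hat(xs, ys):
    """Least squares fit of y = a + b*x; returns (a_hat, b_hat, sum of squared residuals / T)."""
    T = len(xs)
    x_bar = sum(xs) / T
    y_bar = sum(ys) / T
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    ssr = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    return a_hat, b_hat, ssr / T


def simulate(T, alpha=1.0, beta=2.0, sigma=2.0, seed=0):
    """Return (sigma^2 hat, mean of squared true errors) for one simulated sample."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(T)]  # drawn once, then treated as constants
    us = [rng.gauss(0.0, sigma) for _ in range(T)]
    ys = [alpha + beta * x + u for x, u in zip(xs, us)]
    mean_u2 = sum(u * u for u in us) / T
    return ols_sigma2_hat(xs, ys)[2], mean_u2


s2_hat, mean_u2 = simulate(20000)  # both should be near sigma^2 = 4
```

For large $T$ the two averages differ only slightly, since least squares residuals can never have a larger sum of squares than the true errors and the gap shrinks relative to $T$.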

We shall prove the asymptotic normality of $\hat{\alpha}$ and $\hat{\beta}$. From (10.2.19) and (10.2.21) we note that both $\hat{\beta} - \beta$ and $\hat{\alpha} - \alpha$ can be written in expressions of the form

(10.2.69) $\dfrac{\sum z_t u_t}{\sum z_t^2}$,

where $\{z_t\}$ is a certain sequence of constants. Since the variance of (10.2.69) goes to zero if $\sum z_t^2$ goes to infinity, we transform (10.2.69) so that

 (Section 11.2)

Prove Theorem 11.2.1.

 (Section 11.3)

Using Theorem 11.3.3, prove its corollary obtained by deleting the word “adjacent” from the theorem.

EXERCISES

 (Section 11.3)

Verify $|AB| = |A|\,|B|$, where

 (Section 11.5)

Find the characteristic vectors and roots of

 2, . . . , T are known constants; $\{u_t\}$ are unobservable random variables which are i.i.d. with $Eu_t = 0$ and $Vu_t = \sigma^2$; and $\beta_1, \beta_2, \ldots, \beta_K$ and $\sigma^2$ are unknown parameters that we wish to estimate. We shall state the assumption on $\{x_{ti}\}$ later, after we rewrite (12.1.1) in matrix notation. The linear regression model with these assumptions is called the classical regression model.