# MULTIVARIATE NORMAL RANDOM VARIABLES

In this section we present results on multivariate normal variables in matrix notation. The student unfamiliar with matrix analysis should read Chapter 11 before this section. The results of this section will not be used directly until Section 9.7 and Chapters 12 and 13.

Let x be an n-dimensional column vector with Ex = μ and Vx = Σ. (Throughout this section, a matrix is denoted by a boldface capital letter and a vector by a boldface lowercase letter.) We write their elements explicitly as follows:

$$\mathbf{x} = (x_1, x_2, \ldots, x_n)', \quad \boldsymbol{\mu} = (\mu_1, \mu_2, \ldots, \mu_n)', \quad \boldsymbol{\Sigma} = \{\sigma_{ij}\}.$$

Note that σij = Cov(xi, xj), i, j = 1, 2, …, n, and, in particular, σii = Vxi, i = 1, 2, …, n. We sometimes write σi² for σii.

DEFINITION 5.4.1 We say x is multivariate normal with mean μ and variance-covariance matrix Σ, denoted N(μ, Σ), if its density is given by

$$f(\mathbf{x}) = (2\pi)^{-n/2}\,|\boldsymbol{\Sigma}|^{-1/2} \exp\!\left[-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu})\right].$$

The reader should verify that in the case of n = 2, the above density reduces to the bivariate normal density (5.3.1).
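As a purely numerical illustration of Definition 5.4.1 (not part of the original text), the density formula can be coded directly and compared against a library implementation. This sketch assumes NumPy and SciPy are available; the particular μ, Σ, and evaluation point are invented for the example.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_density(x, mu, sigma):
    """Density of N(mu, sigma) at x, written term-by-term from Definition 5.4.1."""
    n = len(mu)
    diff = x - mu
    norm_const = (2 * np.pi) ** (-n / 2) * np.linalg.det(sigma) ** (-0.5)
    quad = diff @ np.linalg.inv(sigma) @ diff  # (x - mu)' Sigma^{-1} (x - mu)
    return norm_const * np.exp(-0.5 * quad)

# Illustrative n = 2 case; values are arbitrary (any positive definite Sigma works)
mu = np.array([1.0, 2.0])
sigma = np.array([[4.0, 2.75],
                  [2.75, 9.0]])
x = np.array([0.5, 3.0])

manual = mvn_density(x, mu, sigma)
reference = multivariate_normal(mean=mu, cov=sigma).pdf(x)
```

The two evaluations agree to machine precision, which is a convenient sanity check when verifying the n = 2 reduction by hand.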

Now we state without proof generalizations of Theorems 5.3.1, 5.3.2, and 5.3.4.

THEOREM 5.4.1 Let x ~ N(μ, Σ) and partition x′ = (y′, z′), where y is h-dimensional and z is k-dimensional such that h + k = n. Partition Σ conformably as

$$\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{pmatrix},$$

where Σ11 = Vy = E[(y − Ey)(y − Ey)′], Σ22 = Vz = E[(z − Ez)(z − Ez)′], Σ12 = E[(y − Ey)(z − Ez)′], and Σ21 = (Σ12)′. Then any subvector of x, such as y or z, is multivariate normal, and the conditional distribution of y given z (similarly for z given y) is multivariate normal with

$$E(\mathbf{y} \mid \mathbf{z}) = E\mathbf{y} + \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{z} - E\mathbf{z})$$

and

$$V(\mathbf{y} \mid \mathbf{z}) = \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21}.$$
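The conditional-moment formulas of Theorem 5.4.1 translate directly into matrix code. The following sketch (not from the original text) uses NumPy with invented values for a case h = 1, k = 2:

```python
import numpy as np

# Partition x' = (y', z') with h = 1, k = 2; all numbers are illustrative
mu_y = np.array([1.0])
mu_z = np.array([0.0, 2.0])
S11 = np.array([[4.0]])                 # Vy
S12 = np.array([[1.0, 0.5]])            # E[(y - Ey)(z - Ez)']
S21 = S12.T                             # (Sigma_12)'
S22 = np.array([[2.0, 0.3],
                [0.3, 1.0]])            # Vz

z = np.array([0.5, 1.0])                # observed value of z

# E(y|z) = Ey + S12 S22^{-1} (z - Ez);  V(y|z) = S11 - S12 S22^{-1} S21
S22_inv = np.linalg.inv(S22)
cond_mean = mu_y + S12 @ S22_inv @ (z - mu_z)
cond_var = S11 - S12 @ S22_inv @ S21
```

Note that the conditional variance does not depend on the observed z, and conditioning never increases the variance of y when Σ12 ≠ 0.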

THEOREM 5.4.2 Let x ~ N(μ, Σ) and let A be an m × n matrix of constants such that m ≤ n and the rows of A are linearly independent. Then Ax ~ N(Aμ, AΣA′).
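Theorem 5.4.2 can be checked empirically by simulation: the sample moments of Ax should match Aμ and AΣA′. This sketch assumes NumPy; μ, Σ, and A below are invented for the example (m = 2, n = 3, rows of A linearly independent).

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, 2.0, 0.0])          # illustrative mean vector
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])      # illustrative positive definite covariance
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, -1.0]])         # m = 2 <= n = 3, rows independent

# Theorem 5.4.2: Ax ~ N(A mu, A Sigma A')
mean_Ax = A @ mu
var_Ax = A @ sigma @ A.T

# Monte Carlo check of the first two moments
x = rng.multivariate_normal(mu, sigma, size=200_000)
transformed = x @ A.T                    # each row is Ax for one draw
sample_mean = transformed.mean(axis=0)
sample_cov = np.cov(transformed, rowvar=False)
```

With these values, Aμ = (3, 2)′ and AΣA′ has diagonal entries 4.0 and 1.9, which the sample moments reproduce up to simulation noise.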

THEOREM 5.4.3 Suppose x ~ N(μ, Σ) and let y and z be defined as in Theorem 5.4.1. If Σ12 = 0, y and z are independent. That is to say, f(x) = f(y)f(z), where f(y) and f(z) are the multivariate densities of y and z, respectively.
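The factorization f(x) = f(y)f(z) in Theorem 5.4.3 can be verified numerically at any point when Σ is block diagonal. This sketch (not part of the original text) assumes NumPy and SciPy; the block values and the evaluation point are invented.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Block-diagonal Sigma: Sigma_12 = 0, so y (h = 1) and z (k = 2) are independent
mu_y, mu_z = np.array([0.0]), np.array([1.0, -1.0])
S11 = np.array([[2.0]])
S22 = np.array([[1.0, 0.4],
                [0.4, 1.0]])
mu = np.concatenate([mu_y, mu_z])
sigma = np.block([[S11, np.zeros((1, 2))],
                  [np.zeros((2, 1)), S22]])

point = np.array([0.3, 0.7, -0.2])       # arbitrary evaluation point (y, z)
joint = multivariate_normal(mu, sigma).pdf(point)
factored = (multivariate_normal(mu_y, S11).pdf(point[:1])
            * multivariate_normal(mu_z, S22).pdf(point[1:]))
```

The joint density and the product of the marginal densities agree to machine precision, as the theorem asserts.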

EXERCISES

1. (Section 5.1)

Five fair dice are rolled once. Let X be the number of aces that turn up. Compute EX, VX, and P(X > 4).

2. (Section 5.2)

Suppose X, Y, and W are mutually independent and distributed as X ~ N(1, 4), Y ~ N(2, 9), and W ~ B(1, 0.5). Calculate P(U < 5), where U = WX + (1 − W)Y.

3. (Section 5.2)

Let X = S and Y = T + TS², where S and T are independent and distributed as N(0, 1) and N(1, 1), respectively. Find the best predictor and the best linear predictor of Y given X and calculate their respective mean squared prediction errors.

4. (Section 5.3)

Suppose U and V are independent and each is distributed as N(0, 1). Define X and Y by

Y = X – 1 – U,

X = 2Y – 3 – V.

Obtain E(Y I X) and V(Y | X).

5. (Section 5.3)

Let (Xi, Yi), i = 1, 2, …, 36, be i.i.d. (independent and identically distributed) drawings from bivariate normal random variables with EX = 1, EY = 2, VX = 4, VY = 9, and Cov(X, Y) = 2.75. Define $\bar{X} = \sum_{i=1}^{36} X_i/36$ and $\bar{Y} = \sum_{i=1}^{36} Y_i/36$. Calculate $P(\bar{Y} > 3 - 2\bar{X})$.

6. (Section 5.3)

Suppose (X, Y) ~ BN(0, 0, 1, 1, ρ), meaning that X and Y are bivariate normal with zero means and unit variances and correlation ρ. Find the best predictor and the best linear predictor of Y given X and find their respective mean squared prediction errors.

We have already alluded to results in large sample theory without stating them in exact terms. In Chapter 1 we mentioned that the empirical frequency r/n, where r is the number of heads in n tosses of a coin, converges to the probability of heads; in Chapter 4, that a sample mean converges to the population mean; and in Chapter 5, that the binomial variable is approximately distributed as a normal variable. The first two are examples of a law of large numbers, and the third, an example of a central limit theorem. In this chapter we shall make the notions of these convergences more precise. Most of the theorems will be stated without proofs. For the proofs the reader should consult, for example, Rao (1973), Chung (1974), Serfling (1980), or Amemiya (1985).