The Multivariate Normal Distribution
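Now let the components of $X = (x_1, \ldots, x_n)^T$ be independent, standard normally distributed random variables. Then $E(X) = 0\ (\in \mathbb{R}^n)$ and $\mathrm{Var}(X) = I_n$. Moreover, the joint density $f(x) = f(x_1, \ldots, x_n)$ of $X$ in this case is the product of the standard normal marginal densities:
$$
f(x) = f(x_1, \ldots, x_n) = \prod_{j=1}^{n} \frac{\exp\left(-x_j^2/2\right)}{\sqrt{2\pi}} = \frac{\exp\left(-\frac{1}{2}\sum_{j=1}^{n} x_j^2\right)}{(\sqrt{2\pi})^n} = \frac{\exp\left(-\frac{1}{2} x^T x\right)}{(\sqrt{2\pi})^n}.
$$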
The shape of this density for the case n = 2 is displayed in Figure 5.1.
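As a quick numerical illustration of this factorization, the following sketch evaluates the joint density both as a product of univariate standard normal densities and through the closed form in terms of the inner product of x with itself. It assumes NumPy is available; the function names and the evaluation point are illustrative choices, not part of the text.

```python
import numpy as np

def product_of_marginals(x):
    """Multiply the n univariate standard normal densities."""
    x = np.asarray(x, dtype=float)
    return float(np.prod(np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)))

def joint_closed_form(x):
    """Evaluate exp(-x'x/2) / (sqrt(2*pi))**n directly."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return float(np.exp(-0.5 * (x @ x)) / np.sqrt(2.0 * np.pi) ** n)

# Arbitrary evaluation point with n = 3 (illustrative only).
x = np.array([0.3, -1.2, 0.7])
p_product = product_of_marginals(x)
p_closed = joint_closed_form(x)
print(p_product, p_closed)  # the two values agree up to rounding error
```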
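Next, consider the following linear transformations of $X$: $Y = \mu + AX$, where $\mu = (\mu_1, \ldots, \mu_n)^T$ is a vector of constants and $A$ is a nonsingular $n \times n$ matrix with nonrandom elements. Because $A$ is nonsingular and therefore invertible, this transformation is a one-to-one mapping with inverse $X = A^{-1}(Y - \mu)$. Then the density function $g(y)$ of $Y$ is equal to
$$
\begin{aligned}
g(y) &= f(x)\,|\det(\partial x/\partial y)| = f(A^{-1}y - A^{-1}\mu)\,|\det(\partial(A^{-1}y - A^{-1}\mu)/\partial y)| \\
&= f(A^{-1}y - A^{-1}\mu)\,|\det(A^{-1})| = \frac{f(A^{-1}y - A^{-1}\mu)}{|\det(A)|} \\
&= \frac{\exp\left[-\frac{1}{2}(y - \mu)^T (A^{-1})^T A^{-1} (y - \mu)\right]}{(\sqrt{2\pi})^n\,|\det(A)|} \\
&= \frac{\exp\left[-\frac{1}{2}(y - \mu)^T (AA^T)^{-1} (y - \mu)\right]}{(\sqrt{2\pi})^n \sqrt{|\det(AA^T)|}}.
\end{aligned}
$$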
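The first and last expressions in this chain can be compared numerically. The sketch below, assuming NumPy, evaluates the density of Y once through the inverse map (standard normal density at the back-transformed point, divided by the absolute determinant of A) and once through the final form involving the product of A with its transpose. The particular shift vector, matrix, and evaluation point are arbitrary illustrative values.

```python
import numpy as np

def density_via_inverse_map(y, mu, A):
    """g(y) = f(A^{-1}(y - mu)) / |det(A)|, f = standard normal joint density."""
    n = len(mu)
    x = np.linalg.solve(A, y - mu)  # x = A^{-1}(y - mu)
    f_x = np.exp(-0.5 * (x @ x)) / np.sqrt(2.0 * np.pi) ** n
    return float(f_x / abs(np.linalg.det(A)))

def density_via_AAt(y, mu, A):
    """g(y) written in terms of A A' only."""
    n = len(mu)
    S = A @ A.T
    d = y - mu
    quad = d @ np.linalg.solve(S, d)  # (y - mu)' (A A')^{-1} (y - mu)
    norm = np.sqrt(2.0 * np.pi) ** n * np.sqrt(abs(np.linalg.det(S)))
    return float(np.exp(-0.5 * quad) / norm)

# Illustrative values; A is nonsingular (its determinant is nonzero).
mu = np.array([1.0, -2.0])
A = np.array([[2.0, 0.5], [-1.0, 1.5]])
y = np.array([0.4, -1.1])

g_inverse = density_via_inverse_map(y, mu, A)
g_AAt = density_via_AAt(y, mu, A)
print(g_inverse, g_AAt)  # the two values agree up to rounding error
```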
Observe that $\mu$ is the expectation vector of $Y$: $E(Y) = \mu + A(E(X)) = \mu$. But what is $AA^T$? We know from (5.2) that $\mathrm{Var}(Y) = E[YY^T] - \mu\mu^T$. Therefore, substituting $Y = \mu + AX$ yields
$$
\begin{aligned}
\mathrm{Var}(Y) &= E[(\mu + AX)(\mu^T + X^T A^T) - \mu\mu^T] \\
&= \mu\,(E(X^T))\,A^T + A\,(E(X))\,\mu^T + A\,(E(XX^T))\,A^T = AA^T
\end{aligned}
$$
because $E(X) = 0$ and $E[XX^T] = I_n$. Thus, $AA^T$ is the variance matrix of $Y$. This argument gives rise to the following definition of the $n$-variate normal distribution:
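The moment calculations above can also be checked by simulation. The following sketch, assuming NumPy and a fixed random seed, draws independent standard normal vectors, applies the affine map, and compares the sample mean and sample covariance of the result with the shift vector and with the product of A and its transpose. The sample size and parameter values are arbitrary choices, and the agreement is only approximate because of Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Illustrative shift vector and nonsingular matrix.
mu = np.array([1.0, -2.0])
A = np.array([[2.0, 0.5], [-1.0, 1.5]])

n_draws = 200_000
X = rng.standard_normal((n_draws, 2))  # each row is one draw of X
Y = mu + X @ A.T                       # each row is mu + A x

sample_mean = Y.mean(axis=0)
sample_cov = np.cov(Y, rowvar=False)
target_cov = A @ A.T

print(sample_mean)  # close to mu
print(sample_cov)   # close to A A'
print(target_cov)
```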
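Definition 5.1: Let $Y$ be an $n \times 1$ random vector satisfying $E(Y) = \mu$ and $\mathrm{Var}(Y) = \Sigma$, where $\Sigma$ is nonsingular. Then $Y$ is distributed $N_n(\mu, \Sigma)$ if the density $g(y)$ of $Y$ is of the form
$$
g(y) = \frac{\exp\left[-\frac{1}{2}(y - \mu)^T \Sigma^{-1} (y - \mu)\right]}{(\sqrt{2\pi})^n \sqrt{\det(\Sigma)}}.
$$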
In the same way as before we can show that a nonsingular (hence one-to-one) linear transformation of a normal distribution is normal itself:
Theorem 5.1: Let $Z = a + BY$, where $Y$ is distributed $N_n(\mu, \Sigma)$ and $B$ is a nonsingular matrix of constants. Then $Z$ is distributed $N_n(a + B\mu, B\Sigma B^T)$.
Proof: First, observe that $Z = a + BY$ implies $Y = B^{-1}(Z - a)$. Let $h(z)$ be the density of $Z$ and $g(y)$ the density of $Y$. Then
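$$
\begin{aligned}
h(z) &= g(y)\,|\det(\partial y/\partial z)| \\
&= g(B^{-1}z - B^{-1}a)\,|\det(\partial(B^{-1}z - B^{-1}a)/\partial z)| \\
&= \frac{g(B^{-1}z - B^{-1}a)}{|\det(B)|} = \frac{g(B^{-1}(z - a))}{\sqrt{\det(BB^T)}} \\
&= \frac{\exp\left[-\frac{1}{2}(B^{-1}(z - a) - \mu)^T \Sigma^{-1} (B^{-1}(z - a) - \mu)\right]}{(\sqrt{2\pi})^n \sqrt{\det(\Sigma)}\,\sqrt{\det(BB^T)}} \\
&= \frac{\exp\left[-\frac{1}{2}(z - a - B\mu)^T (B\Sigma B^T)^{-1} (z - a - B\mu)\right]}{(\sqrt{2\pi})^n \sqrt{\det(B\Sigma B^T)}}.
\end{aligned}
$$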
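The equality between the first and last lines of this proof can be spot-checked numerically. The sketch below, assuming NumPy, evaluates the density of Z at one point in two ways: by the change-of-variables formula applied to the density of Y, and by plugging the transformed mean and variance matrix into the normal density directly. The helper `mvn_density` and all parameter values are illustrative assumptions.

```python
import numpy as np

def mvn_density(y, mean, cov):
    """Density of the n-variate normal distribution with nonsingular cov."""
    n = len(mean)
    d = y - mean
    quad = d @ np.linalg.solve(cov, d)  # d' cov^{-1} d
    norm = np.sqrt(2.0 * np.pi) ** n * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * quad) / norm)

# Illustrative parameters: Sigma is positive definite, B is nonsingular.
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.25, -1.25], [-1.25, 3.25]])
a = np.array([0.5, 0.0])
B = np.array([[1.0, 2.0], [0.0, -1.0]])
z = np.array([-2.0, 1.5])

# Change of variables: h(z) = g(B^{-1}(z - a)) / |det(B)|.
y = np.linalg.solve(B, z - a)
h_change_of_variables = mvn_density(y, mu, Sigma) / abs(np.linalg.det(B))

# Normal density with mean a + B mu and variance matrix B Sigma B'.
h_theorem = mvn_density(z, a + B @ mu, B @ Sigma @ B.T)

print(h_change_of_variables, h_theorem)  # agree up to rounding error
```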
I will now relax the assumption in Theorem 5.1 that the matrix $B$ is a nonsingular $n \times n$ matrix. This more general version of Theorem 5.1 can be proved using the moment-generating function or the characteristic function of the multivariate normal distribution.
Theorem 5.2: Let $Y$ be distributed $N_n(\mu, \Sigma)$. Then the moment-generating function of $Y$ is $m(t) = \exp(t^T\mu + t^T\Sigma t/2)$, and the characteristic function of $Y$ is $\varphi(t) = \exp(i \cdot t^T\mu - t^T\Sigma t/2)$.
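Proof: We have
$$
\begin{aligned}
m(t) &= \int \exp(t^T y)\,\frac{\exp\left[-\frac{1}{2}(y - \mu)^T \Sigma^{-1} (y - \mu)\right]}{(\sqrt{2\pi})^n \sqrt{\det(\Sigma)}}\,dy \\
&= \exp(t^T\mu + t^T\Sigma t/2) \int \frac{\exp\left[-\frac{1}{2}(y - \mu - \Sigma t)^T \Sigma^{-1} (y - \mu - \Sigma t)\right]}{(\sqrt{2\pi})^n \sqrt{\det(\Sigma)}}\,dy.
\end{aligned}
$$
Because the last integral is equal to 1 (its integrand is the $N_n(\mu + \Sigma t, \Sigma)$ density), the result for the moment-generating function follows. The result for the characteristic function follows from $\varphi(t) = m(i \cdot t)$. Q.E.D.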
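A simulation can make the two formulas of Theorem 5.2 concrete. The sketch below, assuming NumPy and a fixed seed, generates multivariate normal draws through a Cholesky factor of the variance matrix and compares Monte Carlo averages of the real and complex exponentials with the closed-form expressions. The parameter values and the argument `t` are arbitrary, and `t` is kept small so that the exponential has a small variance; the match is approximate only.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility

# Illustrative parameters; Sigma is positive definite.
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
t = np.array([0.2, -0.1])

# Draw Y = mu + L X with L L' = Sigma and X standard normal.
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((200_000, 2))
Y = mu + X @ L.T

# Moment-generating function: closed form versus sample average.
mgf_formula = np.exp(t @ mu + 0.5 * (t @ Sigma @ t))
mgf_sim = np.exp(Y @ t).mean()

# Characteristic function: closed form versus sample average.
cf_formula = np.exp(1j * (t @ mu) - 0.5 * (t @ Sigma @ t))
cf_sim = np.exp(1j * (Y @ t)).mean()

print(mgf_formula, mgf_sim)
print(cf_formula, cf_sim)
```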
Theorem 5.3: Theorem 5.1 holds for any linear transformation Z = a + BY.
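Before turning to the proof, a small simulation may help illustrate the statement for a matrix that is not square. The sketch below, assuming NumPy and a fixed seed, maps bivariate normal draws to a scalar through a 1 x 2 matrix and compares the sample mean and variance with the values the theorem predicts; it also checks the share of draws within one standard deviation of the mean, which is roughly 0.6827 for a normal variable. All parameter values are illustrative, and a simulation of this kind is a plausibility check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed for reproducibility

# Illustrative parameters; B is 1 x 2, hence not invertible.
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
a = np.array([2.0])
B = np.array([[1.0, -1.0]])

# Draw Y with mean mu and variance Sigma, then form Z = a + B Y.
L = np.linalg.cholesky(Sigma)
Y = mu + rng.standard_normal((200_000, 2)) @ L.T
Z = a + Y @ B.T  # shape (200000, 1)

mean_theory = a + B @ mu      # predicted mean, shape (1,)
var_theory = B @ Sigma @ B.T  # predicted variance, shape (1, 1)
mean_sim = Z.mean(axis=0)
var_sim = Z.var(axis=0, ddof=1)

# Share of draws within one predicted standard deviation of the predicted mean.
sd_theory = np.sqrt(var_theory[0, 0])
share_within_one_sd = np.mean(np.abs(Z[:, 0] - mean_theory[0]) < sd_theory)

print(mean_theory, mean_sim)
print(var_theory, var_sim)
print(share_within_one_sd)
```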
Proof: Let $Z = a + BY$, where $B$ is $m \times n$. It is easy to verify that the characteristic function of $Z$ is $\varphi_Z(t) = E[\exp(i \cdot t^T Z)] = E[\exp(i \cdot t^T(a + BY))] = \exp(i \cdot t^T a)\,E[\exp(i \cdot t^T BY)] = \exp\left(i \cdot (a + B\mu)^T t - \frac{1}{2} t^T B\Sigma B^T t\right)$. Theorem