# Introduction to the Mathematical and Statistical Foundations of Econometrics

## Eigenvectors

By Definition I.21 it follows that if $\lambda$ is an eigenvalue of an $n \times n$ matrix $A$, then $A - \lambda I_n$ is a singular matrix (possibly complex valued!). Suppose first that $\lambda$ is real valued. Because the rows of $A - \lambda I_n$ are linearly dependent, there exists a vector $x \in \mathbb{R}^n$ such that $(A - \lambda I_n)x = 0$ $(\in \mathbb{R}^n)$; hence, $Ax = \lambda x$. Such a vector $x$ is called an eigenvector of $A$ corresponding to the eigenvalue $\lambda$. Thus, in the real eigenvalue case:

Definition I.22: An eigenvector of an $n \times n$ matrix $A$ corresponding to an eigenvalue $\lambda$ is a vector $x$ such that $Ax = \lambda x$.

However, this definition also applies to the complex eigenvalue case, but then the eigenvector $x$ has complex-valued components: $x \in \mathbb{C}^n$. To show the latter, consider the case that $\lambda$ is complex valued: $\lambda = \alpha + i \cdot \beta$, $\alpha, \beta \in \mathbb{R}$, $\beta \neq 0$...
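Definition I.22 and the complex case can be checked numerically. The sketch below (using numpy, which is not part of the original text) verifies $Ax = \lambda x$ for a symmetric matrix with real eigenvalues, and uses a rotation matrix as a stock example of real entries producing complex eigenvalues; both example matrices are my own illustrative choices.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])     # symmetric: eigenvalues are real (3 and 1)
B = np.array([[0.0, -1.0],
              [1.0,  0.0]])    # 90-degree rotation: eigenvalues are +/- i

lam, V = np.linalg.eig(A)
# Each column x of V satisfies A x = lambda x, i.e. (A - lambda I_n) x = 0.
for k in range(2):
    assert np.allclose(A @ V[:, k], lam[k] * V[:, k])

mu, W = np.linalg.eig(B)
# Real matrix, but the eigenvalues (and eigenvectors) live in C^n.
assert np.allclose(sorted(mu, key=lambda z: z.imag), [-1j, 1j])
```

Note that the eigenvectors of the rotation matrix necessarily have complex components, exactly as the text asserts for $\beta \neq 0$.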

## The Multivariate Normal Distribution

Now let the components of $X = (X_1, \dots, X_n)^T$ be independent, standard normally distributed random variables. Then $E(X) = 0$ $(\in \mathbb{R}^n)$ and $\mathrm{Var}(X) = I_n$. Moreover, the joint density $f(x) = f(x_1, \dots, x_n)$ of $X$ in this case is the product of the standard normal marginal densities:

$$f(x) = f(x_1, \dots, x_n) = \prod_{j=1}^{n} \frac{\exp\left(-x_j^2/2\right)}{\sqrt{2\pi}} = \frac{\exp\left(-\frac{1}{2}\sum_{j=1}^{n} x_j^2\right)}{\left(\sqrt{2\pi}\right)^n} = \frac{\exp\left(-\frac{1}{2}x^T x\right)}{\left(\sqrt{2\pi}\right)^n}.$$

The shape of this density for the case n = 2 is displayed in Figure 5.1.
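The factorization above is a purely algebraic identity, so it can be verified at any point $x$. A minimal numeric check (numpy and the evaluation point are my own additions, not from the text):

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
x = rng.standard_normal(n)          # arbitrary evaluation point in R^n

# Product of the standard normal marginal densities ...
marginal_product = np.prod(np.exp(-x**2 / 2) / np.sqrt(2 * np.pi))
# ... equals exp(-x'x/2) / (sqrt(2*pi))^n, the joint density above.
joint = np.exp(-0.5 * (x @ x)) / np.sqrt(2 * np.pi) ** n
assert np.isclose(marginal_product, joint)
```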

Next, consider the following linear transformation of $X$: $Y = \mu + AX$, where $\mu = (\mu_1, \dots, \mu_n)^T$ is a vector of constants and $A$ is a nonsingular $n \times n$ matrix with nonrandom elements. Because $A$ is nonsingular and therefore invertible, this transformation is a one-to-one mapping with inverse $X = A^{-1}(Y - \mu)$. Then the density function $g(y)$ of $Y$ is equal to $g(y) = f(A^{-1}(y - \mu))\,|\det(A^{-1})|$.
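Since $X$ has mean zero and variance $I_n$, the transformed vector $Y = \mu + AX$ has $E(Y) = \mu$ and $\mathrm{Var}(Y) = AA^T$. A Monte Carlo sketch of this moment calculation (the particular $\mu$, $A$, seed, and sample size are illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2
mu = np.array([1.0, -2.0])
A = np.array([[2.0, 0.0],
              [1.0, 1.0]])                 # nonsingular

X = rng.standard_normal((100_000, n))      # rows are draws of X ~ N(0, I_n)
Y = mu + X @ A.T                           # Y = mu + A X, draw by draw

# Sample moments should match E(Y) = mu and Var(Y) = A A'.
assert np.allclose(Y.mean(axis=0), mu, atol=0.05)
assert np.allclose(np.cov(Y.T), A @ A.T, atol=0.1)
```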

## Linear Regression with Normal Errors

Let $Z_j = (Y_j, X_j^T)^T$, $j = 1, \dots, n$, be independent random vectors such that

$$Y_j = \alpha_0 + \beta_0^T X_j + U_j, \quad U_j \mid X_j \sim N(0, \sigma_0^2),$$

where the latter means that the conditional distribution of $U_j$, given $X_j$, is a normal $N(0, \sigma_0^2)$ distribution. The conditional density of $Y_j$, given $X_j$, is

$$f(y \mid \theta_0, X_j) = \frac{\exp\left[-\frac{1}{2}(y - \alpha_0 - \beta_0^T X_j)^2 / \sigma_0^2\right]}{\sigma_0 \sqrt{2\pi}}, \quad \text{where } \theta_0 = (\alpha_0, \beta_0^T, \sigma_0^2)^T.$$

Next, suppose that the Xj’s are absolutely continuously distributed with density g(x). Then the likelihood function is

$$\hat{L}_n(\theta) = \left(\prod_{j=1}^{n} f(Y_j \mid \theta, X_j)\right)\left(\prod_{j=1}^{n} g(X_j)\right),$$

where $\theta = (\alpha, \beta^T, \sigma^2)^T$. However, note that in this case the marginal distribution of $X_j$ does not matter for the ML estimator $\hat{\theta}$ because this distribution does not depend on the parameter vector $\theta_0$...
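Because $g(X_j)$ does not involve $\theta$, maximizing the likelihood reduces to least squares in $(\alpha, \beta)$, with $\hat{\sigma}^2$ equal to the average squared residual (no degrees-of-freedom correction). A simulation sketch, with parameter values, the choice of $g$, and the sample size being my own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000
alpha0, beta0, sigma0 = 1.0, np.array([0.5, -2.0]), 0.3

X = rng.normal(size=(n, 2))              # any continuous g(x) will do
U = rng.normal(scale=sigma0, size=n)     # U_j | X_j ~ N(0, sigma0^2)
Y = alpha0 + X @ beta0 + U

# ML for (alpha, beta) = OLS on an intercept plus regressors;
# the ML variance estimate is SSR / n.
Z = np.column_stack([np.ones(n), X])
theta_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
sigma2_hat = np.mean((Y - Z @ theta_hat) ** 2)

assert np.allclose(theta_hat, [alpha0, *beta0], atol=0.01)
assert abs(sigma2_hat - sigma0**2) < 0.01
```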

## Appendix III – A Brief Review of Complex Analysis

III.1. The Complex Number System

Complex numbers have many applications. The complex number system allows computations to be conducted that would be impossible to perform in the real world. In probability and statistics we mainly use complex numbers in dealing with characteristic functions, but in time series analysis complex analysis plays a key role. See for example Fuller (1996).

Complex numbers are actually two-dimensional vectors endowed with arithmetic operations that make them act as numbers. Therefore, complex numbers are introduced here in their “real” form as vectors in $\mathbb{R}^2$.

In addition to the usual addition and scalar multiplication operators on the elements of $\mathbb{R}^2$ (see Appendix I), we define the vector multiplication operator

$$\begin{pmatrix} a \\ b \end{pmatrix} \times \begin{pmatrix} c \\ d \end{pmatrix} \overset{\text{def}}{=} \begin{pmatrix} ac - bd \\ bc + ad \end{pmatrix}. \tag{III.1}$$

Observe that...

Moreover,...
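Assuming (III.1) is the standard complex product in vector form, $(a, b) \times (c, d) = (ac - bd,\ bc + ad)$, a short sketch can confirm that it reproduces Python's built-in complex arithmetic (the helper name `cmul` and the sample vectors are my own):

```python
import numpy as np

def cmul(u, v):
    """Vector multiplication (III.1): (a,b) x (c,d) = (ac - bd, bc + ad)."""
    a, b = u
    c, d = v
    return np.array([a * c - b * d, b * c + a * d])

u, v = np.array([1.0, 2.0]), np.array([3.0, -1.0])
w = cmul(u, v)

# Agrees with built-in complex arithmetic: (1+2i)(3-i) = 5+5i.
z = complex(*u) * complex(*v)
assert np.allclose(w, [z.real, z.imag])
```

In particular, $(0, 1) \times (0, 1) = (-1, 0)$, i.e., the vector playing the role of $i$ squares to $-1$.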

## Liapounov’s Inequality

Liapounov’s inequality follows from Hölder’s inequality (2.22) by replacing $Y$ with 1:

$$E(|X|) \leq (E(|X|^p))^{1/p}, \quad \text{where } p \geq 1.$$
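Liapounov's inequality holds for any distribution, including the empirical distribution of a sample, so a quick Monte Carlo check never fails. The exponential example, seed, and $p = 3$ below are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.exponential(size=200_000)   # E|X| = 1, E|X|^3 = 6 for Exp(1)
p = 3.0

# Liapounov: E|X| <= (E|X|^p)^(1/p) for p >= 1.
lhs = np.mean(np.abs(X))
rhs = np.mean(np.abs(X) ** p) ** (1 / p)
assert lhs <= rhs
```

For Exp(1) the population values are $E|X| = 1$ versus $(E|X|^3)^{1/3} = 6^{1/3} \approx 1.82$, so the gap is clearly visible in the sample moments.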

2.6.3. Minkowski’s Inequality

If for some $p \geq 1$, $E[|X|^p] < \infty$ and $E[|Y|^p] < \infty$, then

$$E(|X + Y|) \leq (E(|X|^p))^{1/p} + (E(|Y|^p))^{1/p}. \tag{2.23}$$

This inequality is due to Minkowski. For $p = 1$ the result is trivial. Therefore, let $p > 1$. First note that $E[|X + Y|^p] \leq E[(2 \cdot \max(|X|, |Y|))^p] = 2^p E[\max(|X|^p, |Y|^p)] \leq 2^p E[|X|^p + |Y|^p] < \infty$; hence, we may apply Liapounov’s inequality:

$$E(|X + Y|) \leq (E(|X + Y|^p))^{1/p}. \tag{2.24}$$

Next, observe that

$$E(|X + Y|^p) = E(|X + Y|^{p-1}|X + Y|) \leq E(|X + Y|^{p-1}|X|) + E(|X + Y|^{p-1}|Y|). \tag{2.25}$$

Let $q = p/(p - 1)$. Because $1/q + 1/p = 1$, it follows from Hölder’s inequality that

$$E(|X + Y|^{p-1}|X|) \leq \left(E\left(|X + Y|^{(p-1)q}\right)\right)^{1/q}\left(E(|X|^p)\right)^{1/p}$$

$$\leq (E\ldots$$
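Like Liapounov's inequality, (2.23) holds exactly under the empirical distribution of any sample, so it can be sanity-checked by simulation. The distributions of $X$ and $Y$, the seed, and $p = 2.5$ are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=100_000)
Y = rng.exponential(size=100_000) - 1.0
p = 2.5

def lp_norm(Z, p):
    """(E|Z|^p)^(1/p) under the empirical distribution of the sample Z."""
    return np.mean(np.abs(Z) ** p) ** (1 / p)

# Minkowski (2.23): E|X + Y| <= (E|X|^p)^(1/p) + (E|Y|^p)^(1/p).
assert np.mean(np.abs(X + Y)) <= lp_norm(X, p) + lp_norm(Y, p)
```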

## Modes of Convergence

5.3. Introduction

Toss a fair coin $n$ times, and let $Y_j = 1$ if the outcome of the $j$th toss is heads and $Y_j = -1$ if the outcome is tails. Denote $X_n = (1/n)\sum_{j=1}^{n} Y_j$. For the case $n = 10$, the left panel of Figure 6.1 displays the distribution function $F_n(x)$ of $X_n$ on the interval $[-1.5, 1.5]$, and the right panel displays a typical plot of $X_k$ for $k = 1, 2, \dots, 10$ based on simulated $Y_j$’s.

Now let us see what happens if we increase $n$: First, consider the case $n = 100$ in Figure 6.2. The distribution function $F_n(x)$ becomes steeper for $x$ close to zero, and $X_n$ seems to tend towards zero.

These phenomena are even more apparent for the case $n = 1000$ in Figure 6.3.
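The steepening of $F_n(x)$ around zero can be reproduced by simulating the coin-tossing experiment and watching the spread of $X_n$ shrink as $n$ grows (the seed and the number of replications are my own choices; the text's figures show the same effect):

```python
import numpy as np

rng = np.random.default_rng(3)

def X_n(n):
    """Average of n fair coin tosses coded as +1 (heads) / -1 (tails)."""
    return rng.choice([-1.0, 1.0], size=n).mean()

# The standard deviation of X_n is 1/sqrt(n), so the distribution of X_n
# concentrates around zero as n grows -- the pattern in Figures 6.1-6.3.
spread = {n: np.std([X_n(n) for _ in range(2000)]) for n in (10, 100, 1000)}
assert spread[10] > spread[100] > spread[1000]
```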

What you see in Figures 6.1—6...