# Linear Regression with Normal Errors

Let $Z_j = (Y_j, X_j^{\mathrm{T}})^{\mathrm{T}}$, $j = 1, \ldots, n$, be independent random vectors such that

$$Y_j = \alpha_0 + \beta_0^{\mathrm{T}} X_j + U_j, \qquad U_j \mid X_j \sim N(0, \sigma_0^2),$$

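where the latter means that the conditional distribution of $U_j$, given $X_j$, is a normal $N(0, \sigma_0^2)$ distribution. The conditional density of $Y_j$, given $X_j$, is

$$f(y \mid \theta_0, X_j) = \frac{\exp\left[-\tfrac{1}{2}\,(y - \alpha_0 - \beta_0^{\mathrm{T}} X_j)^2 / \sigma_0^2\right]}{\sigma_0 \sqrt{2\pi}},$$

where $\theta_0 = (\alpha_0, \beta_0^{\mathrm{T}}, \sigma_0^2)^{\mathrm{T}}$.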

Next, suppose that the $X_j$'s are absolutely continuously distributed with density $g(x)$. Then the likelihood function is

$$L_n(\theta) = \left(\prod_{j=1}^{n} f(Y_j \mid \theta, X_j)\right) \left(\prod_{j=1}^{n} g(X_j)\right), \tag{8.12}$$

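where $\theta = (\alpha, \beta^{\mathrm{T}}, \sigma^2)^{\mathrm{T}}$. However, note that in this case the marginal distribution of $X_j$ does not matter for the ML estimator $\hat{\theta}$ because this distribution does not depend on the parameter vector $\theta_0$. More precisely, the functional form of the ML estimator $\hat{\theta}$ as a function of the data is invariant to the marginal distributions of the $X_j$'s, although the asymptotic properties of the ML estimator (implicitly) depend on the distributions of the $X_j$'s. Therefore, without loss of generality, we may ignore the distribution of the $X_j$'s in (8.12) and work with the conditional likelihood function:

$$L_n^c(\theta) = \prod_{j=1}^{n} f(Y_j \mid \theta, X_j) = \frac{\exp\left[-\tfrac{1}{2} \sum_{j=1}^{n} (Y_j - \alpha - \beta^{\mathrm{T}} X_j)^2 / \sigma^2\right]}{\sigma^n \left(\sqrt{2\pi}\right)^n}, \tag{8.13}$$

where $\theta = (\alpha, \beta^{\mathrm{T}}, \sigma^2)^{\mathrm{T}}$.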
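To make the conditional likelihood concrete, the following is a minimal sketch in Python with NumPy, under stated assumptions: the data are simulated, the parameter values are arbitrary, and the function names `conditional_loglik` and `ml_estimate` are illustrative rather than taken from the text. It relies on the standard fact that the conditional likelihood of the normal linear regression model is maximized by the least-squares coefficients together with the residual sum of squares divided by $n$. The density $g$ of the regressors never appears in the code, in line with the remark that it does not affect the functional form of the estimator.

```python
import numpy as np

def conditional_loglik(theta, y, X):
    """Log of the conditional likelihood; theta = (alpha, beta_1..beta_k, sigma2)."""
    alpha, beta, sigma2 = theta[0], theta[1:-1], theta[-1]
    resid = y - alpha - X @ beta
    n = y.shape[0]
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) - 0.5 * np.sum(resid**2) / sigma2

def ml_estimate(y, X):
    """Closed-form maximizer: least-squares coefficients and sigma2 = SSR / n."""
    n = y.shape[0]
    W = np.column_stack([np.ones(n), X])          # add the intercept column
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)  # (alpha_hat, beta_hat)
    resid = y - W @ coef
    sigma2_hat = resid @ resid / n                # ML divides by n, not n - k - 1
    return np.concatenate([coef, [sigma2_hat]])

# Simulated data; every number below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
theta0 = np.array([1.0, 2.0, -0.5, 0.25])         # (alpha0, beta0', sigma0^2)
y = theta0[0] + X @ theta0[1:3] + rng.normal(scale=np.sqrt(theta0[3]), size=n)

theta_hat = ml_estimate(y, X)
print(theta_hat)
print(conditional_loglik(theta_hat, y, X), conditional_loglik(theta0, y, X))
```

The log-likelihood evaluated at `theta_hat` is at least as large as at `theta0`, since `theta_hat` maximizes it over the whole parameter space for the given sample.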

As to the $\sigma$-algebras involved, we may choose $\mathcal{F}_0 = \sigma(\{X_j\}_{j=1}^{\infty})$ and, for $n \geq 1$, $\mathcal{F}_n = \sigma(\{Y_j\}_{j=1}^{n}) \vee \mathcal{F}_0$, where $\vee$ denotes the operation "take the smallest $\sigma$-algebra containing the two $\sigma$-algebras involved."² The conditions (b) in Definition 8.1 then read

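$$E\left[\frac{L_1^c(\theta)}{L_1^c(\theta_0)} \,\Big|\, \mathcal{F}_0\right] = E\left[\frac{f(Y_1 \mid \theta, X_1)}{f(Y_1 \mid \theta_0, X_1)} \,\Big|\, X_1\right] = 1,$$

and, for $n \geq 2$,

$$E\left[\frac{L_n^c(\theta)/L_{n-1}^c(\theta)}{L_n^c(\theta_0)/L_{n-1}^c(\theta_0)} \,\Big|\, \mathcal{F}_{n-1}\right] = E\left[\frac{f(Y_n \mid \theta, X_n)}{f(Y_n \mid \theta_0, X_n)} \,\Big|\, X_n\right] = 1.$$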

Thus, Definition 8.2 applies. Moreover, it is easy to verify that the conditions (c) of Definition 8.1 now read as $P[f(Y_n \mid \theta_1, X_n) = f(Y_n \mid \theta_2, X_n) \mid X_n] < 1$ if $\theta_1 \neq \theta_2$. This is true but is tedious to verify.
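The reason the conditional expectation in conditions (b) equals 1 can be written out in one line. The sketch below uses only the fact that, given $X_1$, $Y_1$ has conditional density $f(y \mid \theta_0, X_1)$, and that $f(y \mid \theta, X_1)$ is a density in $y$ for every admissible $\theta$; it also uses that the normal density is positive everywhere, so the ratio is well defined.

```latex
\begin{aligned}
E\left[\frac{f(Y_1 \mid \theta, X_1)}{f(Y_1 \mid \theta_0, X_1)} \,\Big|\, X_1\right]
  &= \int_{-\infty}^{\infty} \frac{f(y \mid \theta, X_1)}{f(y \mid \theta_0, X_1)}\,
     f(y \mid \theta_0, X_1)\, dy \\
  &= \int_{-\infty}^{\infty} f(y \mid \theta, X_1)\, dy = 1 .
\end{aligned}
```

The same cancellation applies with $(Y_n, X_n)$ in place of $(Y_1, X_1)$, because the $Z_j$'s are independent.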

² Recall from Chapter 1 that the union of $\sigma$-algebras is not necessarily a $\sigma$-algebra.

# 8.2.2. Probit and Logit Models

Again, let $Z_j = (Y_j, X_j^{\mathrm{T}})^{\mathrm{T}}$, $j = 1, \ldots, n$, be independent random vectors, but now $Y_j$ takes only two values, 0 and 1, with conditional Bernoulli probabilities

$$\begin{aligned}
P(Y_j = 1 \mid \theta_0, X_j) &= F(\alpha_0 + \beta_0^{\mathrm{T}} X_j), \\
P(Y_j = 0 \mid \theta_0, X_j) &= 1 - F(\alpha_0 + \beta_0^{\mathrm{T}} X_j),
\end{aligned} \tag{8.14}$$

where $F$ is a given distribution function and $\theta_0 = (\alpha_0, \beta_0^{\mathrm{T}})^{\mathrm{T}}$. For example, let the sample be a survey of households, where $Y_j$ indicates home ownership and $X_j$ is a vector of household characteristics such as marital status, number of children living at home, and income.

If $F$ is the logistic distribution function, $F(x) = 1/[1 + \exp(-x)]$, then model (8.14) is called the Logit model; if $F$ is the distribution function of the standard normal distribution, then model (8.14) is called the Probit model. In this case the conditional likelihood function is

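$$L_n^c(\theta) = \prod_{j=1}^{n} \left[ Y_j F(\alpha + \beta^{\mathrm{T}} X_j) + (1 - Y_j)\left(1 - F(\alpha + \beta^{\mathrm{T}} X_j)\right) \right], \quad \text{where } \theta = (\alpha, \beta^{\mathrm{T}})^{\mathrm{T}}. \tag{8.15}$$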

Also in this case, the marginal distribution of $X_j$ does not affect the functional form of the ML estimator as a function of the data.
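As an illustration of how the conditional likelihood of the binary-response model can be evaluated and maximized, here is a minimal Python sketch under stated assumptions. The data are simulated, the parameter values are arbitrary, and the names `binary_loglik` and `fit_logit` are hypothetical. The maximization uses Newton's method on the Logit log-likelihood, which is one common choice and not necessarily the procedure the text has in mind; the Probit case is included only through the evaluation of its log-likelihood.

```python
import math
import numpy as np

def logistic_cdf(z):
    return 1.0 / (1.0 + np.exp(-z))

def std_normal_cdf(z):
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def binary_loglik(theta, y, X, F):
    """Log of the conditional likelihood for a given distribution function F."""
    alpha, beta = theta[0], theta[1:]
    p = F(alpha + X @ beta)
    return np.sum(np.log(y * p + (1 - y) * (1 - p)))

def fit_logit(y, X, iters=25):
    """Newton iterations on the Logit log-likelihood, started at zero."""
    n = y.shape[0]
    W = np.column_stack([np.ones(n), X])             # intercept plus regressors
    theta = np.zeros(W.shape[1])
    for _ in range(iters):
        p = logistic_cdf(W @ theta)
        score = W.T @ (y - p)                         # gradient of the log-likelihood
        hessian = -(W * (p * (1 - p))[:, None]).T @ W
        theta = theta - np.linalg.solve(hessian, score)
    return theta

# Simulated data; every number below is an illustrative assumption.
rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 1))
theta0 = np.array([0.5, -1.0])                        # (alpha0, beta0)
y = (rng.uniform(size=n) < logistic_cdf(theta0[0] + X @ theta0[1:])).astype(float)

theta_hat = fit_logit(y, X)
print(theta_hat)
print(binary_loglik(theta_hat, y, X, logistic_cdf))
print(binary_loglik(theta_hat, y, X, std_normal_cdf))
```

A fixed number of Newton steps keeps the sketch short; a practical implementation would add a convergence check and guard against perfectly separated data, where the maximizer does not exist.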

The $\sigma$-algebras involved are the same as in the regression case, namely, $\mathcal{F}_0 = \sigma(\{X_j\}_{j=1}^{\infty})$ and, for $n \geq 1$, $\mathcal{F}_n = \sigma(\{Y_j\}_{j=1}^{n}) \vee \mathcal{F}_0$. Moreover, note that

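$$E\left[\frac{L_1^c(\theta)}{L_1^c(\theta_0)} \,\Big|\, \mathcal{F}_0\right] = \sum_{y=0}^{1} \left[ y F(\alpha + \beta^{\mathrm{T}} X_1) + (1 - y)\left(1 - F(\alpha + \beta^{\mathrm{T}} X_1)\right) \right] = 1,$$

and similarly, for $n \geq 2$,

$$E\left[\frac{L_n^c(\theta)/L_{n-1}^c(\theta)}{L_n^c(\theta_0)/L_{n-1}^c(\theta_0)} \,\Big|\, \mathcal{F}_{n-1}\right] = \sum_{y=0}^{1} \left[ y F(\alpha + \beta^{\mathrm{T}} X_n) + (1 - y)\left(1 - F(\alpha + \beta^{\mathrm{T}} X_n)\right) \right] = 1;$$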

hence, the conditions (b) of Definition 8.1 and the conditions of Definition 8.2 apply. Also the conditions (c) in Definition 8.1 apply, but again it is rather tedious to verify this.
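The equality to 1 in conditions (b) can also be checked numerically for a single observation. The sketch below is only an illustration under assumptions: it uses the logistic distribution function, arbitrary parameter values, and a hypothetical helper `expected_ratio`. It computes the conditional expectation of the likelihood ratio by summing over $y \in \{0, 1\}$ with the Bernoulli probabilities under $\theta_0$ as weights.

```python
import numpy as np

def logistic_cdf(z):
    return 1.0 / (1.0 + np.exp(-z))

def bernoulli_pmf(y, p):
    # Probability of outcome y in {0, 1} when P(Y = 1) = p.
    return y * p + (1 - y) * (1 - p)

def expected_ratio(theta, theta0, x, F=logistic_cdf):
    """E[f(Y|theta,x) / f(Y|theta0,x) | x] when Y given x is Bernoulli under theta0."""
    p = F(theta[0] + np.dot(theta[1:], x))
    p0 = F(theta0[0] + np.dot(theta0[1:], x))
    total = 0.0
    for y in (0, 1):
        # weight under theta0 times the likelihood ratio at outcome y
        total += bernoulli_pmf(y, p0) * (bernoulli_pmf(y, p) / bernoulli_pmf(y, p0))
    return total

# Illustrative values only.
x = np.array([0.3, -1.2])
theta0 = np.array([0.5, 1.0, -2.0])
theta = np.array([-1.0, 0.2, 0.7])
value = expected_ratio(theta, theta0, x)
print(value)
```

The weights under $\theta_0$ cancel against the denominator of the ratio, so the sum reduces to $F + (1 - F)$ evaluated at $\theta$, which is why the result does not depend on the chosen parameter values.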