# Classical Least Squares Theory

1. This statement is true only when $x_t^*$ is a scalar. If $y$, $x_1$, and $x_2$ are scalar dichotomous random variables, we have $E(y \mid x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$, where the $\beta$'s are appropriately defined.

2. The $\beta$ that appears in (1.2.1) denotes the domain of the function $S$ and hence, strictly speaking, should be distinguished from the parameter $\beta$, which is unknown and yet can be regarded as fixed in value. To be precise, therefore, we should use two different symbols; however, we shall use the same symbol to avoid complicating the notation, so the reader must judge from the context which of the two meanings the symbol conveys.

3. Again, the warning of note 2 is in order here. The $\beta$ and $\sigma^2$ that appear in the likelihood function represent the domain of the function, whereas those that appear in Model 1 are unknown fixed values of the parameters, the so-called true values.

4. The Cramér-Rao lower bound is discussed, for example, by Cox and Hinkley (1974), Bickel and Doksum (1977), Rao (1973), and Zacks (1971), roughly in ascending order of sophistication.

5. Good references for Bayesian statistics are Zellner (1971) and Box and Tiao (1973). For a quick review, Lindley (1972) is useful.

6. Even in this situation some people prefer using the t test. The procedure is also asymptotically correct, but there is no theoretical reason for the t test to be superior to the standard normal test in nonnormal cases. The same remark applies to the F test discussed later. See Pearson and Please (1975) for a discussion of the properties of the t and F tests in nonnormal cases. See White and MacDonald (1980) for tests of nonnormality using the least squares residuals.

7. Chow (1960) considered the case where $T_2 < K^*$ and indicated how the subsequent analysis can be modified to incorporate this case.

8. This formula can easily be generalized to the situation where there are $n$ regimes ($n > 2$). Simply combine $n$ equations like (1.3.25) and (1.5.27) and calculate the sum of squared residuals from each equation. Then the numerator chi-square has $(n - 1)K^*$ degrees of freedom and the denominator chi-square has $\sum_{i=1}^{n} T_i - nK^*$ degrees of freedom.

9. These two tests can easily be generalized to the case of $n$ regimes mentioned in note 8.

10. If we have cross-section data, $p$ refers to a cross-section unit not included in the sample.

11. Theorem 1.6.1 is true even if we let $C$ depend on $p$. Then the two theorems are equivalent because any $d$ satisfying (1.6.5) can be written as $Cx_p$ for some $C$ such that $C'X = I$.
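The claim in note 1 that the conditional mean is exactly linear in $x_1$, $x_2$, and $x_1 x_2$ when the regressors are dichotomous can be checked numerically. The sketch below is only an illustration: the four cell means are made-up values, and the particular way the $\beta$'s are defined from them is one natural choice, not necessarily the one intended in the text.

```python
# Hypothetical conditional means E(y | x1, x2) for the four (x1, x2) cells.
# These numbers are illustrative assumptions, not taken from the text.
cell_means = {(0, 0): 0.2, (1, 0): 0.5, (0, 1): 0.3, (1, 1): 0.9}

# One way to define the coefficients so that the linear form is exact.
beta0 = cell_means[(0, 0)]
beta1 = cell_means[(1, 0)] - beta0
beta2 = cell_means[(0, 1)] - beta0
beta3 = cell_means[(1, 1)] - cell_means[(1, 0)] - cell_means[(0, 1)] + beta0


def linear_form(x1, x2):
    """beta0 + beta1*x1 + beta2*x2 + beta3*x1*x2 evaluated at (x1, x2)."""
    return beta0 + beta1 * x1 + beta2 * x2 + beta3 * x1 * x2


# Largest discrepancy between the linear form and the cell means.
max_gap = max(abs(linear_form(x1, x2) - m) for (x1, x2), m in cell_means.items())
print(max_gap)
```

Four free coefficients match four cell means, so the fit is exact for any choice of cell means; dropping the interaction term would leave only three coefficients and the linear form would in general no longer reproduce all four cells.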
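The degrees-of-freedom count in note 8 can be illustrated with a small computation. The sketch below assumes the simplest case, an intercept plus one regressor in every regime (so $K^* = 2$), and tests equality of all coefficients across regimes by comparing the pooled sum of squared residuals with the sum over separate regressions. The function names and the simulated data are hypothetical and chosen only for illustration.

```python
import random

K_STAR = 2  # assumed number of coefficients per regime (intercept and slope)


def ssr_simple(x, y):
    """Sum of squared residuals from OLS of y on an intercept and x."""
    t = len(x)
    x_bar = sum(x) / t
    y_bar = sum(y) / t
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def n_regime_f(xs, ys):
    """F statistic and degrees of freedom for equal coefficients in n regimes."""
    n = len(xs)
    total_t = sum(len(x) for x in xs)
    # Unrestricted: a separate regression in each regime.
    ssr_unrestricted = sum(ssr_simple(x, y) for x, y in zip(xs, ys))
    # Restricted: one regression on the pooled observations.
    pooled_x = [v for x in xs for v in x]
    pooled_y = [v for y in ys for v in y]
    ssr_restricted = ssr_simple(pooled_x, pooled_y)
    df_num = (n - 1) * K_STAR
    df_den = total_t - n * K_STAR
    f_stat = ((ssr_restricted - ssr_unrestricted) / df_num) / (ssr_unrestricted / df_den)
    return f_stat, df_num, df_den


# Simulated data for three regimes that share the same coefficients.
random.seed(0)
sizes = [30, 40, 50]
xs = [[random.gauss(0, 1) for _ in range(t)] for t in sizes]
ys = [[1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x] for x in xs]

f_stat, df_num, df_den = n_regime_f(xs, ys)
print(f_stat, df_num, df_den)
```

With three regimes of 30, 40, and 50 observations, the numerator has $(3 - 1) \cdot 2 = 4$ degrees of freedom and the denominator has $120 - 3 \cdot 2 = 114$. The sketch requires every regime to have more observations than coefficients; the case discussed in note 7 is not handled.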
