# Asymptotic Properties of Extremum Estimators

1. In the proof of Theorem 4.1.1, continuity of $Q_T(\theta)$ is used only to imply continuity of $Q(\theta)$ and to ensure the measurability of $\hat{\theta}_T$. Therefore we can modify this theorem in such a way that we assume continuity of $Q(\theta)$ but not of $Q_T(\theta)$ and define convergence in probability in a more general way that does not require measurability of $\hat{\theta}_T$. This is done in Theorem 9.6.1.

Also note that the proof of Theorem 4.1.1 can easily be modified to show that if the convergence in the sense of (i) holds in Assumption C, $\hat{\theta}_T$ converges to $\theta_0$ almost surely.

2. Strictly speaking, (4.1.10) is defined only for those $T$'s for which $\Theta_T$ is nonempty. However, the probability that $\Theta_T$ is nonempty approaches 1 as $T$ goes to infinity because of Assumption C of Theorem 4.1.2. As an aside, Jennrich (1969) proved that $\theta^*$ is a measurable function.

3. This proof is patterned after the proof of a similar theorem by Jennrich (1969).

4. Note that if $z_T$ is merely defined as a random variable with mean $\mu(\theta)$ and variance-covariance matrix $\Sigma(\theta)$, the minimization of the quadratic form does not even yield a consistent estimator of $\theta$.

5. The term second-order efficiency is sometimes used to denote a related but different concept (see Pfanzagl, 1973).

6. If the error term is multiplicative as in $Q = \beta_1 K^{\beta_2} L^{\beta_3} e^{u}$, the log transformation reduces it to a linear regression model. See Bodkin and Klein (1967) for the estimation of both models.
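As a short worked step for note 6, taking logarithms of a production function with a multiplicative error gives an equation that is linear in the transformed parameters. The symbols $\beta_1, \beta_2, \beta_3$ and the shorthand $\alpha = \log \beta_1$ are notation assumed here for illustration:

$$
\log Q = \log \beta_1 + \beta_2 \log K + \beta_3 \log L + u
       = \alpha + \beta_2 \log K + \beta_3 \log L + u .
$$

With an additive error, $Q = \beta_1 K^{\beta_2} L^{\beta_3} + u$, the same transformation does not separate $u$ from the regressors, so the model remains nonlinear in the parameters.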

7. The methods of proof used in this section and Section 4.3.3 are similar to those of Jennrich (1969).

8. Because a function continuous on a compact set is uniformly continuous, we can assume without loss of generality that $f_t(\beta)$ is uniformly continuous in $\beta \in N$.

9. The nonsingularity is not needed here but is assumed as it will be needed later.

10. Note that in the special case where the constraint $h(\beta) = 0$ is linear and can be written as $Q'\beta = c$, (4.5.21) is similar to (4.3.32). We cannot unequivocally determine whether the chi-square approximation of the distribution of (4.5.21) is better or worse than the $F$ approximation of the distribution of (4.3.32).

11. If the sample is (1, 2, 3), 2 is the unique median. If the sample is (1, 2, 3, 4), any point in the closed interval [2, 3] may be defined as a median. The definition (4.6.3) picks 2 as the median. If $f(x) > 0$ in the neighborhood of $x = M$, this ambiguity vanishes as the sample size approaches infinity.
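The two samples in note 11 can be checked with a minimal Python sketch. It assumes that the rule in question amounts to taking the smallest sample point at which the empirical distribution function reaches 1/2; the exact wording of (4.6.3) is not reproduced here, and the function names `lower_median` and `median_interval` are hypothetical.

```python
import math

def lower_median(sample):
    """Smallest sample point whose empirical CDF value is at least 1/2.

    Assumes a non-empty sample of real numbers.
    """
    ordered = sorted(sample)
    k = math.ceil(len(ordered) / 2)  # order statistic at which the CDF reaches 1/2
    return ordered[k - 1]

def median_interval(sample):
    """Closed interval of all points that qualify as a median."""
    ordered = sorted(sample)
    n = len(ordered)
    return ordered[(n - 1) // 2], ordered[n // 2]

print(lower_median([1, 2, 3]), median_interval([1, 2, 3]))
print(lower_median([1, 2, 3, 4]), median_interval([1, 2, 3, 4]))
```

For an odd sample size the interval collapses to a single point; for an even one the sketch returns its left endpoint, which is also what `statistics.median_low` in the Python standard library does.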

12. The second term of the right-hand side of (4.6.10) does not affect the minimization but is added so that $\operatorname{plim} T^{-1} S_T$ can be evaluated without assuming the existence of the first absolute moment of $Y_t$. This idea originates in Huber (1965).
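The point of note 12 can be illustrated numerically. The sketch below assumes, purely for illustration, that the subtracted term is a sum of absolute deviations from a fixed constant `c`; the exact form of (4.6.10) is not given in these notes, and the sample, the constant, and the function names are all hypothetical.

```python
def abs_dev(b, ys):
    """Sum of absolute deviations of the sample ys from the point b."""
    return sum(abs(y - b) for y in ys)

def centered_abs_dev(b, ys, c):
    """abs_dev shifted by a term that depends on c but not on b (assumed form)."""
    return abs_dev(b, ys) - abs_dev(c, ys)

# Hypothetical heavy-tailed integer sample; integers keep the arithmetic exact.
ys = [-1_000_000, -3, -1, 0, 2, 4, 1_000_000]
c = 1  # arbitrary fixed centering constant (an assumption, not from the text)

# A minimizer of the sum of absolute deviations can be found among the sample points.
argmin_plain = min(ys, key=lambda b: abs_dev(b, ys))
argmin_centered = min(ys, key=lambda b: centered_abs_dev(b, ys, c))

# Each centered summand |y - b| - |y - c| is bounded in absolute value by |b - c|
# (triangle inequality), however extreme y is.
b = 3
largest_term = max(abs(abs(y - b) - abs(y - c)) for y in ys)

print(argmin_plain, argmin_centered, largest_term, abs(b - c))
```

Because the subtracted term is constant in `b`, both objectives share the same minimizer, while the bound on each summand is what lets the average be handled without a first-moment assumption.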

13. Alternative methods of proof of asymptotic normality can be found in Bassett and Koenker (1978) and in Amemiya (1982a).