Recent Developments in Regression Analysis
1. A much more detailed account of this topic can be found in Amemiya (1980a).
2. We use the term estimator here, but all the definitions and the results of this subsection remain the same if we interpret d as any decision function mapping Y into Θ.
3. We assume that the losses do not depend on the parameters of the models. Otherwise, the choice of models and the estimation of the parameters cannot be sequentially analyzed, which would immensely complicate the problem. However, we do not claim that our assumption is realistic. We adopt this simplification for the purpose of illustrating certain basic ideas.
4. For ways to get around this problem, see, for example, Akaike (1976) and Schwarz (1978).
5. See Thisted (1976) for an excellent survey of this topic, as well as for some original results. More recent surveys are given by Draper and Van Nostrand (1979) and Judge and Bock (1983).
6. The matrix $H_1 \Lambda_1^{-1} H_1'$ is sometimes referred to as the Moore-Penrose generalized inverse of $X'X$ and is denoted by $(X'X)^{+}$. See Rao (1973, p. 24) for more discussion of generalized inverses.
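The construction in note 6 can be checked numerically. The following is a minimal sketch, assuming that H_1 holds the eigenvectors of X'X associated with its nonzero eigenvalues and that Λ_1 is the diagonal matrix of those eigenvalues. The function name `pinv_from_eigen`, the tolerance `tol`, and the simulated design matrix are illustrative assumptions, not part of the original text.

```python
import numpy as np


def pinv_from_eigen(A, tol=1e-10):
    """Generalized inverse of a symmetric PSD matrix A built as H1 * inv(L1) * H1'.

    `tol` is an assumed relative cutoff for treating an eigenvalue as zero.
    """
    eigvals, eigvecs = np.linalg.eigh(A)      # A = H diag(eigvals) H'
    keep = eigvals > tol * eigvals.max()      # select the nonzero eigenvalues
    H1 = eigvecs[:, keep]                     # eigenvectors for nonzero eigenvalues
    lam1 = eigvals[keep]
    # Dividing column j of H1 by lam1[j] gives H1 @ inv(diag(lam1)).
    return (H1 / lam1) @ H1.T


# Hypothetical rank-deficient design: the fourth column is the sum of the first two.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
X = np.column_stack([X, X[:, 0] + X[:, 1]])
XtX = X.T @ X

G = pinv_from_eigen(XtX)

# Largest absolute discrepancy from NumPy's SVD-based pseudoinverse.
max_diff = np.abs(G - np.linalg.pinv(XtX, rcond=1e-10)).max()
```

Here `G` should satisfy the defining Penrose conditions, for example `XtX @ G @ XtX` reproducing `XtX` up to rounding error, and `max_diff` should be negligible. The eigenvalue cutoff is a numerical convenience; in exact arithmetic the zero eigenvalues are simply dropped.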
7. This question was originally posed and solved by Silvey (1969).
8. Although Sclove considers the random coefficients model and the prediction (rather than estimation) of the regression vector, there is no essential difference between his approach and Bayesian estimation.
9. What follows simplifies the derivation of Sclove et al. (1972).
10. See Section 4.6.1. In Chapter 3 we shall discuss large sample theory and make the meaning of the term asymptotically more precise. For the time being, the reader should simply interpret it to mean “approximately when T is large.”
11. See the definition of probability limit in Chapter 3. Loosely speaking, the statement means that when T is large, s is close to s0 with probability close to 1.