# General Parametric Heteroscedasticity

In this subsection we shall assume $\sigma_t^2 = g_t(\alpha, \beta_1)$ without specifying $g$, where $\beta_1$ is a subset (possibly the whole) of the regression parameters $\beta$ and $\alpha$ is another vector of parameters unrelated to $\beta$. In applications it is often assumed that $g_t(\alpha, \beta_1) = g(\alpha, x_t'\beta)$. The estimation of $\alpha$ and $\beta$ can be done in several steps.

In the first step, we obtain the LS estimator of $\beta$, denoted $\hat\beta$. In the second step, $\alpha$ and $\beta_1$ can be estimated by minimizing $\sum_t [\hat u_t^2 - g_t(\alpha, \beta_1)]^2$, where $\hat u_t = y_t - x_t'\hat\beta$. The consistency and the asymptotic normality of the resulting estimators, denoted $\tilde\alpha$ and $\tilde\beta_1$, have been proved by Jobson and Fuller (1980). In the third step we have two main options: FGLS using $g_t(\tilde\alpha, \tilde\beta_1)$, or MLE under normality using $\tilde\alpha$ and $\hat\beta$ as the initial estimates in some iterative algorithm.⁶ Carroll and Ruppert (1982a) proved that under general assumptions FGLS has the same asymptotic distribution as GLS. Jobson and Fuller derived simple formulae for the method of scoring and proved that the estimator of $\beta$ obtained at the second iteration is asymptotically efficient (and is asymptotically more efficient than GLS or FGLS). Carroll and Ruppert (1982b) have pointed out, however, that GLS or FGLS is more robust against a misspecification in the $g$ function. Carroll and Ruppert (1982a) have proposed a robust version of FGLS (cf. Section 2.3) in an attempt to make it robust also against nonnormality.
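The three steps can be sketched numerically as follows. This is only an illustration under stated assumptions, not the procedure of Jobson and Fuller: the variance function is taken to be $\sigma_t^2 = \alpha (x_t'\beta)^2$, and the second step is simplified by holding $\beta$ at its LS value so that only the scalar $\alpha$ is estimated, in closed form. The function name `three_step_fgls`, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np

def three_step_fgls(y, X):
    """Illustrative three-step FGLS, assuming sigma_t^2 = alpha * (x_t' beta)^2.

    Simplification: step 2 holds beta fixed at the LS estimate and fits alpha only.
    Requires x_t' beta_ls to be bounded away from zero.
    """
    # Step 1: LS estimator of beta and squared LS residuals.
    beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ beta_ls) ** 2

    # Step 2 (simplified): minimize sum_t [u2_t - alpha * m2_t]^2 over alpha,
    # where m2_t = (x_t' beta_ls)^2. This is LS through the origin.
    m2 = (X @ beta_ls) ** 2
    alpha = float(m2 @ u2 / (m2 @ m2))

    # Step 3: FGLS, i.e. LS after dividing each observation by its
    # estimated standard deviation.
    sd = np.sqrt(alpha * m2)
    beta_fgls, *_ = np.linalg.lstsq(X / sd[:, None], y / sd, rcond=None)
    return beta_ls, alpha, beta_fgls

# Simulated example (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack((np.ones(n), x))
beta_true = np.array([1.0, 2.0])
mean = X @ beta_true
y = mean + 0.3 * mean * rng.normal(size=n)  # sd proportional to mean, alpha = 0.09

beta_ls, alpha_hat, beta_fgls = three_step_fgls(y, X)
print("LS:  ", beta_ls)
print("alpha:", alpha_hat)
print("FGLS:", beta_fgls)
```

A full implementation of the second step would minimize over $\alpha$ and $\beta_1$ jointly with a nonlinear least squares routine; the closed-form shortcut here is used only to keep the sketch short.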

There are situations where FGLS has the same asymptotic distribution as MLE. Amemiya (1973b), whose major goal lay in another area, compared the asymptotic efficiency of FGLS vis-à-vis MLE in cases where $y_t$ has mean $x_t'\beta$ and variance $\eta^2 (x_t'\beta)^2$ and follows one of three distributions: (1) normal, (2) lognormal, and (3) gamma. This is the form of heteroscedasticity suggested by Prais and Houthakker (1955). It was shown that FGLS is asymptotically efficient if $y_t$ has a gamma distribution. Thus this is an example of the BAN estimators mentioned in Section 4.2.4.

This last result is a special case of the following more general result attributable to Nelder and Wedderburn (1972). Let $\{y_t\}$ be independently distributed with the density

$$f(y_t) = \exp\{a(\lambda)[g(\theta_t) y_t - h(\theta_t) + k(\lambda, y_t)]\}, \tag{6.5.17}$$

where $a(\lambda)$ and $\theta_t = q(x_t, \beta)$ are scalars and $\lambda$, $x_t$, and $\beta$ are vectors. It is assumed that $h'(\theta_t) = \theta_t g'(\theta_t)$ for every $t$, which implies that $Ey_t = \theta_t$ and $Vy_t = [a(\lambda) g'(\theta_t)]^{-1}$. Then, in the estimation of $\beta$, the method of scoring is identical to the Gauss-Newton nonlinear weighted least squares iteration (cf. Section 4.4.3). The binomial distribution is a special case of (6.5.17). To see this, define the binary variable $y_t$ that takes unity with probability $\theta_t$ and zero with probability $1 - \theta_t$, and put $a(\lambda) = 1$, $g(\theta_t) = \log \theta_t - \log(1 - \theta_t)$, $h(\theta_t) = -\log(1 - \theta_t)$, and $k(\lambda, y_t) = 0$. The normal distribution is also a special case: Take $y_t \sim N(\theta_t, \lambda^2)$, $a(\lambda) = \lambda^{-2}$, $g(\theta_t) = \theta_t$, $h(\theta_t) = \theta_t^2/2$, and $k(\lambda, y_t) = -2^{-1}\lambda^2 \log(2\pi\lambda^2) - y_t^2/2$. The special case of Amemiya (1973b) is obtained by putting $a(\lambda) = -\lambda$, $g(\theta_t) = \theta_t^{-1}$, $h(\theta_t) = -\log \theta_t$, and $k(\lambda, y_t) = \lambda^{-1} \log \Gamma(\lambda) - \log \lambda + \lambda^{-1}(1 - \lambda) \log y_t$.
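The equivalence between scoring and Gauss-Newton weighted least squares can be checked numerically in the binomial case. The sketch below assumes a logistic form for $q(x_t, \beta)$, which is a choice made here for illustration. The scoring step is built from the ingredients of (6.5.17), using the score $a(\lambda) \sum_t g'(\theta_t)(y_t - \theta_t)\,\partial\theta_t/\partial\beta$ that follows from $h'(\theta_t) = \theta_t g'(\theta_t)$. The Gauss-Newton step is computed separately as a weighted least squares regression of the residuals on the Jacobian with weights $1/Vy_t$. The function names and the simulated data are hypothetical.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def scoring_step(beta, y, X):
    """One method-of-scoring step for the binomial case, with a(lambda) = 1."""
    theta = logistic(X @ beta)                     # theta_t = q(x_t, beta), assumed logistic
    dtheta = (theta * (1.0 - theta))[:, None] * X  # d theta_t / d beta'
    g_prime = 1.0 / (theta * (1.0 - theta))        # g'(theta) for g = log(theta / (1 - theta))
    score = dtheta.T @ (g_prime * (y - theta))
    info = dtheta.T @ (g_prime[:, None] * dtheta)  # expected information matrix
    return beta + np.linalg.solve(info, score)

def weighted_gauss_newton_step(beta, y, X):
    """One Gauss-Newton step for min sum_t w_t (y_t - q(x_t, beta))^2, w_t = 1 / V(y_t)."""
    theta = logistic(X @ beta)
    jac = (theta * (1.0 - theta))[:, None] * X     # Jacobian of q with respect to beta
    w = 1.0 / (theta * (1.0 - theta))              # inverse Bernoulli variance at current beta
    sw = np.sqrt(w)
    # Weighted LS of residuals on the Jacobian, solved on the sqrt-weighted system.
    step, *_ = np.linalg.lstsq(sw[:, None] * jac, sw * (y - theta), rcond=None)
    return beta + step

# Simulated binary data (values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack((np.ones(n), rng.normal(size=n)))
beta_true = np.array([0.5, -1.0])
y = (rng.uniform(size=n) < logistic(X @ beta_true)).astype(float)

beta0 = np.zeros(2)
b_scoring = scoring_step(beta0, y, X)
b_gn = weighted_gauss_newton_step(beta0, y, X)
print(np.allclose(b_scoring, b_gn))
```

The two steps agree because the weights $1/Vy_t = a(\lambda) g'(\theta_t)$ are exactly the factors that appear in the score and the information matrix. The weights are held fixed at the current iterate within each step.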

There is a series of articles that discuss tests of homoscedasticity against heteroscedasticity of the form $\sigma_t^2 = g(\alpha, x_t'\beta)$, where $\alpha$ is a scalar such that $g(0, x_t'\beta)$ does not depend on $t$. Thus the null hypothesis is stated as $\alpha = 0$. Anscombe (1961) was the first to propose such a test, which was later modified by Bickel (1978). Hammerstrom (1981) proved that Bickel's test is locally uniformly most powerful under normality.