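# Box-Cox Transformation

Box and Cox (1964) proposed the model

$$z_t(\lambda) = x_t'\beta + u_t, \tag{8.1.12}$$

where, for $y_t > 0$,

$$z_t(\lambda) = \begin{cases} \dfrac{y_t^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[1ex] \log y_t & \text{if } \lambda = 0. \end{cases} \tag{8.1.13}$$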

Note that because $\lim_{\lambda \to 0}(y_t^{\lambda} - 1)/\lambda = \log y_t$, $z_t(\lambda)$ is continuous at $\lambda = 0$. It is assumed that $\{u_t\}$ are i.i.d. with $Eu_t = 0$ and $Vu_t = \sigma^2$.
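The continuity at $\lambda = 0$ can be checked numerically. The following is a minimal sketch, not part of the original text; the function name `box_cox` and the tolerance `tol` are illustrative assumptions.

```python
import math

def box_cox(y: float, lam: float, tol: float = 1e-12) -> float:
    """Box-Cox transform z(lam) of a single positive observation y."""
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if abs(lam) < tol:
        return math.log(y)  # limiting case lam = 0
    # expm1 keeps precision when lam * log(y) is close to zero
    return math.expm1(lam * math.log(y)) / lam

# Continuity at lam = 0: the transform approaches log(y) as lam shrinks.
y = 2.5
for lam in (1.0, 0.1, 1e-3, 1e-6, 0.0):
    print(f"lam={lam:g}  z={box_cox(y, lam):.8f}  log(y)={math.log(y):.8f}")
```

Writing $(y^{\lambda} - 1)/\lambda$ as `expm1(lam * log(y)) / lam` is a numerical convenience chosen for this sketch; it avoids cancellation error for very small $\lambda$.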

The transformation $z_t(\lambda)$ is attractive because it contains $y_t$ and $\log y_t$ as special cases, and therefore the choice between $y_t$ and $\log y_t$ can be made within the framework of classical statistical inference.

Box and Cox proposed estimating $\lambda$, $\beta$, and $\sigma^2$ by the method of maximum likelihood assuming the normality of $u_t$. However, $u_t$ cannot be normally distributed unless $\lambda = 0$ because $z_t(\lambda)$ is subject to the following bounds:

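$$z_t(\lambda) \geq -\frac{1}{\lambda} \quad \text{if } \lambda > 0, \qquad z_t(\lambda) \leq -\frac{1}{\lambda} \quad \text{if } \lambda < 0. \tag{8.1.14}$$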

Later we shall discuss the implications of these bounds for the properties of the Box-Cox MLE.
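The bounds follow directly from the restriction $y_t > 0$. As a short worked step (a sketch added for clarity, not part of the original derivation): because $y_t^{\lambda} > 0$ for every real $\lambda$,

$$y_t^{\lambda} - 1 > -1 \quad\Longrightarrow\quad \frac{y_t^{\lambda} - 1}{\lambda} > -\frac{1}{\lambda} \ \ (\lambda > 0), \qquad \frac{y_t^{\lambda} - 1}{\lambda} < -\frac{1}{\lambda} \ \ (\lambda < 0),$$

where the inequality reverses for $\lambda < 0$ because of division by a negative number. Since $u_t = z_t(\lambda) - x_t'\beta$, the error is bounded on one side whenever $\lambda \neq 0$, whereas a normal random variable has unbounded support in both directions.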

This topic is relevant here because we can apply NL2S to this model with only a slight modification. To accommodate a model such as (8.1.12), we generalize (8.1.1) as

$$f(y_t, Y_t, X_{1t}, \alpha) = u_t. \tag{8.1.15}$$

Then all the results of the previous section are valid with only an apparent modification. The minimand (8.1.2), which defines the class of NL2S estimators, should now be written as

$$f'W(W'W)^{-1}W'f, \tag{8.1.16}$$

and the conclusions of Theorems 8.1.1 and 8.1.2 remain intact.

Define the NL2S estimator (actually, a class of estimators) of $\alpha = (\lambda, \beta')'$, denoted $\hat\alpha$, in the model (8.1.12) as the value of $\alpha$ that minimizes

$$[z(\lambda) - X\beta]'W(W'W)^{-1}W'[z(\lambda) - X\beta], \tag{8.1.17}$$

where $z(\lambda)$ is the vector the $t$th element of which is $z_t(\lambda)$. The discussion about the choice of $W$ given in the preceding section applies here as well. One practical choice of $W$ would be to use the $x$'s and their powers and cross products. Using arguments similar to the ones given in the preceding section, we can prove the consistency and the asymptotic normality of the estimator. The asymptotic covariance matrix of $\sqrt{T}(\hat\alpha - \alpha)$ is given by

$$V(\hat\alpha) = \sigma^2 \operatorname{plim}\, T \left[ \begin{pmatrix} \partial z'/\partial\lambda \\ -X' \end{pmatrix} W(W'W)^{-1}W' \left( \frac{\partial z}{\partial\lambda},\; -X \right) \right]^{-1}. \tag{8.1.18}$$

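The NL2S minimand (8.1.17) can be sketched in code. The following is an illustration under stated assumptions, not the procedure of any cited author: $\beta$ is concentrated out for each trial $\lambda$ (the minimand is quadratic in $\beta$ for fixed $\lambda$), and $\lambda$ is found by a search over a bounded grid. The function names, the grid, and the noise-free data are all hypothetical.

```python
import numpy as np

def box_cox(y, lam, tol=1e-12):
    """Vectorized Box-Cox transform of a positive array y."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < tol:
        return np.log(y)
    return np.expm1(lam * np.log(y)) / lam

def nl2s_objective(lam, y, X, W):
    """Minimand (8.1.17) with beta concentrated out for a given lam.

    Returns (objective value, beta that minimizes it at this lam).
    """
    z = box_cox(y, lam)
    # Project z and X onto the column space of the instruments W.
    Pz = W @ np.linalg.lstsq(W, z, rcond=None)[0]
    PX = W @ np.linalg.lstsq(W, X, rcond=None)[0]
    # For fixed lam the minimand is quadratic in beta, so beta solves
    # a linear least-squares problem in the projected variables.
    beta = np.linalg.lstsq(PX, Pz, rcond=None)[0]
    resid = Pz - PX @ beta
    return float(resid @ resid), beta

def nl2s_grid(y, X, W, grid):
    """Grid search over lam; returns (lam_hat, beta_hat)."""
    values = [nl2s_objective(lam, y, X, W)[0] for lam in grid]
    lam_hat = float(grid[int(np.argmin(values))])
    return lam_hat, nl2s_objective(lam_hat, y, X, W)[1]

# Hypothetical noise-free data with true lam = 0.5 and beta = (1, 2),
# used only as a sanity check: the minimand is zero at the true values.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 3.0, size=200)
X = np.column_stack([np.ones_like(x), x])
W = np.column_stack([np.ones_like(x), x, x**2])  # x's and their powers
y = (1.0 + 0.5 * (X @ np.array([1.0, 2.0]))) ** (1.0 / 0.5)
lam_hat, beta_hat = nl2s_grid(y, X, W, np.linspace(-1.0, 2.0, 31))
print(lam_hat, beta_hat)
```

Here `W` follows the practical suggestion in the text of using the $x$'s and their powers. A grid search is chosen only for transparency; a derivative-based optimizer could replace it.
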
The Box-Cox maximum likelihood estimator is defined as the pseudo MLE obtained under the assumption that $\{u_t\}$ are normally distributed. In other words, the Box-Cox MLE $\hat\theta$ of $\theta = (\lambda, \beta', \sigma^2)'$ maximizes

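$$\log L = -\frac{T}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{T}[z_t(\lambda) - \beta'x_t]^2 + (\lambda - 1)\sum_{t=1}^{T}\log y_t. \tag{8.1.19}$$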
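The maximand (8.1.19) can be evaluated as follows. This is a minimal sketch under assumptions: for a given $\lambda$ the maximizing $\beta$ and $\sigma^2$ are taken to be the OLS coefficients and the mean squared residual from regressing $z(\lambda)$ on $X$, and $\lambda$ is then chosen on a grid. The function names and the simulated data (generated with $\lambda = 0$, the one case in which normal errors are admissible) are hypothetical.

```python
import numpy as np

def box_cox(y, lam, tol=1e-12):
    """Vectorized Box-Cox transform of a positive array y."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < tol:
        return np.log(y)
    return np.expm1(lam * np.log(y)) / lam

def log_lik(lam, beta, sigma2, y, X):
    """Maximand (8.1.19); constants free of the parameters are omitted."""
    T = len(y)
    resid = box_cox(y, lam) - X @ beta
    return (-0.5 * T * np.log(sigma2)
            - 0.5 * (resid @ resid) / sigma2
            + (lam - 1.0) * np.sum(np.log(y)))

def concentrated_log_lik(lam, y, X):
    """(8.1.19) maximized over beta and sigma2 for a given lam."""
    z = box_cox(y, lam)
    beta = np.linalg.lstsq(X, z, rcond=None)[0]  # OLS of z(lam) on X
    resid = z - X @ beta
    sigma2 = (resid @ resid) / len(y)
    return log_lik(lam, beta, sigma2, y, X), beta, sigma2

def box_cox_mle(y, X, grid):
    """Grid search over lam; returns (lam_hat, beta_hat, sigma2_hat)."""
    values = [concentrated_log_lik(lam, y, X)[0] for lam in grid]
    lam_hat = float(grid[int(np.argmax(values))])
    _, beta_hat, sigma2_hat = concentrated_log_lik(lam_hat, y, X)
    return lam_hat, beta_hat, sigma2_hat

# Hypothetical data from the log-linear case (true lam = 0).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0, size=300)
X = np.column_stack([np.ones_like(x), x])
y = np.exp(X @ np.array([1.0, 0.5]) + 0.2 * rng.standard_normal(300))
print(box_cox_mle(y, X, np.linspace(-1.0, 1.0, 41)))
```

The term $(\lambda - 1)\sum_t \log y_t$ is what distinguishes this maximand from a least-squares criterion in $z_t(\lambda)$; dropping it would change the estimator.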

Because $u_t$ cannot in reality be normally distributed, (8.1.19) should be regarded simply as a maximand that defines an extremum estimator rather than a proper logarithmic likelihood function. This fact has two implications:

1. We cannot appeal to the theorem about the consistency of the maximum likelihood estimator; instead, we must evaluate the probability limit as the value of the parameter at which $\lim T^{-1} E \log L$ is maximized.

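2. The asymptotic covariance matrix of $\sqrt{T}(\hat\theta - \theta)$ is not equal to the usual expression $-\lim T[E\,\partial^2 \log L/\partial\theta\,\partial\theta']^{-1}$ but is given by

$$\lim T \left[ E \frac{\partial^2 \log L}{\partial\theta\,\partial\theta'} \right]^{-1} E \left[ \frac{\partial \log L}{\partial\theta} \frac{\partial \log L}{\partial\theta'} \right] \left[ E \frac{\partial^2 \log L}{\partial\theta\,\partial\theta'} \right]^{-1}, \tag{8.1.20}$$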

where the derivatives are evaluated at plim $\hat\theta$.

Thus, to derive the probability limit and the asymptotic covariance matrix of $\hat\theta$, we must evaluate $E \log L$ and the expectations that appear in (8.1.20) by using the true distribution of $u_t$.
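The structure of (8.1.20) can be illustrated with a sample analogue. This sketch is an assumption-laden illustration rather than a formula from the text: expectations are replaced by sample sums, observations are assumed independent, and the names `sandwich_cov`, `scores`, and `hessian` are hypothetical.

```python
import numpy as np

def sandwich_cov(scores, hessian):
    """Sample analogue of (8.1.20).

    scores  : (T, k) array, row t is the gradient of the t-th term of log L
    hessian : (k, k) array, second-derivative matrix of log L
    Both are assumed to be evaluated at the estimate.
    """
    scores = np.asarray(scores, dtype=float)
    T = scores.shape[0]
    H_inv = np.linalg.inv(np.asarray(hessian, dtype=float))
    outer = scores.T @ scores  # sum over t of s_t s_t'
    return T * (H_inv @ outer @ H_inv)

# When the outer product of the scores equals minus the Hessian (the
# information equality of a correctly specified likelihood), the sandwich
# collapses to the usual expression -T * inverse(Hessian).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
S = np.linalg.cholesky(A).T  # chosen so that S'S = A
print(np.allclose(sandwich_cov(S, -A), 2 * np.linalg.inv(A)))
```

For the Box-Cox pseudo likelihood the information equality generally fails, which is why the two outer factors and the middle factor in (8.1.20) do not cancel.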

Various authors have studied the properties of the Box-Cox MLE under various assumptions about the true distribution of $u_t$. Draper and Cox (1969) derived an asymptotic expansion for $E\,\partial \log L/\partial\lambda$ in the model without regressors and showed that it is nearly equal to 0 (indicating the approximate consistency of the Box-Cox MLE $\hat\lambda$) if the skewness of $z_t(\lambda)$ is small. Zarembka (1974) used a similar analysis and showed that $E\,\partial \log L/\partial\lambda \neq 0$ if $\{u_t\}$ are heteroscedastic. Hinkley (1975) proposed for the model without regressors an interesting simple estimator of $\lambda$, which does not require the normality assumption; he compared it with the Box-Cox MLE $\hat\lambda$ under the assumption that $y_t$ is truncated normal (this assumption was also suggested by Poirier, 1978). Hinkley showed by a Monte Carlo study that the Box-Cox MLE $\hat\lambda$ and $\hat\beta$ perform generally well under this assumption. Amemiya and Powell (1981) showed analytically that if $y_t$ is truncated normal, plim $\hat\lambda < \lambda$ if $\lambda > 0$ and plim $\hat\lambda > \lambda$ if $\lambda < 0$.

Amemiya and Powell (1981) also considered the case where $y_t$ follows a two-parameter gamma distribution. This assumption actually alters the model because it introduces a natural heteroscedasticity in $\{u_t\}$. In this case, the formula (8.1.18) is no longer valid and the correct formula is given by

$$V(\hat\alpha) = \operatorname{plim}\, T(Z'P_WZ)^{-1}Z'P_WDP_WZ(Z'P_WZ)^{-1},$$

where $P_W = W(W'W)^{-1}W'$, $Z = (\partial z/\partial\lambda, -X)$, and $D = Euu'$. Amemiya and Powell considered a simple regression model with one independent variable and showed that NL2S dominates the Box-Cox MLE for most of the plausible parameter values considered, not only because the inconsistency of the Box-Cox MLE is significant but also because its asymptotic variance is much larger.
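As a consistency check on the heteroscedastic covariance formula, consider the homoscedastic case $D = \sigma^2 I$. Taking $P_W = W(W'W)^{-1}W'$ (a symmetric idempotent projection, so $P_WP_W = P_W$), a short worked step gives

$$(Z'P_WZ)^{-1}Z'P_W(\sigma^2 I)P_WZ(Z'P_WZ)^{-1} = \sigma^2 (Z'P_WZ)^{-1}Z'P_WZ(Z'P_WZ)^{-1} = \sigma^2 (Z'P_WZ)^{-1},$$

so that $V(\hat\alpha) = \sigma^2 \operatorname{plim}\, T(Z'P_WZ)^{-1}$, which is the homoscedastic formula (8.1.18). The general formula therefore differs from (8.1.18) only to the extent that $D$ departs from a scalar multiple of the identity.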

There have been numerous econometric applications of the Box-Cox MLE. In the first such study, Zarembka (1968) applied the transformation to the study of demand for money. Zarembka's work was followed up by studies by White (1972) and Spitzer (1976). For applications in other areas of economics, see articles by Heckman and Polachek (1974) (earnings-schooling relationship); by Benus, Kmenta, and Shapiro (1976) (food expenditure analysis); and by Ehrlich (1977) (crime rate).

All of these studies use the Box-Cox MLE and none uses NL2S. In most of these studies, the asymptotic covariance matrix for $\hat\beta$ is obtained as if $z_t(\hat\lambda)$ were the dependent variable of the regression, a practice that can lead to gross errors. In many of the preceding studies, independent variables as well as the dependent variable are transformed by the Box-Cox transformation but possibly with different parameters.