# The Cox procedure

This procedure focuses on the loglikelihood ratio statistic which, for the regression models above, is given by (using the notation of Section 2):

$$LR_{fg} = L_f(\hat{\theta}_T) - L_g(\hat{\gamma}_T) = \frac{T}{2}\ln\left(\frac{\hat{\omega}_T^2}{\hat{\sigma}_T^2}\right),$$

where

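$$\hat{\sigma}_T^2 = T^{-1}e_f'e_f, \quad \hat{\alpha}_T = (X'X)^{-1}X'y,$$

$$e_f = y - X\hat{\alpha}_T = M_x y, \quad M_x = I_T - X(X'X)^{-1}X', \tag{13.26}$$

and

$$\hat{\omega}_T^2 = T^{-1}e_g'e_g, \quad \hat{\beta}_T = (Z'Z)^{-1}Z'y,$$

$$e_g = y - Z\hat{\beta}_T = M_z y, \quad M_z = I_T - Z(Z'Z)^{-1}Z'. \tag{13.27}$$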
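The quantities in (13.26) and (13.27) are simple to compute from OLS residuals. The following is a minimal Python sketch, assuming NumPy is available; the function names and the simulated data are hypothetical and serve only to illustrate the formulas.

```python
import numpy as np

def ols_residuals(W, y):
    """Residuals M_w y from an OLS regression of y on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    return y - W @ coef

def lr_fg(X, Z, y):
    """Loglikelihood ratio LR_fg = (T/2) * ln(omega_hat^2 / sigma_hat^2)."""
    T = y.shape[0]
    e_f = ols_residuals(X, y)      # e_f = M_x y
    e_g = ols_residuals(Z, y)      # e_g = M_z y
    sigma2_hat = e_f @ e_f / T     # ML variance estimate under H_f
    omega2_hat = e_g @ e_g / T     # ML variance estimate under H_g
    return 0.5 * T * np.log(omega2_hat / sigma2_hat)

# Hypothetical data generated under H_f (y depends on X, not on Z)
rng = np.random.default_rng(0)
T = 500
X = rng.normal(size=(T, 2))
Z = 0.5 * X[:, :1] + rng.normal(size=(T, 2))   # Z is correlated with X but does not span it
y = X @ np.array([1.0, -0.5]) + rng.normal(size=T)

lr = lr_fg(X, Z, y)
```

Because the maximized Gaussian loglikelihood of each model depends on the data only through its residual variance, the ratio reduces to a comparison of $\hat{\omega}_T^2$ and $\hat{\sigma}_T^2$.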

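In the general case where the regression models are nonnested, the average loglikelihood ratio statistic, $T^{-1}LR_{fg} = \frac{1}{2}\ln(\hat{\omega}_T^2/\hat{\sigma}_T^2)$, does not converge to zero however large $T$ is. For example, under $H_f$ we have

$$\operatorname{plim}_{T\to\infty}\left(T^{-1}LR_{fg} \mid H_f\right) = \frac{1}{2}\ln\left(\frac{\sigma_0^2 + \alpha_0'\Sigma_f\alpha_0}{\sigma_0^2}\right),$$

and under $H_g$:

$$\operatorname{plim}_{T\to\infty}\left(T^{-1}LR_{fg} \mid H_g\right) = \frac{1}{2}\ln\left(\frac{\omega_0^2}{\omega_0^2 + \beta_0'\Sigma_g\beta_0}\right),$$

where a subscript 0 denotes the true value of a parameter under the hypothesis in question.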
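A small simulation can illustrate that the average loglikelihood ratio settles at a nonzero value under $H_f$. The sketch below assumes the usual definition $\Sigma_f = \Sigma_{xx} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ (an assumption here, since Section 2 is not reproduced) and a hypothetical design in which the population moments are known; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 20_000
alpha0 = np.array([1.0, -0.5])   # hypothetical true coefficients under H_f
sigma0_sq = 1.0                  # hypothetical true error variance under H_f

# Hypothetical design: X is standard normal, both columns of Z load on x_1
X = rng.normal(size=(T, 2))
Z = 0.5 * X[:, :1] + rng.normal(size=(T, 2))
y = X @ alpha0 + np.sqrt(sigma0_sq) * rng.normal(size=T)

def resid_var(W, y):
    """ML residual variance T^{-1} e'e from regressing y on W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    e = y - W @ coef
    return e @ e / len(y)

# Average loglikelihood ratio T^{-1} LR_fg = 0.5 * ln(omega_hat^2 / sigma_hat^2)
avg_lr = 0.5 * np.log(resid_var(Z, y) / resid_var(X, y))

# Population moments implied by the design above
S_xx = np.eye(2)
S_xz = np.array([[0.5, 0.5], [0.0, 0.0]])
S_zz = np.eye(2) + 0.25 * np.ones((2, 2))
Sigma_f = S_xx - S_xz @ np.linalg.solve(S_zz, S_xz.T)

# Probability limit under H_f: 0.5 * ln(1 + alpha0' Sigma_f alpha0 / sigma0^2)
limit = 0.5 * np.log(1.0 + alpha0 @ Sigma_f @ alpha0 / sigma0_sq)
```

With this design the sample value `avg_lr` should lie close to `limit`, which is strictly positive because $\Sigma_f \neq 0$.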

The LR statistic is naturally centered at zero if one or the other of the above probability limits is equal to zero, namely if either $\Sigma_f = 0$ or $\Sigma_g = 0$.¹⁴ When $\Sigma_f = 0$, then $X \subset Z$ and $H_f$ is nested in $H_g$. Alternatively, if $\Sigma_g = 0$, then $Z \subset X$ and $H_g$ is nested in $H_f$. Finally, if both $\Sigma_f = 0$ and $\Sigma_g = 0$, then the two regression models are observationally equivalent. In the nonnested case, where both $\Sigma_f \neq 0$ and $\Sigma_g \neq 0$, the standard LR statistic is not applicable and needs to be properly centered. Cox's contribution was to note that this problem can be overcome if a consistent estimate of $\operatorname{plim}_{T\to\infty}(T^{-1}LR_{fg} \mid H_f)$, which we denote by $\hat{E}_f(T^{-1}LR_{fg})$, is subtracted from $T^{-1}LR_{fg}$. This yields the new centered (modified) loglikelihood ratio statistic (also known as the Cox statistic) for testing $H_f$ against $H_g$:

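$$S_{fg} = T^{-1}LR_{fg} - \hat{E}_f(T^{-1}LR_{fg}) = \frac{1}{2}\ln\left(\frac{\hat{\omega}_T^2}{\hat{\sigma}_T^2 + \hat{\alpha}_T'\hat{\Sigma}_f\hat{\alpha}_T}\right),$$

where $\hat{\Sigma}_f = T^{-1}X'M_zX$.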

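It is now clear that, by construction, the Cox statistic $S_{fg}$ has asymptotic mean zero under $H_f$. As was pointed out earlier, since there is no natural null hypothesis in this setup, one also needs to consider the modified loglikelihood ratio statistic for testing $H_g$ against $H_f$, which is given by

$$S_{gf} = T^{-1}LR_{gf} - \hat{E}_g(T^{-1}LR_{gf}) = \frac{1}{2}\ln\left(\frac{\hat{\sigma}_T^2}{\hat{\omega}_T^2 + \hat{\beta}_T'\hat{\Sigma}_g\hat{\beta}_T}\right),$$

where $\hat{\Sigma}_g = T^{-1}Z'M_xZ$.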

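Both of these test statistics (when appropriately normalized by $\sqrt{T}$) are asymptotically normally distributed under their respective nulls, with zero mean and finite variances. For the test of $H_f$ against $H_g$ we have¹⁵

$$\sqrt{T}\,S_{fg} \mid H_f \;\overset{a}{\sim}\; N(0, v_{fg}^2).$$

The associated standardized Cox statistic is given by

$$N_{fg} = \frac{\sqrt{T}\,S_{fg}}{\hat{v}_{fg}},$$

where $\hat{v}_{fg}$ is a consistent estimate of $v_{fg}$.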
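As a rough numerical sketch, the Python function below computes a standardized Cox statistic for two linear regressions. It assumes the closed-form expressions commonly given for this case (for example in Pesaran, 1974): the centering term uses $\hat{\sigma}_T^2 + T^{-1}\hat{\alpha}_T'X'M_zX\hat{\alpha}_T$, and the variance of $\sqrt{T}S_{fg}$ is estimated by $\hat{\sigma}_T^2\,(T^{-1}\hat{\alpha}_T'X'M_zM_xM_zX\hat{\alpha}_T)$ divided by the square of that term. These formulas are stated here as assumptions, and the function names and simulated data are hypothetical.

```python
import numpy as np

def annihilate(W, v):
    """Return M_w v, the residual of v after projection on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, v, rcond=None)
    return v - W @ coef

def cox_statistic(X, Z, y):
    """Standardized Cox statistic for testing the model with regressors X
    (the null) against the model with regressors Z (the alternative)."""
    T = y.shape[0]
    e_f = annihilate(X, y)                      # e_f = M_x y
    fitted_f = y - e_f                          # X alpha_hat
    sigma2 = e_f @ e_f / T                      # sigma_hat^2
    e_g = annihilate(Z, y)                      # e_g = M_z y
    omega2 = e_g @ e_g / T                      # omega_hat^2
    mz_fit = annihilate(Z, fitted_f)            # M_z X alpha_hat
    omega2_star = sigma2 + mz_fit @ mz_fit / T  # estimate of plim of omega_hat^2 under the null
    s_fg = 0.5 * np.log(omega2 / omega2_star)   # centered statistic S_fg
    mx_mz_fit = annihilate(X, mz_fit)           # M_x M_z X alpha_hat
    v2 = sigma2 * (mx_mz_fit @ mx_mz_fit / T) / omega2_star**2
    return np.sqrt(T) * s_fg / np.sqrt(v2)

# Hypothetical data generated under H_f
rng = np.random.default_rng(2)
T = 500
X = rng.normal(size=(T, 2))
Z = 0.5 * X[:, :1] + rng.normal(size=(T, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(size=T)

n_fg = cox_statistic(X, Z, y)   # tests H_f against H_g
n_gf = cox_statistic(Z, X, y)   # roles reversed: tests H_g against H_f
```

Swapping the two regressor matrices gives the statistic for the reverse test. The sketch does not guard against the nested case, where the estimated variance is zero and the ratio is undefined.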

By reversing the roles of the null and the alternative hypotheses, a similar standardized Cox statistic can be computed for testing $H_g$ against $H_f$, which we denote by $N_{gf}$. Denote the $(1 - \alpha)$ percent critical value of the standard normal distribution by $C_\alpha$; then four outcomes are possible:

1. Reject $H_g$ but not $H_f$ if $|N_{fg}| < C_\alpha$ and $|N_{gf}| > C_\alpha$,

2. Reject $H_f$ but not $H_g$ if $|N_{fg}| > C_\alpha$ and $|N_{gf}| < C_\alpha$,

3. Reject both $H_f$ and $H_g$ if $|N_{fg}| > C_\alpha$ and $|N_{gf}| > C_\alpha$,

4. Reject neither $H_f$ nor $H_g$ if $|N_{fg}| < C_\alpha$ and $|N_{gf}| < C_\alpha$.
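The four outcomes above can be expressed as a simple decision rule. The sketch below is illustrative only: the function name is hypothetical, and the default critical value of 1.96 is an assumption corresponding to a two-sided test at the 5 percent level.

```python
def classify_outcome(n_fg, n_gf, critical_value=1.96):
    """Map the pair of standardized Cox statistics to one of the four outcomes.

    n_fg tests H_f against H_g; n_gf tests H_g against H_f.
    A hypothesis is rejected when the absolute value of its statistic
    exceeds the critical value.
    """
    reject_f = abs(n_fg) > critical_value
    reject_g = abs(n_gf) > critical_value
    if reject_f and reject_g:
        return "reject both H_f and H_g"
    if reject_f:
        return "reject H_f but not H_g"
    if reject_g:
        return "reject H_g but not H_f"
    return "reject neither H_f nor H_g"

# Example with made-up values of the two statistics
outcome = classify_outcome(0.4, -3.2)
```

In this example only the second statistic exceeds the critical value in absolute terms, so the rule returns the first outcome in the list (reject $H_g$ but not $H_f$).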

These are to be contrasted with the outcomes of nested hypothesis testing, where the null is either rejected or not. The difference stems from the fact that when the hypotheses under consideration are nonnested there is no natural null (or maintained) hypothesis, and one therefore needs to consider each of the hypotheses in turn as the null. So there are twice as many possibilities as there are when the hypotheses are nested. Note that if we utilize the information in the direction of rejection, that is, if instead of comparing the absolute value of $N_{fg}$ with $C_\alpha$ we determine whether rejection is in the direction of the null or the alternative, there are a total of eight possible test outcomes (see the discussion in Fisher and McAleer (1979) and Dastoor (1981)). This aspect of nonnested hypothesis testing has been criticized by some commentators, who point out that the test outcome can lead to ambiguities (see, for example, Granger, King, and White, 1995). However, this is a valid criticism only if the primary objective is to select a specific model for forecasting or decision making, but not if the aim is to learn about the comparative strengths and weaknesses of rival explanations. What is viewed as a weakness from the perspective of model selection becomes a strength when placed in the context of statistical inference and model building. For example, when both models are rejected the analysis points the investigator in the direction of developing a third model which incorporates the main desirable features of the original models, as well as being theoretically meaningful (see Pesaran and Deaton, 1978).