# Goodness of Fit Measures

There are problems with the use of conventional $R^2$-type measures when the explained variable $y$ takes on two values; see Maddala (1983, pp. 37-41). The predicted values $\hat{y}_i$ are probabilities and the actual values of $y_i$ are either 0 or 1, so the usual $R^2$ is likely to be very low. Also, if there is a constant in the model, the linear probability and logit models satisfy $\sum_{i=1}^n y_i = \sum_{i=1}^n \hat{y}_i$. However, the probit model does not necessarily satisfy this exact relationship, although it is approximately valid.
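The adding-up property for the logit model can be checked numerically. The following is a minimal sketch, assuming a logit model with a constant and a single regressor fitted by Newton-Raphson on a small made-up data set; the names `logistic` and `fit_logit` and the data are illustrative assumptions, not taken from the text. It relies on the logit first-order condition for the constant, $\sum_i (y_i - \hat{y}_i) = 0$.

```python
import math

def logistic(z):
    """Logistic CDF, used as the logit link."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(x, y, iterations=25):
    """Newton-Raphson for P(y=1) = logistic(a + b*x); returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [logistic(a + b * xi) for xi in x]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        # Information matrix (negative Hessian), 2 x 2 and symmetric
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 linear system explicitly
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Made-up data (not from the text); the two outcomes overlap in x,
# so the maximum likelihood estimate exists.
x = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]

a_hat, b_hat = fit_logit(x, y)
y_hat = [logistic(a_hat + b_hat * xi) for xi in x]

# With a constant in the model, the two sums agree up to rounding error.
print(sum(y), sum(y_hat))
```

Dropping the constant from the model would remove the first-order condition that forces the two sums to coincide.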

Several $R^2$-type measures have been suggested in the literature; some of these are the following:

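(ii) Measures based on the residual sum of squares: Effron (1978) suggested using

$$R_2^2 = 1 - \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\sum_{i=1}^n (y_i - \bar{y})^2} = 1 - \frac{n \sum_{i=1}^n (y_i - \hat{y}_i)^2}{n_1 n_2}$$

since $\sum_{i=1}^n (y_i - \bar{y})^2 = \sum_{i=1}^n y_i^2 - n\bar{y}^2 = n_1 - n(n_1/n)^2 = n_1 n_2/n$, where $n_1 = \sum_{i=1}^n y_i$ and $n_2 = n - n_1$.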

Amemiya (1981, p. 1504) suggests using $\sum_{i=1}^n (y_i - \hat{y}_i)^2 / [\hat{y}_i(1 - \hat{y}_i)]$ as the residual sum of squares. This weights each squared error by the inverse of its variance.
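The two residual-based quantities in (ii) are simple to compute once fitted probabilities are available. The sketch below assumes a small set of hypothetical outcomes and fitted probabilities (not from an actual fit and not from the text); the function names `efron_r2` and `weighted_rss` are illustrative. It also checks the identity that the total sum of squares of a binary variable equals $n_1 n_2 / n$.

```python
def efron_r2(y, y_hat):
    """1 - n * sum((y_i - yhat_i)^2) / (n1 * n2) for binary y.

    Requires at least one 0 and one 1 in y, otherwise n1 * n2 is zero.
    """
    n = len(y)
    n1 = sum(y)      # number of ones
    n2 = n - n1      # number of zeros
    rss = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    return 1.0 - n * rss / (n1 * n2)

def weighted_rss(y, y_hat):
    """Squared errors weighted by 1 / (yhat_i * (1 - yhat_i)).

    Requires 0 < yhat_i < 1 for every observation.
    """
    return sum((yi - pi) ** 2 / (pi * (1.0 - pi)) for yi, pi in zip(y, y_hat))

# Hypothetical data for illustration only.
y = [0, 0, 1, 0, 1, 1]
y_hat = [0.2, 0.3, 0.6, 0.4, 0.7, 0.8]

n = len(y)
n1 = sum(y)
n2 = n - n1
y_bar = n1 / n
tss = sum((yi - y_bar) ** 2 for yi in y)

# For binary y the total sum of squares reduces to n1 * n2 / n.
print(tss, n1 * n2 / n)
print(efron_r2(y, y_hat))
print(weighted_rss(y, y_hat))
```

The weighted version becomes very large when a fitted probability is close to 0 or 1 and the observation is mispredicted, so it is more sensitive to confident errors than the unweighted sum.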

(iii) Measures based on likelihood ratios: $R_3^2 = 1 - (\ell_r/\ell_u)^{2/n}$, where $\ell_r$ is the restricted likelihood and $\ell_u$ is the unrestricted likelihood. In the standard linear regression model, this is based on the likelihood ratio for testing that all the slope coefficients are zero. For the limited dependent variable model, however, the likelihood function has a maximum of 1. This means that $\ell_r \le \ell_u \le 1$, or $\ell_r \le (\ell_r/\ell_u) \le 1$, or $\ell_r^{2/n} \le 1 - R_3^2 \le 1$, or $0 \le R_3^2 \le 1 - \ell_r^{2/n}$. Hence, Cragg and Uhler (1970) suggest a pseudo-$R^2$ that lies between 0 and 1, obtained by dividing $R_3^2$ by its upper bound: $R_4^2 = (\ell_u^{2/n} - \ell_r^{2/n})/[(1 - \ell_r^{2/n})\,\ell_u^{2/n}]$. Another measure, suggested by McFadden (1974), is $R_5^2 = 1 - (\log \ell_u/\log \ell_r)$.
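The likelihood-based measures in (iii) can be computed from the restricted and unrestricted log-likelihoods alone. The sketch below is illustrative and rests on stated assumptions: the restricted model contains only a constant, so its fitted probability is the sample proportion of ones, and the "unrestricted" fitted probabilities are hypothetical numbers rather than the output of an actual maximum likelihood fit. The function names are not from the text. Working with log-likelihoods avoids numerical underflow when $n$ is large.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood for binary outcomes y and probabilities p."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def likelihood_r2_measures(logl_r, logl_u, n):
    """Return (R3, R4, R5) from restricted/unrestricted log-likelihoods."""
    lr_pow = math.exp(2.0 * logl_r / n)   # restricted likelihood to the power 2/n
    lu_pow = math.exp(2.0 * logl_u / n)   # unrestricted likelihood to the power 2/n
    r3 = 1.0 - lr_pow / lu_pow                           # 1 - (l_r / l_u)^(2/n)
    r4 = (lu_pow - lr_pow) / ((1.0 - lr_pow) * lu_pow)   # R3 rescaled by its upper bound
    r5 = 1.0 - logl_u / logl_r                           # log-likelihood ratio measure
    return r3, r4, r5

# Hypothetical data for illustration only.
y = [0, 0, 1, 0, 1, 1]
y_hat = [0.2, 0.3, 0.6, 0.4, 0.7, 0.8]

n = len(y)
p_bar = sum(y) / n                            # constant-only fitted probability
logl_r = log_likelihood(y, [p_bar] * n)       # restricted log-likelihood
logl_u = log_likelihood(y, y_hat)             # stand-in for the unrestricted one

r3, r4, r5 = likelihood_r2_measures(logl_r, logl_u, n)
upper_bound = 1.0 - math.exp(2.0 * logl_r / n)
print(r3, upper_bound, r4, r5)
```

In this sketch `r4` is `r3` divided by `upper_bound`, which is why it can reach 1 while `r3` cannot.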

(iv) Proportion of correct predictions: After computing $\hat{y}$, one classifies the $i$-th observation as a success if $\hat{y}_i > 0.5$, and a failure if $\hat{y}_i < 0.5$. This measure is useful but may not have enough discriminatory power.
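The classification rule in (iv) can be sketched as follows. The data are hypothetical, the function name `proportion_correct` is illustrative, and the treatment of a fitted probability exactly equal to 0.5 (counted here as a failure) is an assumption, since the text leaves that case open.

```python
def proportion_correct(y, y_hat, cutoff=0.5):
    """Share of observations whose predicted class matches the outcome."""
    # Success if the fitted probability exceeds the cutoff, failure otherwise
    # (ties at the cutoff are treated as failures by assumption).
    predicted = [1 if pi > cutoff else 0 for pi in y_hat]
    hits = sum(1 for yi, yp in zip(y, predicted) if yi == yp)
    return hits / len(y)

# Hypothetical data for illustration only.
y = [0, 0, 1, 0, 1, 1]
y_hat = [0.2, 0.6, 0.6, 0.4, 0.3, 0.8]

print(proportion_correct(y, y_hat))
```

Fitted probabilities of 0.51 and 0.99 are counted as the same prediction under this rule, which is one way to see why the measure may lack discriminatory power.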