# Nonlinear Models and Nonlinear Inequality Restrictions

Wolak (1989, 1991) gives a general account of this topic. He considers the general formulation in (25.18) with nonlinear restrictions. Specifically, consider the following problem:

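$$
\hat{\beta} = \beta + v, \qquad h(\beta) \geq 0, \qquad v \sim N(0, \Psi) \tag{25.22}
$$

where $h(\cdot)$ is a smooth vector function of dimension $p$ with derivative matrix denoted by $H(\cdot)$. We wish to test

$$
H_0: h(\beta) \geq 0 \quad \text{vs.} \quad H_1: \beta \in \mathbb{R}^K. \tag{25.23}
$$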

This is very general since model classes that allow for estimation results given in (25.22) are very broad indeed. As the results in Potscher and Prucha (1991a, 1991b) indicate, many nonlinear dynamic processes in econometrics permit consistent and asymptotically normal estimators under regularity conditions.

In general, an asymptotically exact size test of the null in (25.23) is not possible without a localization to some suitable neighborhood of the parameter space. To see this, let $h(\beta^0) = 0$ define $\beta^0$, and let $H(\beta^0)$ and $I(\beta^0)$ denote the derivative matrix $\partial h/\partial \beta' = H(\beta)$ and the information matrix, respectively, both evaluated at $\beta^0$. Let

$$
C = \{\beta \mid h(\beta) \geq 0,\ \beta \in \mathbb{R}^K\} \tag{25.24}
$$

and let $N_{\delta_T}(\beta^0)$ be a $\delta_T$-neighborhood of $\beta^0$ with $\delta_T = O(T^{-1/2})$. It is known that global hypotheses of the type in (25.23)-(25.24) do not generally permit large sample approximations to the power function for fixed alternative hypotheses for nonlinear multivariate equality restrictions; see Wolak (1989). If we localize the null in (25.23) to only $\beta \in N_{\delta_T}(\beta^0)$, then asymptotically exact size tests are available and are given by the appropriately defined chi-bar-squared ($\bar{\chi}^2$) distribution. In effect, we would be testing whether $\beta \in$ {cone of tangents of $C$ at $\beta^0$}, where $W = H(\beta^0) I(\beta^0)^{-1} H(\beta^0)'$ plays the role of the covariance matrix in (25.22).

To appreciate the issues, note that the asymptotic distribution of the test depends on $W$, which in turn depends on $H(\cdot)$ and $I(\cdot)$. But the latter generally vary with $\beta^0$. Also, $h(\beta^0) = 0$ does not have a unique solution unless $h$ is linear and of rank $K = p$. Thus the case $H_0: \beta \geq 0$ poses no problem. The case $R\beta \geq 0$ will present a problem in nonlinear models when $\operatorname{rank}(R) < K$, since $I(\beta)$ will generally depend on the $K - p$ free parameters in $\beta$.

This is especially serious because inequalities define composite hypotheses, which force a consideration of power over regions. Optimal tests do not exist in multivariate situations, so other conventions must be developed for test selection. One method is to consider power in the least favorable case. Another is to maximize power at given points that are known to be "desirable", leading to point optimal testing. Closely related, since such points may be difficult to select a priori, are tests that maximize mean power over sets of desirable points; see King and Wu (1997) for discussion. In the present case, the "least favorable" cases are all those defined by $h(\beta^0) = 0$; hence the indeterminacy of the asymptotic power function.
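The dependence on the free parameters can be made concrete with a small sketch. The model is a hypothetical illustration, not one taken from the text: a bivariate normal with unit variances, $\beta = (\mu_1, \mu_2, \rho)$, and the linear restrictions $\mu_1 \geq 0$, $\mu_2 \geq 0$, so that $K = 3$, $p = 2$ and $\rho$ is the single free parameter. Assuming the standard result that the mean and correlation parameters are information-orthogonal in the normal model, $W$ has unit diagonal and off-diagonal element $\rho$. The function names are illustrative, and the weight formulas are the usual closed forms for $p = 2$.

```python
import math

def chi_bar_weights(rho):
    """Weights on chi-square(0), chi-square(1), chi-square(2) for p = 2,
    assuming W has unit diagonal and off-diagonal element rho."""
    a = math.asin(rho) / (2.0 * math.pi)
    return 0.25 + a, 0.5, 0.25 - a

def chi_bar_pvalue(d, rho):
    """P(D >= d) at a least favorable point of the null, for d > 0."""
    _, w1, w2 = chi_bar_weights(rho)
    tail1 = math.erfc(math.sqrt(d / 2.0))  # P(chi-square(1) >= d)
    tail2 = math.exp(-d / 2.0)             # P(chi-square(2) >= d)
    return w1 * tail1 + w2 * tail2

# The same observed statistic gets a different p-value for each value
# of the free parameter rho, although every rho satisfies h(beta0) = 0.
for rho in (-0.5, 0.0, 0.5):
    print(rho, round(chi_bar_pvalue(2.0, rho), 4))
```

Every value of $\rho$ is compatible with the boundary of the null, yet each implies a different null distribution. This is the indeterminacy that motivates the localization.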

From this point on, our discussion pertains to the "local" test whenever the estimates of $W$ cannot converge to a unique $H(\beta^0) I(\beta^0)^{-1} H(\beta^0)' = W$.

Let $x_t = (x_{t1}, x_{t2}, \ldots, x_{tn})'$, $t = 1, \ldots, T$, be a realization of a random variable in $\mathbb{R}^n$ with density function $f(x_t, \beta)$, which is continuous in $\beta$ for all $x_t$. We assume that a compact subspace of $\mathbb{R}^K$ contains $\beta$, and that $h(\cdot)$ is continuous with continuous partial derivatives $\partial h_i(\beta)/\partial \beta_j = H_{ij}$, defining the $p \times K$ matrix $H(\beta)$, which is assumed to have full rank $p \leq K$ at an interior point $\beta^0$ such that $h(\beta^0) = 0$. Finally, let $\beta_T$ denote the "true" value under the local hypothesis; then

$$
\beta_T \in C_T = \{\beta \mid h(\beta) \geq 0,\ \beta \in N_{\delta_T}(\beta^0)\} \quad \text{for all } T,
$$

with $(\beta_T - \beta^0) = o(1)$ and $T^{1/2}(\beta_T - \beta^0) = O(1)$. Let $x$ represent $T$ random observations from $f(x_t, \beta)$, with the loglikelihood function given below:

$$
L(\beta) = L(x, \beta) = \sum_{t=1}^{T} \ln f(x_t, \beta). \tag{25.25}
$$

Here $I(\beta^0)$ is the value of the information matrix, $\lim_{T \to \infty} T^{-1} E_{\beta^0}[-\partial^2 L / \partial \beta\, \partial \beta']$, at $\beta^0$, and (25.27)-(25.28) are two asymptotically equivalent ways of computing the Wald test. This testifies to the lack of invariance of the Wald test, which has been widely appreciated in econometrics. The above results also benefit from the well-known asymptotic approximations:

$$
h(\beta) \simeq H(\beta^0)(\beta - \beta^0), \quad \text{and} \quad H(\beta^0) - H(\beta) \simeq 0,
$$

which hold for all three estimators of $\beta$. As Wolak (1989) shows, these statistics are asymptotically equivalent to the generalized distance statistic $D$ introduced in Kodde and Palm (1986):

$$
D = \min_{\beta}\; T(\hat{\beta} - \beta)' I(\beta^0)(\hat{\beta} - \beta), \tag{25.30}
$$

subject to $H(\beta^0)(\beta - \beta^0) \geq 0$ and $\beta \in N_{\delta_T}(\beta^0)$.
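A short worked step shows why the constraint in (25.30) can be taken as linear. It is a sketch that assumes only that $h$ is continuously differentiable, and that the neighborhood shrinks at the rate $T^{-1/2}$. A first-order expansion around $\beta^0$, with $h(\beta^0) = 0$, gives

$$
h(\beta) = h(\beta^0) + H(\beta^0)(\beta - \beta^0) + o(\lVert \beta - \beta^0 \rVert) = H(\beta^0)(\beta - \beta^0) + o(T^{-1/2}), \qquad \beta \in N_{\delta_T}(\beta^0),
$$

so that $T^{1/2} h(\beta)$ and $T^{1/2} H(\beta^0)(\beta - \beta^0)$ differ only by an $o(1)$ term. Within the neighborhood, the nonlinear restriction $h(\beta) \geq 0$ is therefore replaced, to first order, by its tangent cone at $\beta^0$.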

For local $\beta$ defined above, all these statistics have the same $\bar{\chi}^2$-distribution given earlier. Kodde and Palm (1987) employ this statistic for an empirical test of the negativity of the substitution matrix of demand systems. They find that it outperforms the two-sided asymptotic LR test. Their bounds also appear to deal with the related problem of overrejection when nominal significance levels are used with other classical tests against the two-sided alternatives. Gourieroux et al. (1982) give the popular artificial regression method of deriving the LM test. In the same general context, Wolak (1989) specializes the above tests to a test of joint nonlinear inequality and equality restrictions.

With the advent of cheap computing and Monte Carlo integration in high dimensions, the above tests are quite accessible. Certainly, the critical values from the bounds procedures deserve to be incorporated in standard econometric routines, as well as the exact bounds for low-dimensional cases ($p < 8$). The power gains justify the extra effort.
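As an illustration of how accessible the simulation route is, the sketch below approximates the null distribution of a distance statistic by Monte Carlo for $p = 2$. It makes several simplifying assumptions that are not in the text: the problem has already been reduced to the two-dimensional space of the restrictions, the vector `z` stands in for the scaled estimate of $h$, the matrix `W` is treated as known, and draws are taken at a least favorable point of the null. All function names are hypothetical.

```python
import math
import random

def distance_statistic(z, w):
    """min over mu >= 0 of (z - mu)' W^{-1} (z - mu), for p = 2."""
    (w11, w12), (_, w22) = w
    det = w11 * w22 - w12 * w12
    a11, a12, a22 = w22 / det, -w12 / det, w11 / det  # entries of W^{-1}

    def quad(r1, r2):
        return a11 * r1 * r1 + 2.0 * a12 * r1 * r2 + a22 * r2 * r2

    z1, z2 = z
    if z1 >= 0.0 and z2 >= 0.0:
        return 0.0                       # z already satisfies the constraints
    candidates = [quad(z1, z2)]          # both constraints binding: mu = (0, 0)
    mu2 = z2 + a12 * z1 / a22            # first constraint binding, mu2 free
    if mu2 >= 0.0:
        candidates.append(quad(z1, z2 - mu2))
    mu1 = z1 + a12 * z2 / a11            # second constraint binding, mu1 free
    if mu1 >= 0.0:
        candidates.append(quad(z1 - mu1, z2))
    return min(candidates)

def simulate_null(w, draws=20000, seed=0):
    """Draw z ~ N(0, W) and return the simulated distance statistics."""
    rng = random.Random(seed)
    (w11, w12), (_, w22) = w
    l11 = math.sqrt(w11)                 # Cholesky factor of W
    l21 = w12 / l11
    l22 = math.sqrt(w22 - l21 * l21)
    out = []
    for _ in range(draws):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        out.append(distance_statistic((l11 * e1, l21 * e1 + l22 * e2), w))
    return out

def mc_pvalue(d_obs, null_draws):
    """Share of simulated statistics at least as large as d_obs."""
    return sum(d >= d_obs for d in null_draws) / len(null_draws)

W = ((1.0, 0.5), (0.5, 1.0))
draws = simulate_null(W)
print(mc_pvalue(2.0, draws))
```

For $p = 2$ the weights are available in closed form, so simulation is not strictly needed. The same draw-and-project scheme carries over to larger $p$, with the closed-form projection replaced by a quadratic programming solver.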
