# Johansen's method

Johansen (1995) develops a maximum likelihood estimation procedure, based on the so-called reduced rank regression method, that, like the other methods discussed later, has some advantages over the two-step regression procedure described in the previous section. First, it relaxes the assumption that the cointegrating vector is unique and, second, it takes into account the short-run dynamics of the system when estimating the cointegrating vectors. The intuition behind Johansen's testing procedure can be explained by means of the following example. Assume that $y_t$ has a VAR(1) representation, that is, $A(L)$ in (30.13) is such that $A(L) = I_n - A_1 L$. Then the VAR(1) process can be reparameterized in the VECM representation as

$$\Delta y_t = (A_1 - I_n)\, y_{t-1} + \varepsilon_t. \tag{30.14}$$

If $A_1 - I_n = -A(1) = 0$, then $y_t$ is I(1) and there are no cointegrating relationships ($r = 0$), whereas if $\operatorname{rank}(A_1 - I_n) = n$, there are $n$ cointegrating relationships among the $n$ series and hence $y_t \sim I(0)$. Thus, testing the null hypothesis that the number of cointegrating vectors is $r$ is equivalent to testing whether $\operatorname{rank}(A_1 - I_n) = r$. Likewise, alternative hypotheses can be designed in different ways, e.g. that the rank is $r + 1$ or that it is $n$.
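The link between the rank of $A_1 - I_n$ and the number of cointegrating relationships can be checked numerically. The following is a minimal sketch using NumPy; the function name and the three coefficient matrices are hypothetical and chosen only for illustration, and in practice the rank has to be tested statistically because $A_1$ is estimated, not known.

```python
import numpy as np

def cointegration_rank_var1(A1, tol=1e-10):
    """Rank of A1 - I_n, i.e. the number of cointegrating relations in a VAR(1)."""
    n = A1.shape[0]
    return int(np.linalg.matrix_rank(A1 - np.eye(n), tol=tol))

# Hypothetical coefficient matrices (illustration only).
A1_no_coint = np.eye(2)                     # two independent random walks, r = 0
A1_one_coint = np.array([[0.5, 0.5],
                         [0.0, 1.0]])       # y1 adjusts towards y2, r = 1
A1_stationary = np.array([[0.5, 0.0],
                          [0.0, 0.3]])      # both series are I(0), r = n = 2

ranks = [cointegration_rank_var1(A)
         for A in (A1_no_coint, A1_one_coint, A1_stationary)]
```

In the middle case, $A_1 - I_n$ has the single nonzero row $(-0.5, 0.5)$, so the only stationary combination is proportional to $y_{1t} - y_{2t}$.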

Building on these considerations, Johansen (1995) deals with the more general case where $y_t$ follows a VAR($p$) process of the form

$$y_t = A_1 y_{t-1} + A_2 y_{t-2} + \dots + A_p y_{t-p} + \varepsilon_t, \tag{30.15}$$

which, as in (30.4) and (30.4′), can be rewritten in the ECM representation

$$\Delta y_t = D_1 \Delta y_{t-1} + D_2 \Delta y_{t-2} + \dots + D_{p-1} \Delta y_{t-p+1} + D y_{t-1} + \varepsilon_t. \tag{30.16}$$

Here $D_i = -(A_{i+1} + \dots + A_p)$, $i = 1, 2, \dots, p - 1$, and $D = (A_1 + \dots + A_p - I_n) = -A(1) = -B\Gamma'$. To estimate $B$ and $\Gamma$, we need to estimate $D$ subject to some identification restriction, since otherwise $B$ and $\Gamma$ could not be separately identified. Maximum likelihood estimation of $D$ follows the same principles as the basic partitioned regression model: the regressand and the regressor of interest ($\Delta y_t$ and $y_{t-1}$) are each regressed by OLS on the remaining set of regressors ($\Delta y_{t-1}, \dots, \Delta y_{t-p+1}$), giving rise to two matrices of residuals, denoted $e_0$ and $e_1$, and to the regression model $e_{0t} = D e_{1t} + \text{residuals}$. Following the preceding discussion, Johansen (1995) shows that testing for the rank of $D$ is equivalent to testing for the number of canonical correlations between $e_0$ and $e_1$ that are different from zero. This can be done using either of the following two test statistics:
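The partialling-out step described above can be sketched as follows. This is an illustrative NumPy implementation, not Johansen's own code: the function name `partial_residuals` is hypothetical, deterministic terms are omitted (as in (30.16)), and the data are simulated random walks used only to exercise the function.

```python
import numpy as np

def partial_residuals(y, p):
    """Return (e0, e1): OLS residuals of Δy_t and y_{t-1} after regressing
    each on Δy_{t-1}, ..., Δy_{t-p+1} (no deterministic terms)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y, axis=0)              # dy[k] = y[k+1] - y[k]
    T = dy.shape[0] - (p - 1)            # effective sample size
    dep = dy[p - 1:]                     # Δy_t
    lev = y[p - 1:-1]                    # y_{t-1}
    if p == 1:                           # nothing to partial out
        return dep, lev
    # Stack the lagged differences Δy_{t-1}, ..., Δy_{t-p+1} column-wise.
    Z = np.hstack([dy[p - 1 - j: p - 1 - j + T] for j in range(1, p)])
    coef0 = np.linalg.lstsq(Z, dep, rcond=None)[0]
    coef1 = np.linalg.lstsq(Z, lev, rcond=None)[0]
    return dep - Z @ coef0, lev - Z @ coef1

# Simulated data (assumption: two independent random walks).
rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal((200, 2)), axis=0)
e0, e1 = partial_residuals(y, p=2)
```

By construction both residual matrices are orthogonal to the lagged differences, which is what allows the remaining analysis to work with $e_0$ and $e_1$ alone.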

$$\lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{n} \ln(1 - \hat{\lambda}_i) \tag{30.17}$$

$$\lambda_{\max}(r, r+1) = -T \ln(1 - \hat{\lambda}_{r+1}), \tag{30.18}$$

where the $\hat{\lambda}_i$ are the eigenvalues of the matrix $S_{10} S_{00}^{-1} S_{01}$ with respect to the matrix $S_{11}$, arranged in decreasing order ($1 > \hat{\lambda}_1 > \dots > \hat{\lambda}_n > 0$), and $S_{ij} = T^{-1} \sum_{t=1}^{T} e_{it} e_{jt}'$, $i, j = 0, 1$. These eigenvalues can be obtained as the solutions of the determinantal equation

$$|\lambda S_{11} - S_{10} S_{00}^{-1} S_{01}| = 0. \tag{30.19}$$

The statistic in (30.17), known as the trace statistic, tests the null hypothesis that the number of cointegrating vectors is less than or equal to $r$ against a general alternative. Note that, since $\ln(1) = 0$ and $\ln(x)$ tends to $-\infty$ as $x$ tends to zero, the trace statistic equals zero when all the $\hat{\lambda}_i$ are zero, whereas the further the eigenvalues are from zero, the more negative is $\ln(1 - \hat{\lambda}_i)$ and the larger is the statistic. Likewise, the statistic in (30.18), known as the maximum eigenvalue statistic, tests the null of $r$ cointegrating vectors against the specific alternative of $r + 1$. As above, if $\hat{\lambda}_{r+1}$ is close to zero, the statistic will be small. Further, if the null hypothesis is not rejected, the $r$ cointegrating vectors contained in the matrix $\Gamma$ can be estimated as the first $r$ columns of the matrix $\hat{V} = (\hat{v}_1, \dots, \hat{v}_n)$, which contains the eigenvectors associated with the eigenvalues in (30.19), computed as

$$(\hat{\lambda}_i S_{11} - S_{10} S_{00}^{-1} S_{01})\, \hat{v}_i = 0, \qquad i = 1, 2, \dots, n,$$

subject to the length normalization rule $\hat{V}' S_{11} \hat{V} = I_n$. Once $\Gamma$ has been estimated, estimates of the $B$, $D_i$, and $\Sigma$ matrices in (30.16) can be obtained by inserting the estimate of $\Gamma$ into their corresponding OLS formulae, which are functions of $\Gamma$.
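The eigenvalue problem and the two test statistics can be sketched numerically as follows. This is an illustrative NumPy implementation under stated assumptions: the function names are hypothetical, the generalized eigenproblem is solved through a Cholesky factor of $S_{11}$ (one of several equivalent routes), and the data are simulated with one common stochastic trend and $p = 1$, so that the residual matrices are simply $\Delta y_t$ and $y_{t-1}$.

```python
import numpy as np

def johansen_eigen(e0, e1):
    """Eigenvalues (decreasing) and eigenvectors of S10 S00^{-1} S01
    with respect to S11, normalized so that V' S11 V = I."""
    T = e0.shape[0]
    S00 = e0.T @ e0 / T
    S01 = e0.T @ e1 / T
    S11 = e1.T @ e1 / T
    # With S11 = L L', the problem becomes an ordinary symmetric one
    # for M = L^{-1} S10 S00^{-1} S01 L^{-T}.
    L_inv = np.linalg.inv(np.linalg.cholesky(S11))
    M = L_inv @ S01.T @ np.linalg.solve(S00, S01) @ L_inv.T
    lam, W = np.linalg.eigh((M + M.T) / 2)   # symmetrize against rounding
    order = np.argsort(lam)[::-1]            # decreasing order
    return lam[order], L_inv.T @ W[:, order], T

def trace_stat(lam, T, r):
    """Trace statistic: -T * sum_{i=r+1}^{n} ln(1 - lambda_i)."""
    return -T * np.sum(np.log(1.0 - lam[r:]))

def max_eig_stat(lam, T, r):
    """Maximum eigenvalue statistic: -T * ln(1 - lambda_{r+1})."""
    return -T * np.log(1.0 - lam[r])

# Simulated example (assumption: two series sharing one random-walk trend).
rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(500))
y = np.column_stack([x + rng.standard_normal(500),
                     x + rng.standard_normal(500)])
e0, e1 = np.diff(y, axis=0), y[:-1]          # p = 1: no lags to partial out
lam, V, T = johansen_eigen(e0, e1)
trace0, max0 = trace_stat(lam, T, 0), max_eig_stat(lam, T, 0)
```

The resulting statistics are only meaningful when compared with the appropriate tabulated critical values, which this sketch does not include.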

Osterwald-Lenum (1992) has tabulated the critical values for both tests using Monte Carlo simulations, since their asymptotic distributions are multivariate functionals of Brownian motion, $f(B)$, which depend upon: (i) the number of nonstationary components under the null hypothesis ($n - r$) and (ii) the form of the vector of deterministic components, $\mu$ (e.g. a vector of drift terms), which needs to be included in the estimation of the ECM representation when the variables have nonzero means. Since, in order to simplify matters, the inclusion of deterministic components in (30.16) has not been considered so far, it is worth using a simple example to illustrate the type of statistical problems that may arise when taking them into account. Suppose that $r = 1$ and that the unique cointegrating vector in $\Gamma$ is normalized to be $\beta' = (1, \beta_{22}, \dots, \beta_{nn})$, while the vector of speed-of-adjustment parameters, with which the cointegrating vector appears in each of the equations for the $n$ variables, is $\alpha' = (\alpha_{11}, \dots, \alpha_{nn})$. If there is a vector of drift terms $\mu' = (\mu_1, \dots, \mu_n)$ satisfying the restrictions $\mu_i = \alpha_{ii}\mu_1$ (with $\alpha_{11} = 1$), it then follows that all $\Delta y_{it}$ in (30.16) are expected to be zero when $y_{1,t-1} + \beta_{22} y_{2,t-1} + \dots + \beta_{nn} y_{n,t-1} + \mu_1 = 0$ and, hence, the general solution for each of the $\{y_{it}\}$ processes, when integrated, will not contain a time trend. Many other possibilities may be considered, for example allowing for a linear trend in each variable but not in the cointegrating relations. In each case, the asymptotic distribution of the cointegration tests given in (30.17) and (30.18) will differ, and the corresponding sets of simulated critical values can be found in the reference quoted above.

Sometimes, theory will guide the choice of restrictions; for example, if one is considering the relation between short-term and long-term interest rates, it may be wise to impose the restriction that the processes for both interest rates do not have linear trends and that the drift terms are restricted to appear in the cointegrating relationship interpreted as the "term structure." However, in other instances one may be interested in testing alternative sets of restrictions on the way $\mu$ enters the system; see, e.g., Chapter 32 by Lütkepohl in this volume for further details.
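The effect of the drift restriction in the $r = 1$ example can be written out in one step. The following is a sketch under simplifying assumptions not made in the text: $p = 1$ (no lagged differences), and the adjustment coefficient of equation $i$ enters with a positive sign as $\alpha_{ii}$.

```latex
\begin{aligned}
\Delta y_{it} &= \alpha_{ii}\,\beta' y_{t-1} + \mu_i + \varepsilon_{it} \\
              &= \alpha_{ii}\,\bigl(\beta' y_{t-1} + \mu_1\bigr) + \varepsilon_{it}
                 \qquad \text{when } \mu_i = \alpha_{ii}\,\mu_1 ,
\end{aligned}
```

Under the restriction, the drift is absorbed into the cointegrating relation as an intercept, so the expected change in every $y_{it}$ is zero whenever $\beta' y_{t-1} + \mu_1 = 0$, and no linear trend is generated in the levels.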

In that respect, Johansen's approach allows one to test restrictions on $\mu$, $B$, and $\Gamma$ subject to a given number of cointegrating relationships. The insight behind all these tests, which turn out to have asymptotic chi-square distributions, is to compare the number of cointegrating vectors (i.e. the number of eigenvalues that are significantly different from zero) when the restrictions are imposed and when they are not. Since, if the true cointegration rank is $r$, only $r$ linear combinations of the variables are stationary, one should find that the number of cointegrating vectors does not diminish if the restrictions are not binding, and vice versa. Thus, denoting by $\hat{\lambda}_i$ and $\lambda_i^{*}$ the sets of $r$ eigenvalues for the unrestricted and restricted cases, both sets of eigenvalues should be equivalent if the restrictions are valid. For example, a modification of the trace test in the form

$$T \sum_{i=1}^{r} \left[\ln(1 - \lambda_i^{*}) - \ln(1 - \hat{\lambda}_i)\right] \tag{30.20}$$

will be small if the $\lambda_i^{*}$ are similar to the $\hat{\lambda}_i$, whereas it will be large if the $\lambda_i^{*}$ are smaller than the $\hat{\lambda}_i$. If we impose $s$ restrictions, the test rejects the null hypothesis when the calculated value of (30.20) exceeds the critical value from a chi-square table with $r(n - s)$ degrees of freedom.
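The behaviour of the statistic in (30.20) can be illustrated with a short sketch. The function name and the eigenvalues below are hypothetical (they are not estimates from any data set); the resulting value would then be compared with the chi-square critical value for the degrees of freedom stated above.

```python
import numpy as np

def restriction_lr_stat(lam_unrestricted, lam_restricted, T, r):
    """T * sum_{i=1}^{r} [ln(1 - lam*_i) - ln(1 - lam_i)], as in (30.20)."""
    lam_u = np.asarray(lam_unrestricted, dtype=float)[:r]
    lam_r = np.asarray(lam_restricted, dtype=float)[:r]
    return T * np.sum(np.log(1.0 - lam_r) - np.log(1.0 - lam_u))

# Hypothetical eigenvalues, for illustration only.
lam_hat = [0.30, 0.20]
stat_close = restriction_lr_stat(lam_hat, [0.29, 0.19], T=100, r=2)  # nearly equal
stat_far = restriction_lr_stat(lam_hat, [0.10, 0.05], T=100, r=2)    # much smaller
```

When the restricted eigenvalues are close to the unrestricted ones the statistic is near zero, and it grows as the restricted eigenvalues shrink, which is the pattern the text describes.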

Most of the existing Monte Carlo studies on the Johansen methodology point out that the dimension of the data series, for a given sample size, may pose particular problems, since the number of parameters of the underlying VAR model grows very large as the dimension increases. Likewise, difficulties often arise when, for a given $n$, the lag length of the system, $p$, is either over- or under-parameterized. In particular, Ho and Sorensen (1996) and Gonzalo and Pitarakis (1999) show, by numerical methods, that the cointegrating order will tend to be overestimated as the dimension of the system increases relative to the time dimension, while serious size and power distortions arise when choosing too short and too long a lag length, respectively. Although several degrees-of-freedom adjustments to improve the performance of the test statistics have been advocated (see, e.g., Reinsel and Ahn, 1992), researchers ought to exercise considerable care when using the Johansen estimator to determine the cointegration order in high-dimensional systems with small sample sizes. Nonetheless, it is worth noting that a useful approach to reducing the dimension of the VAR system is to rely upon exogeneity arguments to construct smaller conditional systems, as suggested by Ericsson (1992) and Johansen (1992a). Equally, if the VAR specification is not appropriate, Phillips (1991) and Saikkonen (1992) provide efficient estimation of cointegrating vectors in more general time series settings, including vector ARMA processes.
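The growth in the number of parameters mentioned above is simple to quantify. As a minimal sketch (the function name is hypothetical, and deterministic terms and the error covariance matrix are ignored), a VAR($p$) in $n$ variables has $p$ coefficient matrices of size $n \times n$:

```python
def var_coefficient_count(n, p):
    """Autoregressive coefficients in an n-variable VAR(p)
    without deterministic terms: p matrices of size n x n."""
    return n * n * p

# Total coefficient counts for a few system sizes and lag lengths.
counts = {(n, p): var_coefficient_count(n, p)
          for n in (2, 5, 10) for p in (1, 4)}
```

For instance, a ten-variable VAR(4) has 400 autoregressive coefficients, that is, 40 per equation, which is large relative to the sample sizes typically available in macroeconomic applications.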