Trend Stationary Versus Difference Stationary
Many macroeconomic time-series that are trending upwards have been characterized as either
Trend Stationary: $x_t = \alpha + \beta t + u_t$ (14.10)
Difference Stationary: $x_t = \gamma + x_{t-1} + u_t$ (14.11)
where $u_t$ is stationary. The first model (14.10) says that the macro-series is stationary except for a deterministic trend: $E(x_t) = \alpha + \beta t$, which varies with $t$. In contrast, the second model (14.11) says that the macro-series is a random walk with drift. The drift parameter $\gamma$ in (14.11) plays the same role as the $\beta$ parameter in (14.10), since both cause $x_t$ to trend upwards over time. Model (14.10) is consistent with economists introducing a time trend in the regression. This has the same effect as detrending each variable in the regression, rendering it stationary; see the Frisch-Waugh-Lovell Theorem in Chapter 7. This detrending is valid only if model (14.10) is true for every series in the regression. Model (14.11), on the other hand, requires differencing to obtain a stationary series. Detrending and differencing are two completely different remedies: what is valid for one model is not valid for the other. The choice between (14.10) and (14.11) is based on a test for the existence of a unit root.
Essential readings on these two models are Nelson and Plosser (1982) and Stock and Watson (1988). Nelson and Plosser applied the Dickey-Fuller test to a wide range of historical macro time-series for the U.S. economy and found that all of these series were difference stationary, with the exception of the unemployment rate. Plosser and Schwert (1978) argued that for most economic macro time-series, it is best to difference the data rather than work with levels. The reasoning is that if these series are difference stationary and we run regressions in levels, the usual properties of our estimators, as well as the distributions of the associated test statistics, are invalid. On the other hand, if the true model is a regression in levels with the data series being trend stationary, differencing the model will produce a moving average error term, and at worst ignoring it will lead to a loss in efficiency. It is important to emphasize that for nonstationary variables the standard asymptotic theory does not apply, see problems 6 and 7, and that $t$ and $F$-statistics obtained from regressions using these variables may have non-standard distributions, see Durlauf and Phillips (1988).
Granger and Newbold (1974) demonstrated some of the problems associated with regressing nonstationary time-series on each other. If $x_t$ and $y_t$ are independent random walks, then one should expect to find no evidence of a relationship when one regresses $y_t$ on $x_t$. In other words, the estimate of $\beta$ in the regression $y_t = \alpha + \beta x_t + u_t$ should be near zero and its associated $t$-statistic insignificant. Yet, for a sample size of 50 and 100 replications, Granger and Newbold found $|t| < 2$ on only 23 occasions, $2 < |t| < 4$ on 24 occasions, and $|t| > 4$ on 53 occasions. Granger and Newbold (1974) called this phenomenon spurious regression, since it finds a significant relationship between the two time-series when none exists. Hence, one should be cautious when running time-series regressions involving unit root processes. High $R^2$ and significant $t$-statistics from OLS regressions may be hiding nonsense results. Phillips (1986) studied the asymptotic properties of the least squares spurious regression model and confirmed these simulation results. In fact, Phillips showed that the $t$-statistic for $H_0$: $\beta = 0$ converges in probability to $\infty$ as $T \to \infty$. This means that the $t$-statistic will reject $H_0$: $\beta = 0$ with probability 1 as $T \to \infty$. If both $x_t$ and $y_t$ are independent trend stationary series generated as described in (14.10), then the $R^2$ of the regression of $y_t$ on $x_t$ will tend to one as $T \to \infty$, see Davidson and MacKinnon (1993, p. 671). For a summary of several extensions of these results, see Granger (2001).
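The spurious regression phenomenon is easy to reproduce by simulation. The following is a minimal sketch in the spirit of the Granger and Newbold experiment, not a replication of it: the function names, the seed, the standard normal increments, and the default of 1000 replications are all assumptions made for illustration. It regresses one simulated random walk on another, independent one and records how often the slope's $t$-statistic exceeds 2 in absolute value.

```python
import math
import random


def random_walk(n, rng):
    """Return a driftless random walk of length n with N(0, 1) increments."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def slope_t_stat(y, x):
    """OLS of y on a constant and x; return the t-statistic of the slope."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = math.sqrt(ssr / (n - 2) / sxx)
    return beta / se_beta


def rejection_rate(n_obs=50, n_reps=1000, seed=12345):
    """Share of replications with |t| > 2 for two independent random walks."""
    rng = random.Random(seed)  # hypothetical seed, fixed for reproducibility
    hits = 0
    for _ in range(n_reps):
        y = random_walk(n_obs, rng)
        x = random_walk(n_obs, rng)
        if abs(slope_t_stat(y, x)) > 2.0:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    print(f"share of |t| > 2: {rejection_rate():.3f}")
```

Under a valid $t$-test this share would be close to the nominal 5%. With independent random walks it should come out far higher, in line with the counts reported above.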
Let us continue with our consumption-income example. In Chapter 5, we regressed $C_t$ on $Y_t$ and obtained
$$C_t = \underset{(219.56)}{-1343.31} + \underset{(0.011)}{0.979}\, Y_t + \text{residuals} \tag{14.12}$$
with $R^2 = 0.994$ and D.W. = 0.18. We have shown that $C_t$ and $Y_t$ are nonstationary series and that both are I(1), see also problem 3. The regression in (14.12) could be a spurious regression, owing to the fact that we regressed one nonstationary series on another. This would invalidate the $t$ and $F$-statistics of regression (14.12). Since both $C_t$ and $Y_t$ are integrated of the same order, and Figure 14.1 shows that they are trending upwards together, these random walks may be in unison. This is the idea behind cointegration. $C_t$ and $Y_t$ are cointegrated if there exists a linear combination of $C_t$ and $Y_t$ that yields a stationary series. More formally, if $C_t$ and $Y_t$ are both I(1) but there exists a linear combination $C_t - \alpha - \beta Y_t = u_t$ which is I(0), then $C_t$ and $Y_t$ are cointegrated and $\beta$ is the cointegrating parameter. This idea can be extended to a vector of more than two time-series. This vector is cointegrated if the components of this vector have a unit root and there exists a linear combination of this vector that is stationary. Such a cointegrating relationship can be interpreted as a stable long-run relationship between the components of this time-series vector. Economic examples of long-run relationships include the quantity theory of money, purchasing power parity and the permanent income theory of consumption. The important point to emphasize here is that differencing these nonstationary time-series destroys potentially valuable information about the long-run relationship between these economic variables. The theory of cointegration tries to estimate this long-run relationship using the nonstationary series themselves, rather than their first differences. In order to explain this, we state (without proof) one of the implications of the Granger Representation Theorem, namely, that a set of cointegrated variables will have an Error-Correction Model (ECM) representation. Let us illustrate with an example.
A Cointegration Example
This is based on Engle and Granger (1987). Assume that $C_t$ and $Y_t$ for $t = 1, 2, \ldots, T$ are I(1) processes generated as follows:
$C_t - \beta Y_t = u_t$ with $u_t = \rho u_{t-1} + \epsilon_t$ and $|\rho| < 1$ (14.13)
$C_t - \alpha Y_t = v_t$ with $v_t = v_{t-1} + \eta_t$ and $\alpha \neq \beta$ (14.14)
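In other words, $u_t$ follows a stationary AR(1) process, while $v_t$ follows a random walk. Suppose that $(\epsilon_t, \eta_t)'$ are independent bivariate normal random variables with mean zero and variance $\Sigma = [\sigma_{ij}]$ for $i, j = 1, 2$. First, we obtain the reduced form representation of $Y_t$ and $C_t$ in terms of $u_t$ and $v_t$. Solving (14.13) and (14.14) for $C_t$ and $Y_t$, this is given by
$$C_t = \frac{\beta}{\beta - \alpha}\, v_t - \frac{\alpha}{\beta - \alpha}\, u_t \tag{14.15}$$
$$Y_t = \frac{1}{\beta - \alpha}\, v_t - \frac{1}{\beta - \alpha}\, u_t \tag{14.16}$$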
Since $u_t$ is I(0) and $v_t$ is I(1), we conclude from (14.15) and (14.16) that $C_t$ and $Y_t$ are in fact I(1) series. In terms of the usual order condition for identification considered in Chapter 11, the system of equations given by (14.13) and (14.14) is not identified, because there are no exclusion restrictions on either equation. However, if we take a linear combination of the two structural equations given in (14.13) and (14.14), the disturbance of the resulting linear combination is neither a stationary AR(1) process nor a random walk. Hence, both (14.13) and (14.14) are identified. Note that if $\rho = 1$, then $u_t$ is a random walk and the linear combination of $u_t$ and $v_t$ is also a random walk. In this case, neither (14.13) nor (14.14) is identified.
In the Engle-Granger terminology, $C_t - \beta Y_t$ is the cointegrating relationship and $(1, -\beta)$ is the cointegrating vector. This cointegrating relationship is unique. The proof is by contradiction. Assume there is another cointegrating relationship $C_t - \gamma Y_t$ that is I(0); then the difference between the two cointegrating relationships yields $(\gamma - \beta) Y_t$. This is also I(0). This can only happen for every value of $Y_t$, which is I(1), if and only if $\beta = \gamma$.
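Difference both equations in (14.13) and (14.14) and write both differenced equations as a system of two equations in $(\Delta C_t, \Delta Y_t)'$. One gets:
$$\begin{bmatrix} 1 & -\beta \\ 1 & -\alpha \end{bmatrix} \begin{bmatrix} \Delta C_t \\ \Delta Y_t \end{bmatrix} = \begin{bmatrix} \Delta u_t \\ \Delta v_t \end{bmatrix} = \begin{bmatrix} \epsilon_t + (\rho - 1)(C_{t-1} - \beta Y_{t-1}) \\ \eta_t \end{bmatrix} \tag{14.17}$$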
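where the second equality is obtained by replacing $\Delta v_t$ by $\eta_t$, $\Delta u_t$ by $(\rho - 1)u_{t-1} + \epsilon_t$, and substituting for $u_{t-1}$ its value $(C_{t-1} - \beta Y_{t-1})$. Premultiplying (14.17) by the inverse of the first matrix, one can show, see problem 9, that the resulting solution is the following VAR model:
$$\begin{bmatrix} \Delta C_t \\ \Delta Y_t \end{bmatrix} = \frac{1}{\beta - \alpha} \begin{bmatrix} -\alpha(\rho - 1) & \alpha\beta(\rho - 1) \\ -(\rho - 1) & \beta(\rho - 1) \end{bmatrix} \begin{bmatrix} C_{t-1} \\ Y_{t-1} \end{bmatrix} + \begin{bmatrix} h_t \\ g_t \end{bmatrix} \tag{14.18}$$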
where $h_t$ and $g_t$ are linear combinations of $\epsilon_t$ and $\eta_t$. Note that if $\rho = 1$, then the level variables $C_{t-1}$ and $Y_{t-1}$ drop from the VAR equations. Let $Z_t = C_t - \beta Y_t$ and define $\delta = (\rho - 1)/(\beta - \alpha)$. Then the VAR representation in (14.18) can be written as follows:
$\Delta C_t = -\alpha\delta Z_{t-1} + h_t$ (14.19)
$\Delta Y_t = -\delta Z_{t-1} + g_t$ (14.20)
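The step from (14.17) to the ECM can be checked by writing out the matrix inverse. The algebra below is a sketch worked out from (14.13), (14.14) and (14.17). The explicit expressions for $h_t$ and $g_t$ are derived here rather than quoted, so they should be verified against problem 9.

```latex
\begin{aligned}
\begin{bmatrix} 1 & -\beta \\ 1 & -\alpha \end{bmatrix}^{-1}
  &= \frac{1}{\beta-\alpha}
     \begin{bmatrix} -\alpha & \beta \\ -1 & 1 \end{bmatrix}, \\[4pt]
\Delta C_t
  &= \frac{-\alpha\left[\epsilon_t + (\rho-1)Z_{t-1}\right] + \beta\eta_t}{\beta-\alpha}
   = -\alpha\delta Z_{t-1} + h_t,
  \qquad h_t = \frac{\beta\eta_t - \alpha\epsilon_t}{\beta-\alpha}, \\[4pt]
\Delta Y_t
  &= \frac{-\left[\epsilon_t + (\rho-1)Z_{t-1}\right] + \eta_t}{\beta-\alpha}
   = -\delta Z_{t-1} + g_t,
  \qquad g_t = \frac{\eta_t - \epsilon_t}{\beta-\alpha}, \\[4pt]
\Delta Z_t
  &= \Delta C_t - \beta\,\Delta Y_t
   = (\beta-\alpha)\delta Z_{t-1} + (h_t - \beta g_t)
   = (\rho-1)Z_{t-1} + \epsilon_t .
\end{aligned}
```

The last line is a consistency check: it recovers the AR(1) process $Z_t = \rho Z_{t-1} + \epsilon_t$ assumed for $u_t$ in (14.13).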
This is the Error-Correction Model (ECM) representation of the original model. $Z_{t-1}$ is the error correction term. It represents a disequilibrium term showing the departure from long-run equilibrium, see section 6.4. Note that if $\rho = 1$, then $\delta = 0$ and $Z_{t-1}$ drops from both ECM equations. As Banerjee et al. (1993, p. 139) explain, this ECM representation is a noteworthy “…contribution to resolving, or synthesizing, the debate between time-series analysts and those favoring econometric methods.” The former considered only differenced time-series that can be legitimately assumed stationary, while the latter focused on equilibrium relationships expressed in levels. The former wiped out important long-run relationships by first differencing them, while the latter ignored the spurious regression problem. In contrast, the ECM allows the use of first differences and levels from the cointegrating relationship. For more details, see Banerjee et al. (1993).
A simple two-step procedure for estimating cointegrating relationships is given by Engle and Granger (1987). In the first step, the OLS estimator of $\beta$ is obtained by regressing $C_t$ on $Y_t$. This can be shown to be superconsistent, i.e., $\hat{\beta}_{OLS}$ converges to $\beta$ at rate $T$ rather than the usual rate $\sqrt{T}$. Using $\hat{\beta}_{OLS}$ one obtains $\hat{Z}_t = C_t - \hat{\beta}_{OLS} Y_t$. In the second step, using $\hat{Z}_{t-1}$ rather than $Z_{t-1}$, apply OLS to estimate the ECM in (14.19) and (14.20). Extensive Monte Carlo experiments have been conducted by Banerjee et al. (1993) to investigate the bias of $\hat{\beta}$ in small samples. This is pursued further in problem 9. An alternative estimation procedure is the maximum likelihood approach suggested by Johansen (1988). This is beyond the scope of this book. See Dolado et al. (2001) for a lucid summary of the cointegration literature.
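The two-step procedure can be tried out on data simulated from (14.13) and (14.14). The following is an illustrative sketch, not code from Engle and Granger: the function names, the parameter values $\alpha = 0.5$, $\beta = 0.9$, $\rho = 0.6$, the seed, and the choice of independent standard normal $\epsilon_t$ and $\eta_t$ (one special case of $\Sigma$) are assumptions. Because the model has no intercepts, both steps use OLS without a constant.

```python
import random


def simulate(n, alpha, beta, rho, rng):
    """Generate (C_t, Y_t) from (14.13)-(14.14) by solving for the levels."""
    u = v = 0.0
    c, y = [], []
    for _ in range(n):
        u = rho * u + rng.gauss(0.0, 1.0)  # stationary AR(1) disturbance u_t
        v = v + rng.gauss(0.0, 1.0)        # random-walk disturbance v_t
        y_t = (v - u) / (beta - alpha)     # Y_t solved from the two equations
        c.append(v + alpha * y_t)          # C_t = v_t + alpha * Y_t
        y.append(y_t)
    return c, y


def ols_no_intercept(dep, reg):
    """Slope from regressing dep on reg without a constant."""
    return sum(d * r for d, r in zip(dep, reg)) / sum(r * r for r in reg)


def engle_granger_two_step(c, y):
    """Step 1: beta_hat from the levels. Step 2: ECM loadings on lagged Z_hat."""
    beta_hat = ols_no_intercept(c, y)
    z_hat = [ct - beta_hat * yt for ct, yt in zip(c, y)]
    z_lag = z_hat[:-1]
    d_c = [c[t] - c[t - 1] for t in range(1, len(c))]
    d_y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return beta_hat, ols_no_intercept(d_c, z_lag), ols_no_intercept(d_y, z_lag)


if __name__ == "__main__":
    alpha, beta, rho = 0.5, 0.9, 0.6       # hypothetical parameter values
    delta = (rho - 1.0) / (beta - alpha)
    c, y = simulate(5000, alpha, beta, rho, random.Random(0))
    beta_hat, load_c, load_y = engle_granger_two_step(c, y)
    print(f"beta_hat = {beta_hat:.4f} (true {beta})")
    print(f"loading in (14.19) = {load_c:.3f} (true {-alpha * delta:.3f})")
    print(f"loading in (14.20) = {load_y:.3f} (true {-delta:.3f})")
```

With these parameter values $\delta = -1$, so the true loadings on $Z_{t-1}$ are $0.5$ in (14.19) and $1$ in (14.20). The first-step estimate of $\beta$ should sit much closer to its true value than the second-step loadings do to theirs, which is what superconsistency suggests.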
A formal test for cointegration is given by Engle and Granger (1987), who suggest running regression (14.12) and testing that the residuals do not have a unit root. In other words, run a Dickey-Fuller test or its augmented version on the resulting residuals from (14.12). In fact, if $C_t$ and $Y_t$ are not cointegrated, then any linear combination of them would be nonstationary, including the residuals of (14.12). Since these tests are based on residuals, their asymptotic distributions are not the same as those of the corresponding ordinary unit root tests. Asymptotic critical values for these tests can be found in Davidson and MacKinnon (1993, p. 722). For our consumption regression the following Dickey-Fuller test is obtained on the residuals:
$$\Delta \hat{u}_t = \underset{(0.04)}{-1.111} - \underset{(1.50)}{0.094}\, \hat{u}_{t-1} + \text{residuals}$$
The Davidson and MacKinnon (1993) asymptotic 5% critical value for this $t$-statistic is $-2.92$ and its p-value is 0.52. Therefore, we cannot reject the hypothesis that $\hat{u}_t$ is nonstationary. We have also included a trend and one lag of the first-differenced residuals. The resulting augmented Dickey-Fuller test did not reject the existence of a unit root. Therefore, we find no evidence that $C_t$ and $Y_t$ are cointegrated. This suggests that the relationship estimated in (14.12) is spurious.
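The mechanics of this residual-based test can be sketched on simulated data. The code below is an illustration under stated assumptions, not the computation behind the estimates above: the helper names, the data-generating process ($y_t = 1 + 2x_t + e_t$ with $x_t$ a random walk), the value $\rho = 0.5$ and the seed are all hypothetical. It runs the levels regression, then a Dickey-Fuller regression with a constant on the residuals, and returns the $t$-ratio on the lagged residual. The statistic must be compared with the residual-based critical values cited above, which the sketch does not compute.

```python
import math
import random


def ols(y, x):
    """OLS of y on a constant and x: (intercept, slope, slope s.e., residuals)."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)
    return intercept, slope, math.sqrt(s2 / sxx), resid


def residual_df_t(y, x):
    """DF t-ratio from regressing the differenced residual on its lagged level."""
    resid = ols(y, x)[3]
    diff = [resid[t] - resid[t - 1] for t in range(1, len(resid))]
    _, gamma, se_gamma, _ = ols(diff, resid[:-1])
    return gamma / se_gamma


def simulate_pair(n, cointegrated, rng, rho=0.5):
    """x is a random walk; y = 1 + 2x + e, e is AR(1) if cointegrated, else I(1)."""
    x_t = e_t = 0.0
    x, y = [], []
    for _ in range(n):
        x_t += rng.gauss(0.0, 1.0)
        e_t = (rho if cointegrated else 1.0) * e_t + rng.gauss(0.0, 1.0)
        x.append(x_t)
        y.append(1.0 + 2.0 * x_t + e_t)
    return y, x


if __name__ == "__main__":
    rng = random.Random(7)  # hypothetical seed
    for label, flag in (("cointegrated", True), ("not cointegrated", False)):
        y, x = simulate_pair(1000, flag, rng)
        print(f"{label}: DF t-ratio on residuals = {residual_df_t(y, x):.2f}")
```

In the cointegrated case the $t$-ratio should be strongly negative, while in the non-cointegrated case it should stay much closer to zero, mirroring the failure to reject found for the consumption regression.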
Table 14.2 Johansen Cointegration Test
Sample (adjusted): 1961 2007
Included observations: 47 after adjustments
Trend assumption: Linear deterministic trend
Series: CONSUMP Y
Lags interval (in first differences): 1 to 1
Unrestricted Cointegration Rank Test (Trace)
Trace test indicates no cointegration at the 0.05 level
* denotes rejection of the hypothesis at the 0.05 level
** MacKinnon-Haug-Michelis (1999) p-values
Unrestricted Cointegration Rank Test (Maximum Eigenvalue)
Max-eigenvalue test indicates no cointegration at the 0.05 level
* denotes rejection of the hypothesis at the 0.05 level
** MacKinnon-Haug-Michelis (1999) p-values
Regressing one I(1) series on another leads to spurious results unless they are cointegrated. Of course, other I(1) series may have been erroneously excluded from (14.12) which, when included, may result in a cointegrating relationship among the resulting variables. In other words, $C_t$ and $Y_t$ may not be cointegrated because of an omitted variables problem. Table 14.2 gives the Johansen (1995) cointegration test reported by EViews, which is beyond the scope of this book. The null hypothesis is that of no cointegration or at most one cointegrating relationship. Neither hypothesis is rejected by the trace and maximum eigenvalue tests.