Time Series, Multivariate and Panel Data

In this section we very briefly present extensions from cross-section data to other types of count data (see Cameron and Trivedi, 1998, for further detail). For time series and multivariate count data, many models have been proposed, but preferred methods have not yet been established. For panel data there is more agreement in the econometrics literature on which methods to use, though a wider range of models is considered in the statistics literature.

3.2 Time series data

If a time series of count data is generated by a Poisson point process, then event occurrences in successive time intervals are independent. Independence is a reasonable assumption when the underlying stochastic process for events, conditional on covariates, has no memory. Then there is no need for special time series models. For example, the number of deaths (or births) in a region may be uncorrelated over time. At the same time the population, which cumulates births and deaths, will be very highly correlated over time.

The first step for time series count data is therefore to test for serial correlation. A simple test first estimates a count regression such as Poisson, obtains the residual, usually $y_t - \exp(x_t'\beta)$, where $x_t$ may include time trends, and tests for zero correlation between current and lagged residuals, allowing for the complication that the residuals will certainly be heteroskedastic.
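As a rough illustration of such a test, the sketch below computes a heteroskedasticity-robust statistic for zero correlation between the residual and its lag. It assumes the fitted means $\exp(x_t'\hat\beta)$ have already been obtained from a Poisson regression and are passed in as `mu`; the function name `robust_autocorr_stat` is hypothetical, and the statistic ignores the sampling error in the estimated coefficients.

```python
import math

def robust_autocorr_stat(y, mu, lag=1):
    """Robust statistic for zero correlation between u_t and u_{t-lag}."""
    # Raw residuals u_t = y_t - mu_t, with mu_t the fitted conditional mean.
    u = [yt - mt for yt, mt in zip(y, mu)]
    # Products of current and lagged residuals.
    prods = [u[t] * u[t - lag] for t in range(lag, len(u))]
    numerator = sum(prods)
    # Scaling by the squared products avoids assuming a constant
    # residual variance (the residuals are heteroskedastic).
    denominator = math.sqrt(sum(p * p for p in prods))
    return numerator / denominator
```

Under the null of no serial correlation, and under standard regularity conditions, this statistic is approximately standard normal in large samples, so values far from zero point to serial correlation at the chosen lag.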

Upon establishing that the data are indeed serially correlated, there are several models to choose from. An aesthetically appealing model is the INAR(1) model (integer autoregressive model of order one), together with its generalization to the negative binomial and to higher orders of serial correlation. This model specifies $y_t = \rho_t \circ y_{t-1} + \varepsilon_t$, where $\rho_t$ is a correlation parameter with $0 < \rho_t < 1$, for example $\rho_t = 1/[1 + \exp(-z_t'\gamma)]$. The symbol $\circ$ denotes the binomial thinning operator, whereby $\rho_t \circ y_{t-1}$ is the realized value of a binomial random variable with probability of success $\rho_t$ in each of $y_{t-1}$ trials. One may think of each event as having a replication or survival probability of $\rho_t$ in the following period. As in a linear first-order Markov model, this probability decays geometrically. A Poisson INAR(1) model, with a Poisson marginal distribution for $y_t$, arises when $\varepsilon_t$ is Poisson distributed with mean, say, $\exp(x_t'\beta)$. A negative binomial INAR(1) model arises if $\varepsilon_t$ is negative binomial distributed.
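The thinning mechanism is easiest to see in a simulation. The following is a minimal sketch that assumes a constant thinning probability `rho` and a constant Poisson innovation mean, rather than the covariate-dependent versions in the text; the function names `thin`, `draw_poisson` and `simulate_inar1` are illustrative only.

```python
import math
import random

def thin(count, rho, rng):
    """Binomial thinning: successes in `count` trials, each with probability rho."""
    return sum(1 for _ in range(count) if rng.random() < rho)

def draw_poisson(mean, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_inar1(n, rho, innovation_mean, y0=0, seed=0):
    """Simulate y_t = rho o y_{t-1} + e_t with Poisson innovations e_t."""
    rng = random.Random(seed)
    y, path = y0, []
    for _ in range(n):
        # Survivors from last period plus newly arriving events.
        y = thin(y, rho, rng) + draw_poisson(innovation_mean, rng)
        path.append(y)
    return path
```

With constant parameters the long-run mean of the simulated series is `innovation_mean / (1 - rho)`, so `simulate_inar1(5000, 0.5, 2.0)` should average close to 4.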

An autoregressive model, or Markov model, is a simple adjustment to the earlier cross section count models that directly enters lagged values of $y$ into the formula for the conditional mean of current $y$. For example, we might suppose $y_t$ conditional on current and past $x_t$ and past $y_t$ is Poisson distributed with mean $\exp(x_t'\beta + \rho \ln y^*_{t-1})$, where $y^*_{t-1}$ is an adjustment to ensure a nonzero lagged value, such as $y^*_{t-1} = y_{t-1} + 0.5$ or $y^*_{t-1} = \max(0.5, y_{t-1})$.
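The conditional mean in this example can be written out directly. The sketch below takes the linear index $x_t'\beta$ as a precomputed number `xb`; the function name `ar_conditional_mean` and the `adjust` switch between the two zero adjustments are assumptions made for illustration.

```python
import math

def ar_conditional_mean(xb, y_lag, rho, adjust="max"):
    """Mean exp(xb + rho * ln(y*_{t-1})) with a zero-adjusted lagged count."""
    if adjust == "max":
        y_star = max(0.5, y_lag)   # y* = max(0.5, y_{t-1})
    else:
        y_star = y_lag + 0.5       # y* = y_{t-1} + 0.5
    return math.exp(xb + rho * math.log(y_star))
```

For instance, with `xb = math.log(2)`, `y_lag = 4` and `rho = 0.5` the mean is $2 \times 4^{0.5} = 4$, and a zero lagged count no longer forces the logarithm to fail.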

Serially correlated error models induce time series correlation by introducing unobserved heterogeneity (see Section 3.1) and allowing this to be serially correlated. For example, $y_t$ is Poisson distributed with mean $\exp(x_t'\beta)\nu_t$, where $\nu_t$ is a serially correlated random variable (Zeger, 1988).
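One way to picture this model is to simulate it. The sketch below assumes, purely for illustration, that $\ln \nu_t$ follows a Gaussian AR(1) process with coefficient `phi` and innovation standard deviation `sigma`; this choice, and the function names, are not taken from the text, and $\nu_t$ is not normalized to have mean one.

```python
import math
import random

def draw_poisson(mean, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_correlated_error_counts(xb, phi, sigma, seed=0):
    """Counts with mean exp(xb_t) * v_t, where ln v_t is a Gaussian AR(1)."""
    rng = random.Random(seed)
    log_v, counts = 0.0, []
    for xb_t in xb:
        # Assumed latent process: ln v_t = phi * ln v_{t-1} + noise.
        log_v = phi * log_v + rng.gauss(0.0, sigma)
        counts.append(draw_poisson(math.exp(xb_t + log_v), rng))
    return counts
```

Setting `sigma = 0` switches the heterogeneity off and returns independent Poisson counts, which gives a useful baseline for comparison.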

State space models or time-varying parameter models allow the conditional mean to be a random variable drawn from a distribution whose parameters evolve over time. For example, $y_t$ is Poisson distributed with mean $\mu_t$, where $\mu_t$ is a draw from a gamma distribution (Harvey and Fernandes, 1989).
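To make the idea of evolving gamma parameters concrete, here is a minimal sketch of a discounted gamma-Poisson filter. The specific recursion (shrinking both gamma parameters by a discount factor `omega` between periods, then applying the conjugate update) is an assumption chosen for illustration and is not spelled out in the text; the function name is hypothetical.

```python
def gamma_poisson_filter(y, omega, a0=0.0, b0=0.0):
    """One-step-ahead forecast means from a discounted gamma-Poisson filter."""
    a, b = a0, b0              # gamma shape and rate for the Poisson mean
    forecasts = []
    for yt in y:
        # Prediction step: discounting keeps the mean a/b but raises the variance.
        a_pred, b_pred = omega * a, omega * b
        forecasts.append(a_pred / b_pred if b_pred > 0 else None)
        # Update step: gamma prior with a Poisson observation stays gamma.
        a, b = a_pred + yt, b_pred + 1.0
    return forecasts
```

The forecast is `None` while the prior is still diffuse; after that it is an exponentially weighted average of past counts, with smaller `omega` discounting older observations more heavily.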

Hidden Markov models specify different parametric models in different regimes, and induce serial correlation by specifying the stochastic process determining which regime currently applies to be an unobserved Markov process (MacDonald and Zucchini, 1997).
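For a hidden Markov model the likelihood has to sum over the unobserved regime path, which the forward recursion does efficiently. The sketch below assumes Poisson counts with a fixed, covariate-free mean in each regime; the function names and argument layout are illustrative assumptions.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of observing count k."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def hmm_loglik(y, means, trans, init):
    """Log-likelihood of counts y under a Poisson hidden Markov model.

    means[j]    : Poisson mean in regime j
    trans[i][j] : probability of moving from regime i to regime j
    init[j]     : probability of regime j in the first period
    """
    n = len(means)
    probs = list(init)          # regime probabilities before seeing y_t
    loglik = 0.0
    for t, yt in enumerate(y):
        if t > 0:
            # Propagate last period's filtered probabilities through the chain.
            probs = [sum(probs[i] * trans[i][j] for i in range(n)) for j in range(n)]
        joint = [probs[j] * poisson_pmf(yt, means[j]) for j in range(n)]
        scale = sum(joint)      # predictive probability of y_t given the past
        loglik += math.log(scale)
        probs = [v / scale for v in joint]   # filtered regime probabilities
    return loglik
```

When all regimes share the same mean the result collapses to the ordinary Poisson log-likelihood, which is a convenient sanity check before maximizing over the means and transition probabilities.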
