# Multivariate data

In some data sets more than one count is observed. For example, data on the utilization of several different types of health service, such as doctor visits and hospital days, may be available. Joint modeling will improve efficiency and provide richer models of the data if counts are correlated.

Most parametric studies have used the bivariate Poisson. This model, however, is too restrictive as it implies variance-mean equality for the counts and restricts the correlation to be positive. Development of better parametric models is a current area of research.
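Both restrictions can be illustrated by simulation. The sketch below assumes the common-shock (trivariate reduction) construction of the bivariate Poisson, which the text does not name explicitly; the function name and parameter values are hypothetical. Each count is the sum of its own Poisson component and a shared Poisson component, so each margin is Poisson (variance equal to mean) and the covariance equals the mean of the shared component, which cannot be negative.

```python
import numpy as np


def simulate_bivariate_poisson(mu1, mu2, cov, size, seed=0):
    """Draw (y1, y2) via y1 = u1 + u3, y2 = u2 + u3 with independent Poisson u's.

    mu1, mu2 are the marginal means; cov is the mean of the shared component u3,
    which equals Cov(y1, y2) and therefore cannot be negative.
    """
    if not 0.0 <= cov <= min(mu1, mu2):
        raise ValueError("cov must lie in [0, min(mu1, mu2)]")
    rng = np.random.default_rng(seed)
    u1 = rng.poisson(mu1 - cov, size)
    u2 = rng.poisson(mu2 - cov, size)
    u3 = rng.poisson(cov, size)  # common shock shared by both counts
    return u1 + u3, u2 + u3


y1, y2 = simulate_bivariate_poisson(mu1=2.0, mu2=3.0, cov=1.0, size=200_000)
print("mean, variance of y1:", y1.mean(), y1.var())
print("mean, variance of y2:", y2.mean(), y2.var())
print("sample correlation:", np.corrcoef(y1, y2)[0, 1])
```

Under these assumed values the population correlation is cov / sqrt(mu1 * mu2) = 1/sqrt(6), roughly 0.41, and the sample means and variances of each count should nearly coincide.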

# 3.3 Panel data

One of the major and earliest applications of count data methods in econometrics is to panel data on the number of patents awarded to firms over time (Hausman et al., 1984). The starting point is the Poisson regression model with exponential conditional mean and multiplicative individual-specific term

$$
y_{it} \sim \mathrm{P}[\alpha_i \exp(x_{it}'\beta)], \quad i = 1, \ldots, n, \quad t = 1, \ldots, T, \tag{15.18}
$$

where we consider a short panel with $T$ small and $n \to \infty$. As in the linear case, both fixed effects and random effects models are possible.

The fixed effects model lets $\alpha_i$ be an unknown parameter. This parameter can be eliminated by quasi-differencing and modeling the transformed random variable $y_{it} - (\lambda_{it}/\bar{\lambda}_i)\bar{y}_i$, where $\lambda_{it} = \exp(x_{it}'\beta)$, and $\bar{\lambda}_i$ and $\bar{y}_i$ denote the individual-specific means of $\lambda_{it}$ and $y_{it}$. By construction this has zero mean, conditional on $x_{i1}, \ldots, x_{iT}$. A moments-based estimator of $\beta$ then solves the sample moment condition $\sum_{i=1}^{n} \sum_{t=1}^{T} x_{it}\,(y_{it} - (\lambda_{it}/\bar{\lambda}_i)\bar{y}_i) = 0$.
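The moment condition can be solved numerically. The following is a minimal sketch, not a reference implementation: it writes lam_it = exp(x_it' beta), uses the identity (lam_it / lam_bar_i) * y_bar_i = (lam_it / sum_s lam_is) * sum_t y_it, and applies Newton iterations with an analytic Jacobian. The function names, the simulated design, and the choice of Newton's method are all assumptions made for illustration.

```python
import numpy as np


def _shares(beta, X):
    """lam_it / sum_s lam_is for lam_it = exp(x_it' beta); X has shape (n, T, k)."""
    eta = X @ beta
    eta = eta - eta.max(axis=1, keepdims=True)  # row shift cancels in the ratio
    lam = np.exp(eta)
    return lam / lam.sum(axis=1, keepdims=True)


def fe_poisson_moments(beta, y, X):
    """Sample moment sum_i sum_t x_it * (y_it - (lam_it / lam_bar_i) * y_bar_i)."""
    share = _shares(beta, X)
    resid = y - share * y.sum(axis=1, keepdims=True)
    return np.einsum("itk,it->k", X, resid)


def fe_poisson_jacobian(beta, y, X):
    """Derivative of the sample moment with respect to beta (a k x k matrix)."""
    share = _shares(beta, X)
    xbar = np.einsum("it,itk->ik", share, X)            # share-weighted mean of x_it
    second = np.einsum("it,itk,itl->ikl", share, X, X)  # share-weighted x x'
    outer = np.einsum("ik,il->ikl", xbar, xbar)
    return -np.einsum("i,ikl->kl", y.sum(axis=1), second - outer)


def fit_fe_poisson(y, X, tol=1e-10, max_iter=50):
    """Newton iterations on the moment condition, started from beta = 0."""
    beta = np.zeros(X.shape[2])
    for _ in range(max_iter):
        step = np.linalg.solve(fe_poisson_jacobian(beta, y, X),
                               fe_poisson_moments(beta, y, X))
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            return beta
    raise RuntimeError("Newton iterations did not converge")


# Simulated short panel (hypothetical design): alpha_i is correlated with x_it.
rng = np.random.default_rng(0)
n, T = 1000, 5
beta_true = np.array([0.5, -0.3])
X = rng.normal(size=(n, T, 2))
alpha = np.exp(0.5 * X[:, :, 0].mean(axis=1) + 0.5 * rng.normal(size=n))
y = rng.poisson(alpha[:, None] * np.exp(X @ beta_true)).astype(float)

beta_hat = fit_fe_poisson(y, X)
print("beta_hat:", beta_hat)
print("moments at beta_hat:", fe_poisson_moments(beta_hat, y, X))
```

Note that alpha never enters the estimation functions: only within-individual shares of lam_it are used, which is what quasi-differencing achieves.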

An alternative to the quasi-differencing approach is the conditional likelihood approach that was followed by Hausman et al. (1984). In this approach the fixed effects are eliminated by conditioning the distribution of counts on $\sum_{t=1}^{T} y_{it}$.
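To see why conditioning removes the fixed effect, write $\lambda_{it} = \exp(x_{it}'\beta)$ and suppose, as an assumption made explicit here, that the counts $y_{i1}, \ldots, y_{iT}$ are independent Poisson given $\alpha_i$ and the regressors. Their sum is then Poisson with mean $\alpha_i \sum_{s} \lambda_{is}$, and the distribution of the counts given the sum is multinomial:

$$
\Pr\Big[y_{i1}, \ldots, y_{iT} \,\Big|\, \sum_{t=1}^{T} y_{it}\Big]
= \frac{\big(\sum_{t} y_{it}\big)!}{\prod_{t} y_{it}!} \prod_{t=1}^{T} \left( \frac{\alpha_i \lambda_{it}}{\sum_{s} \alpha_i \lambda_{is}} \right)^{y_{it}}
= \frac{\big(\sum_{t} y_{it}\big)!}{\prod_{t} y_{it}!} \prod_{t=1}^{T} \left( \frac{\lambda_{it}}{\sum_{s} \lambda_{is}} \right)^{y_{it}} .
$$

The factor $\alpha_i$ cancels from every cell probability, so the conditional likelihood depends on $\beta$ alone.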

The random effects model lets $\alpha_i$ be a random variable with specified distribution that depends on parameters, say $\delta$. The random effects are integrated out, in a similar way to the unobserved heterogeneity in Section 3.1, and the parameters $\beta$ and $\delta$ are estimated by maximum likelihood. In some cases, notably when $\alpha_i$ is gamma distributed, a closed-form solution is obtained upon integrating out $\alpha_i$. In other cases, such as normally distributed random effects, a closed-form solution is not obtained, but ML estimation based on numerical integration is feasible.

Dynamic panel data models permit the regressors $x$ to include lagged values of $y$. Several studies use the fixed effects variant of (15.18), where $x_{it}$ now includes, for example, $y_{i,t-1}$. This is an autoregressive count model (see Section 5.1) adapted to panel data. The quasi-differencing procedure for the nondynamic fixed effects case can be adapted to the dynamic case.