# Poisson Regression

The Poisson is the starting point for count data analysis, though it is often inadequate. In Sections 2.1-2.3 we present the Poisson regression model and estimation by maximum likelihood, interpretation of the estimated coefficients, and extensions to truncated and censored data. Limitations of the Poisson model, notably overdispersion, are presented in Section 2.4.

1.1 Poisson MLE

The natural stochastic model for counts is a Poisson point process for the occurrence of the event of interest. This implies a Poisson distribution for the number of occurrences of the event, with density

$$
\Pr[Y = y] = \frac{e^{-\mu}\mu^{y}}{y!}, \qquad y = 0, 1, 2, \ldots, \tag{15.1}
$$

where $\mu$ is the intensity or rate parameter. We refer to the distribution as $\mathrm{P}[\mu]$. The first two moments are

$$
\begin{aligned}
\mathrm{E}[Y] &= \mu, \\
\mathrm{V}[Y] &= \mu.
\end{aligned} \tag{15.2}
$$

This is the well-known equidispersion property of the Poisson distribution: the mean and the variance are equal.
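The density (15.1) and the moments (15.2) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the rate value 3.5, and the truncation point `y_max` are illustrative assumptions.

```python
import math

def poisson_pmf(y, mu):
    """Pr[Y = y] = exp(-mu) * mu**y / y!, evaluated in logs for stability."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def poisson_moments(mu, y_max=200):
    """Approximate E[Y] and V[Y] by summing the pmf over y = 0..y_max."""
    probs = [poisson_pmf(y, mu) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# Hypothetical rate parameter chosen only for illustration.
mean, var = poisson_moments(3.5)
print(mean, var)  # both should be numerically close to 3.5
```

The truncated sum ignores the probability mass above `y_max`, which is negligible for small rates but would need a larger cutoff when the rate is large.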

By introducing the observation subscript $i$, attached to both $y$ and $\mu$, the framework is extended to non-iid data. The Poisson regression model is derived from the Poisson distribution by parameterizing the relation between the mean parameter $\mu$ and covariates (regressors) $\mathbf{x}$. The standard assumption is to use the exponential mean parameterization,

$$
\mu_i = \exp(\mathbf{x}_i'\boldsymbol{\beta}), \qquad i = 1, \ldots, n, \tag{15.3}
$$

where by assumption there are $k$ linearly independent covariates, usually including a constant. Because $\mathrm{V}[y_i \mid \mathbf{x}_i] = \exp(\mathbf{x}_i'\boldsymbol{\beta})$, by (15.2) and (15.3), the Poisson regression is intrinsically heteroskedastic.
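To make the heteroskedasticity concrete, the sketch below evaluates the exponential mean (15.3) at a few regressor values. The coefficient values and the helper name `poisson_mean` are hypothetical and chosen only for illustration.

```python
import math

def poisson_mean(x, beta):
    """Conditional mean mu_i = exp(x_i' beta) under parameterization (15.3)."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

# Hypothetical coefficients: a constant and one slope.
beta = [0.5, 0.8]
for x1 in (0.0, 1.0, 2.0):
    mu = poisson_mean([1.0, x1], beta)
    # Under the Poisson model the conditional variance equals the mean,
    # so the variance changes with x1 whenever the slope is nonzero.
    print(f"x1={x1:.1f}  mean={mu:.3f}  variance={mu:.3f}")
```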

Given (15.1) and (15.3) and the assumption that the observations $(y_i \mid \mathbf{x}_i)$ are independent, the most natural estimator is maximum likelihood (ML). The log-likelihood function is

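$$
\ln L(\boldsymbol{\beta}) = \sum_{i=1}^{n} \left\{ y_i \mathbf{x}_i'\boldsymbol{\beta} - \exp(\mathbf{x}_i'\boldsymbol{\beta}) - \ln y_i! \right\}. \tag{15.4}
$$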
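The log-likelihood (15.4) translates directly into code. The following sketch uses a small made-up data set; the function name `poisson_loglik` and the data values are assumptions for illustration only.

```python
import math

def poisson_loglik(X, y, beta):
    """ln L(beta) = sum_i { y_i x_i'beta - exp(x_i'beta) - ln y_i! }."""
    total = 0.0
    for xi, yi in zip(X, y):
        eta = sum(a * b for a, b in zip(xi, beta))  # linear index x_i'beta
        total += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
    return total

# Hypothetical toy data: a constant plus one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1, 2, 4, 9]

ll_a = poisson_loglik(X, y, [0.0, 0.0])
ll_b = poisson_loglik(X, y, [0.0, 0.7])
ll_mid = poisson_loglik(X, y, [0.0, 0.35])
print(ll_a, ll_b, ll_mid)
```

Because the function is concave in the coefficients, the value at the midpoint of two coefficient vectors is never below the average of the values at the endpoints.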

The Poisson MLE (maximum likelihood estimator), denoted $\hat{\boldsymbol{\beta}}_P$, is the solution to $k$ nonlinear equations corresponding to the first-order condition for maximum likelihood,

$$
\sum_{i=1}^{n} \left( y_i - \exp(\mathbf{x}_i'\boldsymbol{\beta}) \right) \mathbf{x}_i = \mathbf{0}. \tag{15.5}
$$

If $\mathbf{x}_i$ includes a constant term then the residuals $y_i - \exp(\mathbf{x}_i'\hat{\boldsymbol{\beta}}_P)$ sum to zero by (15.5). The log-likelihood function is globally concave; hence solving these equations by the Gauss-Newton or Newton-Raphson iterative algorithm yields unique parameter estimates.
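A Newton-Raphson iteration for the first-order conditions (15.5) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the text's own code: the data are simulated, the true coefficients, sample size, seed, and all function names are hypothetical, and only the standard library is used so the linear solve is written out by hand.

```python
import math
import random

def draw_poisson(mu, rng):
    """Draw one Poisson(mu) variate (Knuth's method; adequate for small mu)."""
    limit, count, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * k
    for r in reversed(range(k)):
        z[r] = (M[r][k] - sum(M[r][j] * z[j] for j in range(r + 1, k))) / M[r][r]
    return z

def score_and_information(X, y, beta):
    """Return the score (left side of 15.5) and sum_i mu_i x_i x_i'."""
    k = len(beta)
    g = [0.0] * k
    H = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        mu = math.exp(sum(a * b for a, b in zip(xi, beta)))
        for j in range(k):
            g[j] += (yi - mu) * xi[j]
            for l in range(k):
                H[j][l] += mu * xi[j] * xi[l]
    return g, H

def poisson_mle(X, y, tol=1e-10, max_iter=50):
    """Newton-Raphson update: beta <- beta + (sum mu x x')^{-1} * score."""
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        g, H = score_and_information(X, y, beta)
        step = solve(H, g)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Simulated data (hypothetical design: constant plus one uniform regressor).
rng = random.Random(12345)
beta_true = [0.5, 0.8]
X = [[1.0, rng.random()] for _ in range(2000)]
y = [draw_poisson(math.exp(beta_true[0] + beta_true[1] * xi[1]), rng) for xi in X]

beta_hat = poisson_mle(X, y)
score, _ = score_and_information(X, y, beta_hat)
print("estimate:", beta_hat)
print("score at estimate:", score)  # numerically close to zero
```

With the exponential mean, the negative Hessian of the log-likelihood is the sum of $\mu_i \mathbf{x}_i \mathbf{x}_i'$, so each step only requires one small linear solve. The sketch takes full Newton steps from a zero starting vector; a production routine would typically add step-halving or a better starting value.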

By standard maximum likelihood theory of correctly specified models, the estimator is consistent for $\boldsymbol{\beta}$ and asymptotically normal with the sample covariance matrix

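$$
\mathrm{V}[\hat{\boldsymbol{\beta}}_P] = \left( \sum_{i=1}^{n} \mu_i \mathbf{x}_i \mathbf{x}_i' \right)^{-1}
$$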

in the case where $\mu_i$ is of the exponential form (15.3). In practice an alternative more general form for the variance matrix should be used; see Section 4.1.
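For a model with a constant and a single regressor ($k = 2$), the ML covariance matrix under the exponential mean, the inverse of $\sum_i \mu_i \mathbf{x}_i \mathbf{x}_i'$, can be written out explicitly. The sketch below assumes that special case; the regressor values and the coefficient estimates are hypothetical placeholders, not fitted results.

```python
import math

def poisson_mle_cov_2x2(x, beta):
    """Inverse of sum_i mu_i x_i x_i' for x_i = (1, x_i1), i.e. k = 2."""
    a = b = d = 0.0
    for x1 in x:
        mu = math.exp(beta[0] + beta[1] * x1)  # exponential mean (15.3)
        a += mu            # (1,1) element of the information matrix
        b += mu * x1       # off-diagonal element
        d += mu * x1 * x1  # (2,2) element
    det = a * d - b * b
    return [[d / det, -b / det], [-b / det, a / det]]

x = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical regressor values
beta_hat = [0.5, 0.8]          # hypothetical estimates, not fitted here
V = poisson_mle_cov_2x2(x, beta_hat)
se = [math.sqrt(V[0][0]), math.sqrt(V[1][1])]
print("covariance matrix:", V)
print("standard errors:", se)
```

These standard errors rely on the Poisson variance assumption being correct, which is why the text recommends a more general variance matrix in practice.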