# Aggregate Prediction

We shall consider the problem of predicting the aggregate proportion $r = n^{-1} \sum_{t=1}^{n} y_t$ in the QR model (9.2.1). This is often an important practical problem for a policymaker. For example, in the transport choice model of Example 9.2.2, a policymaker would like to know the proportion of people in a community who use the transit when a new fare and the other values of the independent variables $x$ prevail. It is assumed that $\beta$ (suppressing the subscript 0) has been estimated from the past sample. Moreover, to simplify the analysis, we shall assume for the time being that the estimated value of $\beta$ is equal to the true value.

The prediction of $r$ should be done on the basis of the conditional distribution of $r$ given $\{x_t\}$. When $n$ is large, the following asymptotic distribution is accurate:

$$r \sim N\left[ n^{-1} \sum_{t=1}^{n} F_t,\; n^{-2} \sum_{t=1}^{n} F_t (1 - F_t) \right], \tag{9.2.62}$$

where $F_t = F(x_t'\beta)$.

Once we calculate the asymptotic mean and variance, we have all the necessary information for predicting $r$.
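As an illustration of how the mean and variance in (9.2.62) would be computed when every $x_t$ is observed, the following is a minimal sketch using only the Python standard library. The logit choice of $F$, the function name `aggregate_prediction`, and the four data points are assumptions made for the example and are not from the text; with so few observations the normal approximation is of course not accurate, and the interval is shown only to indicate how the two moments would be used.

```python
import math
from statistics import NormalDist


def logistic(z):
    """Logistic distribution function, used here as an assumed choice of F."""
    return 1.0 / (1.0 + math.exp(-z))


def aggregate_prediction(xs, beta, F=logistic):
    """Return the mean and variance of r in (9.2.62), given every x_t and beta."""
    n = len(xs)
    # F_t = F(x_t' beta) for each observation
    probs = [F(sum(xk * bk for xk, bk in zip(x, beta))) for x in xs]
    mean = sum(probs) / n                              # n^{-1} sum F_t
    var = sum(p * (1.0 - p) for p in probs) / n ** 2   # n^{-2} sum F_t (1 - F_t)
    return mean, var


# Hypothetical data: an intercept and one regressor (for example, a fare).
xs = [(1.0, 0.5), (1.0, 1.0), (1.0, 1.5), (1.0, 2.0)]
beta = (1.0, -1.0)

mean, var = aggregate_prediction(xs, beta)

# Approximate 95 percent prediction interval based on the normal distribution.
z = NormalDist().inv_cdf(0.975)
interval = (mean - z * math.sqrt(var), mean + z * math.sqrt(var))
```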

If we actually observe every $x_t$, $t = 1, 2, \ldots, n$, we can calculate the mean and variance for (9.2.62) straightforwardly. However, because it is more realistic to assume that we cannot observe every $x_t$, we shall consider the problem of how to estimate the mean and variance for (9.2.62) in that situation. For that purpose we assume that $\{x_t\}$ are i.i.d. random variables with the common $K$-variate distribution function $G$. Then the asymptotic mean and variance of $r$ can be estimated by $EF(x'\beta)$ and $n^{-1}EF(1 - F)$, respectively, where $E$ is the expectation taken with respect to $G$.

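Westin (1974) studied the evaluation of $EF$ when $F = \Lambda$ (logistic distribution) and $x \sim N(\mu_x, \Sigma_x)$. He noted that the density of $p = \Lambda(x'\beta)$ is given by

$$f(p) = \frac{1}{\sqrt{2\pi}\,\sigma} \cdot \frac{1}{p(1 - p)} \exp\left\{ -\frac{1}{2\sigma^2} \left[ \log\frac{p}{1 - p} - \mu \right]^2 \right\}, \tag{9.2.63}$$

where $\mu = \mu_x'\beta$ and $\sigma^2 = \beta'\Sigma_x\beta$. Because the mean of this density does not have a closed form expression, we must evaluate $\int p f(p)\, dp$ numerically for given values of $\mu$ and $\sigma^2$.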
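The numerical evaluation mentioned above can be sketched as follows. Since $x'\beta$ is normal with mean $\mu$ and variance $\sigma^2$, the substitution $p = \Lambda(\mu + \sigma z)$ turns the integral of $p$ against its density into $\int \Lambda(\mu + \sigma z)\,\phi(z)\,dz$, where $\phi$ is the standard normal density. The code below approximates that integral by the trapezoidal rule; the function name, the integration range, and the number of steps are assumptions chosen for illustration, and other quadrature rules would serve equally well.

```python
import math


def logistic(z):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def expected_logistic(mu, sigma, half_width=10.0, steps=2000):
    """Approximate E[Lambda(mu + sigma * Z)], Z ~ N(0, 1), by the trapezoidal rule.

    The integrand is Lambda(mu + sigma * z) * phi(z) on [-half_width, half_width];
    the normal tails beyond that range are negligible for half_width = 10.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        z = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * logistic(mu + sigma * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


# Hypothetical values of mu and sigma, chosen only for illustration.
value = expected_logistic(mu=1.0, sigma=1.0)
```

Two quick checks are available without a reference table: when $\mu = 0$ the result must be 0.5 by symmetry, and when $\sigma = 0$ it must reduce to $\Lambda(\mu)$.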

McFadden and Reid (1975) showed that if $F = \Phi$ (standard normal distribution) and $x \sim N(\mu_x, \Sigma_x)$ as before, we have

$$E\Phi(x'\beta) = \Phi[(1 + \sigma^2)^{-1/2}\mu]. \tag{9.2.64}$$

Thus the evaluation of EF is much simpler than in the logit case.
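One way to see (9.2.64) is to write $\Phi(x'\beta) = P(u \le x'\beta \mid x)$ for a standard normal $u$ independent of $x$; then $u - x'\beta \sim N(-\mu, 1 + \sigma^2)$, and the probability that it is nonpositive is $\Phi[(1 + \sigma^2)^{-1/2}\mu]$. The sketch below compares the closed form with a direct numerical integration. The function names and the values of $\mu$ and $\sigma^2$ are assumptions made for the example.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def expected_probit_closed_form(mu, sigma2):
    """Right-hand side of (9.2.64)."""
    return PHI(mu / math.sqrt(1.0 + sigma2))


def expected_probit_numeric(mu, sigma2, half_width=10.0, steps=2000):
    """Trapezoidal approximation of E[Phi(mu + sigma * Z)], Z ~ N(0, 1)."""
    sigma = math.sqrt(sigma2)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        z = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * PHI(mu + sigma * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


# Hypothetical values, chosen only for illustration.
closed = expected_probit_closed_form(0.3, 2.0)
numeric = expected_probit_numeric(0.3, 2.0)
```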

Neither Westin nor McFadden and Reid considered the evaluation of the asymptotic variance of r, which constitutes an important piece of information for the purpose of prediction.

Another deficiency of these studies is that the variability due to the estimation of $\beta$ is totally ignored. We shall suggest a partially Bayesian way to deal with this problem.5 Given an estimate $\hat{\beta}$ of $\beta$, we treat $\beta$ as a random variable with the distribution $N(\hat{\beta}, \Sigma_\beta)$. An estimate of the asymptotic covariance matrix of the estimator $\hat{\beta}$ can be used for $\Sigma_\beta$. We now regard (9.2.62) as the asymptotic distribution of $r$ conditionally on $\{x_t\}$ and $\beta$. The distribution of $r$ conditionally on $\{x_t\}$ but not on $\beta$ is therefore given by

$$r \sim N\left[ E_\beta\, n^{-1} \sum_{t=1}^{n} F_t,\; E_\beta\, n^{-2} \sum_{t=1}^{n} F_t (1 - F_t) + V_\beta\, n^{-1} \sum_{t=1}^{n} F_t \right]. \tag{9.2.65}$$

Finally, if the total observations on $\{x_t\}$ are not available and $\{x_t\}$ can be regarded as i.i.d. random variables, we can approximate (9.2.65) by

$$r \sim N[E_x E_\beta F,\; n^{-1} E_x E_\beta F(1 - F) + E_x V_\beta F]. \tag{9.2.66}$$
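For the case in which every $x_t$ is observed, the two moments in (9.2.65) can be approximated by simulation: draw values of $\beta$ from its assumed normal distribution, compute the conditional mean and variance of (9.2.62) for each draw, then average the conditional variances and add the variance of the conditional means. The following is a minimal sketch under several assumptions that are not from the text: a logit $F$, a diagonal covariance matrix for $\beta$ (so that the components can be drawn independently), and hypothetical data, estimates, and standard errors. It covers (9.2.65) only; for (9.2.66) the sums over $t$ would be replaced by expectations with respect to $G$.

```python
import math
import random
from statistics import fmean, pvariance


def logistic(z):
    """Logistic distribution function, used here as an assumed choice of F."""
    return 1.0 / (1.0 + math.exp(-z))


def prediction_with_beta_uncertainty(xs, beta_draws, F=logistic):
    """Approximate the mean and variance in (9.2.65) from draws of beta."""
    n = len(xs)
    cond_means = []   # n^{-1} sum F_t, one value per draw of beta
    cond_vars = []    # n^{-2} sum F_t (1 - F_t), one value per draw of beta
    for beta in beta_draws:
        probs = [F(sum(xk * bk for xk, bk in zip(x, beta))) for x in xs]
        cond_means.append(sum(probs) / n)
        cond_vars.append(sum(p * (1.0 - p) for p in probs) / n ** 2)
    mean = fmean(cond_means)                          # E_beta of the conditional mean
    var = fmean(cond_vars) + pvariance(cond_means)    # E_beta(var) + V_beta(mean)
    return mean, var


# Hypothetical inputs: data, estimate of beta, and its standard errors.
xs = [(1.0, 0.5), (1.0, 1.0), (1.0, 1.5), (1.0, 2.0)]
beta_hat = (1.0, -1.0)
std_errs = (0.2, 0.1)   # assumes a diagonal covariance matrix for beta

rng = random.Random(0)  # fixed seed so that the draws are reproducible
beta_draws = [
    tuple(rng.gauss(b, s) for b, s in zip(beta_hat, std_errs))
    for _ in range(2000)
]

mean, var = prediction_with_beta_uncertainty(xs, beta_draws)
```

If all draws equal $\hat{\beta}$, the second variance term vanishes and the result coincides with (9.2.62), which gives a simple check on the implementation.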