# Multivariate Models


9.4.1 Introduction

A multivariate QR model specifies the joint probability distribution of two or more discrete dependent variables. For example, suppose there are two binary dependent variables $y_1$ and $y_2$, each of which takes values 1 or 0. Their joint distribution can be described by Table 9.1, where $P_{jk} = P(y_1 = j, y_2 = k)$. The model is completed by specifying $P_{11}$, $P_{10}$, and $P_{01}$ as functions of independent variables and unknown parameters. ($P_{00}$ is determined as one minus the sum of the other three probabilities.)

A multivariate QR model is a special case of a multinomial QR model. For example, the model represented by Table 9.1 is equivalent to a multinomial model for a single discrete dependent variable that takes four values with probabilities $P_{11}$, $P_{10}$, $P_{01}$, and $P_{00}$. Therefore the theory of statistical inference discussed in regard to a multinomial model in Section 9.3.1 is valid for a multivariate model without modification.
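The equivalence can be illustrated with a minimal sketch. The cell probabilities below are hypothetical numbers chosen only for illustration, and the helper name `to_multinomial` is our own; neither comes from the text.

```python
# Hypothetical joint probabilities P_jk = P(y1 = j, y2 = k); illustrative only.
p11, p10, p01 = 0.4, 0.3, 0.2
p00 = 1.0 - (p11 + p10 + p01)  # the fourth cell is fixed by the other three

joint = {(1, 1): p11, (1, 0): p10, (0, 1): p01, (0, 0): p00}


def to_multinomial(y1, y2):
    """Map the pair (y1, y2) to one four-valued category index in 0..3."""
    return 2 * y1 + y2


# The same distribution, re-indexed as a single multinomial variable.
multinomial = {to_multinomial(j, k): p for (j, k), p in joint.items()}

# Marginal probabilities are recovered by summing over cells.
p_y1 = joint[(1, 1)] + joint[(1, 0)]  # P(y1 = 1)
p_y2 = joint[(1, 1)] + joint[(0, 1)]  # P(y2 = 1)
```

Nothing is lost in the re-indexing: the four category probabilities are exactly the four cells of the table, which is why inference for the multinomial model carries over unchanged.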

In the subsequent sections we shall present various types of multivariate QR models and show in each model how the probabilities are specified to take account of the multivariate features of the model.

Table 9.1 Joint distribution of two binary random variables

| $y_1 \backslash y_2$ | 1 | 0 |
|---|---|---|
| 1 | $P_{11}$ | $P_{10}$ |
| 0 | $P_{01}$ | $P_{00}$ |

Before going into the discussion of various ways to specify multivariate QR models, we shall comment on an empirical article that is concerned with a multivariate model but in which its specific multivariate features are (perhaps justifiably) ignored. By doing this, we hope to shed some light on the distinction between a multivariate model and the other multinomial models.

Example 9.4.1 (Silberman and Durden, 1976). Silberman and Durden analyzed how representatives voted on two bills (House bill and Substitute bill) concerning the minimum wage. The independent variables are the socioeconomic characteristics of a legislator's congressional district, namely, the campaign contribution of labor unions, the campaign contribution of small businesses, the percentage of persons earning less than \$4000, the percentage of persons aged 16-21, and a dummy for the South. Denoting the House bill and the Substitute bill by H and S, respectively, the actual counts of votes on the bills are given in Table 9.2.

The zero count in the last cell explains why Silberman and Durden did not set up a multivariate QR model to analyze the data. Instead, they used an ordered probit model, ordering the three nonzero responses by the strength of a representative's feeling in favor of the minimum wage, as shown in Table 9.3. Assuming that $y^*$ is normally distributed with mean linearly dependent on the independent variables, Silberman and Durden specified the probabilities as

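$$P_{i0} = \Phi(x_i'\beta) \tag{9.4.1}$$

and

$$P_{i0} + P_{i1} = \Phi(x_i'\beta + \alpha), \tag{9.4.2}$$

where $\alpha > 0$ and $P_{ij}$ denotes the probability that the $i$th representative falls in the $j$th ordered category, counting from $j = 0$.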
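As a minimal sketch of how (9.4.1) and (9.4.2) determine all three response probabilities, we read the left-hand sides as the probability of the lowest ordered category and the cumulative probability of the two lowest categories. The regressor and parameter values are hypothetical, and the function names are our own.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(x, beta, alpha):
    """Return the three category probabilities implied by (9.4.1)-(9.4.2)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    index = sum(xj * bj for xj, bj in zip(x, beta))  # x' beta
    p0 = std_normal_cdf(index)                       # (9.4.1)
    p0_plus_p1 = std_normal_cdf(index + alpha)       # (9.4.2)
    p1 = p0_plus_p1 - p0                             # middle category
    p2 = 1.0 - p0_plus_p1                            # remaining category
    return p0, p1, p2


# Hypothetical regressors (constant plus one variable) and parameters.
probs = ordered_probit_probs(x=[1.0, 0.5], beta=[-0.2, 0.4], alpha=0.8)
```

The restriction that the threshold shift is positive is what keeps the middle probability positive, since the normal distribution function is strictly increasing.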

An alternative specification that takes into consideration the multivariate nature of the problem and at the same time recognizes the zero count of the last cell may be developed in the form of a sequential probit model (Section 9.3.10) as follows:

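$$P(H_i = \text{Yes}) = \Phi(x_i'\beta_1) \tag{9.4.3}$$

and

$$P(S_i = \text{No} \mid H_i = \text{Yes}) = \Phi(x_i'\beta_2). \tag{9.4.4}$$

Table 9.2 Votes on House (H) and Substitute (S) bills

| Vote for Substitute bill | Vote for House bill: Yes | Vote for House bill: No |
|---|---|---|
| Yes | 75 | 119 |
| No | 205 | 0 |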
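The cell probabilities implied by (9.4.3) and (9.4.4) can be sketched as follows. We assume, as our reading of the zero count, that a representative who votes No on the House bill votes Yes on the Substitute bill with probability one. The parameter values and function names are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sequential_probit_cells(x, beta1, beta2):
    """Cell probabilities for the (H, S) vote pair under the sequential model."""
    idx1 = sum(xj * bj for xj, bj in zip(x, beta1))  # x' beta_1
    idx2 = sum(xj * bj for xj, bj in zip(x, beta2))  # x' beta_2
    p_h_yes = std_normal_cdf(idx1)                   # (9.4.3)
    p_s_no_given_h_yes = std_normal_cdf(idx2)        # (9.4.4)
    return {
        ("H=Yes", "S=No"): p_h_yes * p_s_no_given_h_yes,
        ("H=Yes", "S=Yes"): p_h_yes * (1.0 - p_s_no_given_h_yes),
        # Assumption: P(S = Yes | H = No) = 1, matching the empty cell.
        ("H=No", "S=Yes"): 1.0 - p_h_yes,
        ("H=No", "S=No"): 0.0,
    }


# Hypothetical regressors (constant plus one variable) and parameters.
cells = sequential_probit_cells(
    x=[1.0, 0.5], beta1=[0.3, 0.2], beta2=[-0.1, 0.6]
)
```

Unlike the ordered probit model, this specification uses a separate coefficient vector at each stage, which is the source of the difference in the number of parameters.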

The choice between these two models must be determined, first, from a theoretical standpoint based on an analysis of the legislator’s behavior and, second, if the first consideration is not conclusive, from a statistical standpoint based on some measure of goodness of fit. Because the two models involve different numbers of parameters, adjustments for the degrees of freedom, such as the Akaike Information Criterion (Section 4.5.2), must be employed. The problem boils down to the question, Is the model defined by (9.4.3) and (9.4.4) sufficiently better than the model defined by (9.4.1) and (9.4.2) to compensate for a reduction in the degrees of freedom?
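The statistical comparison can be sketched as follows, using the Akaike Information Criterion in the common form $-2 \log L + 2k$. Section 4.5.2 may normalize the criterion differently, but rescaling by a positive constant does not change the ranking. The log likelihood values are hypothetical, not results from Silberman and Durden, and the parameter counts assume a constant plus the five listed regressors.

```python
def aic(log_likelihood, n_params):
    """AIC in the common form -2 log L + 2k; the smaller value is preferred."""
    return -2.0 * log_likelihood + 2.0 * n_params


K = 6  # assumed: a constant plus the five district characteristics

k_ordered = K + 1      # beta (K coefficients) plus the threshold shift alpha
k_sequential = 2 * K   # beta_1 and beta_2, K coefficients each

# Hypothetical maximized log likelihoods, for illustration only.
loglik_ordered = -400.0
loglik_sequential = -396.0

aic_ordered = aic(loglik_ordered, k_ordered)
aic_sequential = aic(loglik_sequential, k_sequential)

preferred = "ordered" if aic_ordered < aic_sequential else "sequential"
```

With these illustrative numbers the sequential model fits better in terms of the log likelihood, but not by enough to offset its five additional parameters, so the criterion favors the ordered probit model.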