# Conditional Logit

Conditional logit is used to model choices when alternative-specific information is available. When choosing among brands of soft drinks, you observe the choice that an individual makes as well as the price of each available alternative. This differs from the data used in the multinomial logit example, where we only had information on the grade earned by each individual; there were no alternative grades for those choosing what kind of school to attend. The grade was specific to the individual, not to the choices. In conditional logit there is information about each alternative. Models that combine individual-specific information and choice-specific information are referred to as mixed. Such data are somewhat rare: usually you have information either on the individual (income or race) or on the choices (prices and advertising), but not both.

The following example should make this clearer. We are studying choices among three soft drinks: Pepsi, Coke, and Seven-Up. Each may sell for a different price. Each individual purchases one of the brands. The probability that individual i chooses j is

$$p_{ij} = \frac{\exp(\beta_{1j} + \beta_2\, price_{ij})}{\exp(\beta_{11} + \beta_2\, price_{i1}) + \exp(\beta_{12} + \beta_2\, price_{i2}) + \exp(\beta_{13} + \beta_2\, price_{i3})}$$

Now there is only one parameter that relates to price, but there are J = 3 constants, $\beta_{11}$, $\beta_{12}$, and $\beta_{13}$. One of these is not identified and must be set to zero. This is referred to as normalization; in our case we set

$$\beta_{13} = 0.$$
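Why one constant must be normalized can be verified numerically: adding the same amount to every alternative's constant leaves all the probabilities unchanged, so only differences in the constants are identified. A small Python sketch illustrates this (the function and variable names are made up for illustration; this is not gretl code):

```python
import math

def choice_probs(consts, beta, prices):
    """Conditional logit probabilities: exp(const_j + beta*price_j) / sum."""
    v = [math.exp(c + beta * p) for c, p in zip(consts, prices)]
    s = sum(v)
    return [x / s for x in v]

prices = [1.10, 1.00, 0.95]
p_a = choice_probs([0.3, 0.1, 0.0], -2.0, prices)   # third constant set to zero
p_b = choice_probs([5.3, 5.1, 5.0], -2.0, prices)   # every constant shifted by 5
# p_a and p_b are identical: the common shift cancels in numerator and
# denominator, so the level of the constants cannot be estimated.
```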

Below is a function and a script that estimate the conditional logit model for the soft drink example by maximum likelihood. The function is not general: it will not work for other models without modification, though the basic idea could be used to generalize it. The MCMC method discussed below is an alternative that is more ready for general use, but its results will differ somewhat from maximum likelihood estimation.

The function computes the value of the log-likelihood for the conditional logit model. The inputs consist of two lists and a vector of starting values. The first list contains indicator variables identifying which choice was made (pepsi, sevenup or coke). The second list contains the regressors.

————————- Conditional Logit script —————-

```
1   function matrix clprobs(list y, list x, matrix theta)
2       matrix Y = { y }
3       matrix p = { x }
4       scalar n = $nobs
5       matrix P = {}
6       loop i=1..n --quiet
7           scalar i1 = exp(theta[1]+theta[3]*p[i,1])
8           scalar i2 = exp(theta[2]+theta[3]*p[i,2])
9           scalar i3 = exp(theta[3]*p[i,3])
10          scalar d = i1+i2+i3
11          matrix pp = (Y[i,1]==1)*i1/d + (Y[i,2]==1)*i2/d + (Y[i,3]==1)*i3/d
12          matrix P = P | pp
13      endloop
14      return sumc(ln(P))
15  end function
16
17  open "@gretldir\data\poe\cola2.gdt"
18  list y = pepsi sevenup coke
19  list x = pr_pepsi pr_7up pr_coke
20  matrix theta = {-1.19, .283, .1038}
21  mle lln = clprobs(y, x, theta)
22      params theta
23  end mle
```

Lines 2 and 3 convert the lists to matrices. The number of observations is counted in line 4, and in line 5 an empty matrix is created to hold the results. The loop that starts in line 6 computes the probability of each observed choice. The scalars i1, i2 and i3 are the three numerators of equation (16.16), and their sum d is its denominator; each scalar is divided by d to give a probability. Each logical expression, e.g. (Y[i,1]==1), equals one for the alternative actually chosen and zero otherwise, so pp picks out the probability of the chosen alternative: if the person chooses the first alternative, pp is set to i1/d while the other two terms are zero. Stacking these in line 12 yields the vector P, which contains the probability of each individual's actual choice. The return value is the sum of the logs of these probabilities, which is just the log-likelihood.
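The observation loop can also be written in vectorized form. The sketch below is an illustrative NumPy re-implementation of the same log-likelihood, not gretl code; the function name and the tiny two-person sample are made up:

```python
import numpy as np

def clogit_loglik(theta, Y, prices):
    """Conditional logit log-likelihood, mirroring the clprobs() logic.

    theta  -- [lam1, lam2, beta]; the third constant is normalized to zero
    Y      -- (n, 3) 0/1 matrix of observed choices (one 1 per row)
    prices -- (n, 3) matrix of the prices each individual faced
    """
    lam = np.array([theta[0], theta[1], 0.0])
    v = np.exp(lam + theta[2] * prices)        # n x 3 matrix of numerators
    p = v / v.sum(axis=1, keepdims=True)       # choice probabilities, rows sum to 1
    return np.sum(np.log(p[Y == 1]))           # log-prob of each chosen option

# Hypothetical sample: person 1 buys brand 1, person 2 buys brand 3.
Y = np.array([[1, 0, 0], [0, 0, 1]])
prices = np.array([[1.0, 1.1, 0.9], [1.2, 1.0, 0.8]])
ll = clogit_loglik([0.28, 0.10, -2.30], Y, prices)
```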

The results from this function and MLE estimation are found below:

```
Using numerical derivatives
Tolerance = 1.81899e-012
Function evaluations: 41
Evaluations of gradient: 12

Model 2: ML, using observations 1-1822
lln = clprobs(y, x, theta)
Standard errors based on Hessian

              estimate    std. error       z       p-value
  theta[1]    0.283166    0.0623772      4.54     5.64e-06  ***
  theta[2]    0.103833    0.0624595      1.662    0.0964    *
  theta[3]   -2.29637     0.137655     -16.68     1.77e-62  ***

Log-likelihood      -1824.562   Akaike criterion    3655.124
Schwarz criterion    3671.647   Hannan-Quinn        3661.220
```

These match the results in POE4. Even the estimated standard errors are the same to four decimal places. Very good indeed. Substantively, the price coefficient is -2.296 and is significantly different from zero at any reasonable level of significance.
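To see what the estimates imply, one can plug them back into the probability formula. The Python sketch below is an illustration only (not part of the gretl output): it uses the rounded estimates 0.2832, 0.1038, and -2.2964 from the table above together with hypothetical prices:

```python
import math

# Rounded ML estimates: the two non-normalized constants and the price coefficient.
LAM1, LAM2, BETA = 0.2832, 0.1038, -2.2964

def fitted_probs(pr_pepsi, pr_7up, pr_coke):
    """Predicted choice probabilities at the estimated coefficients."""
    v = [math.exp(LAM1 + BETA * pr_pepsi),
         math.exp(LAM2 + BETA * pr_7up),
         math.exp(BETA * pr_coke)]       # coke is the normalized alternative
    s = sum(v)
    return [x / s for x in v]

# At identical prices, the constants alone rank the brands.
equal = fitted_probs(1.00, 1.00, 1.00)
# Raising only Pepsi's price by 10 cents lowers its predicted share
# and shifts probability to the other two brands.
dearer = fitted_probs(1.10, 1.00, 1.00)
```

Because the price coefficient is negative, an increase in one brand's own price always reduces that brand's predicted probability, which is the substantive conclusion drawn above.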