# Predictions in the Log-linear Model

In this example, you use the regression to make predictions about the log wage and the level of the wage for a person having 12 years of schooling. The naive (or natural) prediction of the wage simply takes the antilog of the predicted ln(wage). This can be improved upon by using properties of lognormal random variables. It can be shown that if $\ln(w) \sim N(\mu, \sigma^2)$ then $E(w) = e^{\mu + \sigma^2/2}$ and

$$\operatorname{var}(w) = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right).$$

That means that the corrected prediction is $\hat{y}_c = \exp(b_1 + b_2 x + \hat{\sigma}^2/2) = e^{(b_1 + b_2 x)}\, e^{\hat{\sigma}^2/2}$. The script to generate these is given below.

open "@gretldir\data\poe\cps4_small.gdt"

logs wage   # creates l_wage = ln(wage)

ols l_wage const educ

scalar l_wage_12 = $coeff(const)+$coeff(educ)*12   # predicted ln(wage) at educ = 12

scalar nat_pred = exp(l_wage_12)   # natural predictor: antilog of the prediction

scalar corrected_pred = nat_pred*exp($sigma^2/2)   # lognormal correction

print l_wage_12 nat_pred corrected_pred

The results from the script are

l_wage_12 = 2.6943434

nat_pred = 14.795801

corrected_pred = 16.996428

That means that for a worker with 12 years of schooling the predicted wage is $14.80/hour using the natural predictor and $17.00/hour using the corrected one. In large samples we would expect the corrected predictor to be a bit better. Among the 1000 individuals in the sample, 328 have 12 years of schooling, and their average wage is $15.99. Hence the corrected prediction overshoots by about $1.01/hour, while the natural prediction undershoots by about $1.19/hour, so the corrected figure is the closer of the two.
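The lognormal property behind the correction can be checked with a small simulation. The sketch below is an illustration only and is not part of the original script: it backs out the variance estimate implied by the two reported predictions (since the corrected prediction equals the natural one times exp(s2/2)), then draws artificial wages whose logs are normal with that mean and variance. The names `s2` and `w`, the seed, and the sample size of 200000 are arbitrary assumptions. Because `nulldata` replaces the dataset currently in memory, it is best run as a separate script.

```hansl
# Illustrative check, not part of the original script.
# nulldata replaces any open dataset, so run this on its own.
nulldata 200000
set seed 20240101

# Values reported by the prediction script
scalar l_wage_12 = 2.6943434
scalar nat_pred = 14.795801
scalar corrected_pred = 16.996428

# corrected_pred = nat_pred*exp(s2/2)  =>  s2 = 2*log(corrected_pred/nat_pred)
scalar s2 = 2*log(corrected_pred/nat_pred)

# Artificial wages with ln(w) ~ N(l_wage_12, s2)
series w = exp(l_wage_12 + sqrt(s2)*normal())

printf "implied sigma^2 = %.4f\n", s2
printf "mean of w: simulated %.3f, theory %.3f\n", mean(w), exp(l_wage_12 + s2/2)
printf "var of w:  simulated %.3f, theory %.3f\n", var(w), exp(2*l_wage_12 + s2)*(exp(s2) - 1)
```

By construction the theoretical mean reproduces `corrected_pred`. The simulated mean should land close to it and well above `nat_pred`, which is the median of the simulated wages, not their mean.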

To get the average wage for those with 12 years of schooling, we can restrict the sample using the script below:

smpl educ=12 --restrict

summary wage

smpl full

The syntax is relatively straightforward. The smpl command tells gretl that the sample is being modified. The second element, educ=12, is the condition that gretl evaluates for each observation. The --restrict option tells gretl to keep only those observations that satisfy the condition. The final smpl full restores the complete sample. The summary wage statement produces

Summary Statistics, using the observations 1-328

for the variable wage (328 valid observations)
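The same group average can also be computed without changing the sample, by using an indicator variable. This is a minimal sketch and not the author's method: it assumes cps4_small.gdt is open with the full sample in force, and that the scalars `nat_pred` and `corrected_pred` from the prediction script still exist in the session. The names `d12`, `n12` and `avg12` are made up for illustration.

```hansl
# Assumes cps4_small.gdt is open and the full sample is in force
series d12 = (educ == 12)          # 1 for workers with 12 years of schooling, else 0
scalar n12 = sum(d12)              # number of such workers
scalar avg12 = sum(d12*wage)/n12   # their average wage

printf "n = %g, average wage = %.2f\n", n12, avg12

# Assumes nat_pred and corrected_pred were created by the prediction script
printf "natural error = %.2f, corrected error = %.2f\n", nat_pred - avg12, corrected_pred - avg12
```

If the data match the figures quoted in the text, this should report the same 328 observations and the $15.99 average. The indicator approach avoids having to remember `smpl full` afterwards, at the cost of adding one extra series to the dataset.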
