# School Fixed Effects

It may be that assignment to treatment groups is related to one or more of the observable characteristics (school size or teacher experience in this case). One way to control for these omitted effects is to use fixed effects estimation, which is taken up in more detail later. Here we introduce it to show off a useful gretl function called `dummify`.

The `dummify` command creates a dummy variable for each distinct value present in a series, x. For it to work, you must first tell gretl that x is in fact a discrete variable. We want to create a set of indicator variables, one for each school in the dataset: first declare the `schid` variable to be discrete and then dummify it.

Here is the code and another model table that mimics Table 7.7 in POE4.

```
1  discrete schid
2  list d = dummify(schid)
3  ols totalscore x1 --quiet
4  scalar sser = $ess
5  scalar r_df = $df
6  modeltab add
7  ols totalscore x2 --quiet
8  modeltab add
9  ols totalscore x1 d --quiet
10 scalar sseu = $ess
11 scalar u_df = $df
12 modeltab add
13 ols totalscore x2 d --quiet
14 modeltab add
15 modeltab show
16 modeltab free
```

The `discrete` command in line 1 makes `schid` into a discrete variable. The next line creates a list containing each of the indicator variables produced by `dummify(schid)`. Then all you have to do is add that list to the variable list of any regression that includes the fixed effects. Gretl smartly avoids the dummy variable trap by dropping one of the indicators from the regression.

Here is what you get with the indicator coefficients suppressed:

OLS estimates
Dependent variable: totalscore

|                | (1)     | (2)     | (3)     | (4)     |
|----------------|---------|---------|---------|---------|
| const          | 918.0** | 907.6** | 838.8** | 830.8*  |
|                | (1.667) | (2.542) | (11.56) | (11.70) |
| small          |         |         |         |         |
| tchexper       |         |         |         |         |
| School Effects | no      | no      | yes     | yes     |

Standard errors in parentheses
\* indicates significance at the 10 percent level
\*\* indicates significance at the 5 percent level

The estimated slopes in columns (3) and (4) match those in POE4. The intercepts are different only because a different reference group was used. The substance of the results is unaffected.

Testing the null hypothesis that the fixed effects are zero is very simple: compare the restricted and unrestricted sums of squared errors using an F-statistic. The restricted sum of squared errors was saved from model (1) and the unrestricted from model (3). The statistic is computed using
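Written out, the statistic computed in the script below is

$$
F = \frac{(SSE_r - SSE_u)/J}{SSE_u/df_u} \sim F_{(J,\,df_u)}, \qquad J = df_r - df_u,
$$

where $SSE_r$ and $SSE_u$ are the restricted and unrestricted sums of squared errors, and $df_r$ and $df_u$ are the corresponding residual degrees of freedom.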

```
1 scalar J = r_df - u_df
2 scalar fstat = ((sser - sseu)/J)/(sseu/u_df)
3 pvalue f J u_df fstat
```

and the result is:

```
Generated scalar J = 78
Generated scalar fstat = 14.1177
F(78, 3663): area to the right of 14.1177 = 1.70964e-154
(to the left: 1)
```

Notice how the difference in the number of degrees of freedom reveals how many restrictions are imposed on the model. Given the number of times we’ve used this computation, it may pay to write a gretl function to automate it.
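One possible sketch of such a function in hansl follows. The name `restrict_ftest` and its argument list are our own invention, not part of gretl; the body simply wraps the computation above using gretl's built-in `pvalue()` function.

```
# Sketch of a user-written gretl function that computes the F-statistic
# and p-value for a set of restrictions, given the saved SSEs and
# residual degrees of freedom from the restricted and unrestricted models.
function void restrict_ftest (scalar sser, scalar sseu,
                              scalar r_df, scalar u_df)
    scalar J = r_df - u_df                       # number of restrictions
    scalar fstat = ((sser - sseu)/J)/(sseu/u_df) # F-statistic
    printf "F(%g, %g) = %.4f, p-value = %.4g\n", \
        J, u_df, fstat, pvalue(F, J, u_df, fstat)
end function
```

After running the restricted and unrestricted regressions and saving `$ess` and `$df` as above, a call like `restrict_ftest(sser, sseu, r_df, u_df)` reproduces the test in one line.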