Using gretl for Principles of Econometrics, 4th Edition

School Fixed Effects

It may be that assignment to treatment groups is related to one or more of the observable characteristics (school size or teacher experience in this case). One way to control for these omitted effects is to use fixed effects estimation. This is taken up in more detail later. Here we introduce it to show off a useful gretl function called dummify.

The dummify command creates dummy variables for each distinct value present in a series, x. In order for it to work, you must first tell gretl that x is in fact a discrete variable. We want to create a set of indicator variables, one for each school in the dataset. First declare the schid variable to be discrete and then dummify it.

Here is the code and another model table that mimics Table 7.7 in POE4.

discrete schid
list d = dummify(schid)
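As a sketch of how the school dummies might then enter the regression (the file name star.gdt and the variable names totalscore, small, aide, and tchexper are assumptions based on the Project STAR example, not taken from this excerpt):

```
open "@gretldir\data\poe\star.gdt"
discrete schid
list d = dummify(schid)
# regress achievement on the treatment indicators, teacher
# experience, and the school indicators (school fixed effects)
ols totalscore const small aide tchexper d
```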


IV Estimation

Gretl handles this estimation problem with ease using what is commonly referred to as two-stage least squares. In econometrics, the terms two-stage least squares (TSLS) and instrumental variables (IV) estimation are often used interchangeably. The ‘two-stage’ terminology is a legacy of the time when the easiest way to estimate the model was to actually use two separate least squares regressions. With better software, the computation is done in a single step to ensure the other model statistics are computed correctly. Since the software you use invariably expects you to specify ‘instruments,’ it is probably better to think about this estimator in those terms from the beginning...
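A minimal sketch of gretl’s single-step tsls command (the mroz.gdt wage example and its variable names are assumptions for illustration): regressors go before the semicolon, the full instrument list after it.

```
open "@gretldir\data\poe\mroz.gdt"
smpl wage > 0 --restrict          # keep observations with observed wages
series l_wage = log(wage)
# instruments = all exogenous regressors plus mothereduc,
# which serves as the instrument for educ
tsls l_wage const educ exper ; const exper mothereduc
```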


Monte Carlo Experiment

Once again, the consequences of repeated sampling can be explored using a simple Monte Carlo study. In this case, we will generate 100 samples and count the number of times the confidence interval includes the true value of the parameter. The simulation will be based on the food.gdt dataset.

The new script looks like this:

open "@gretldir\data\poe\food.gdt"
set seed 3213798
loop 100 --progressive --quiet
    series u = normal(0,88)
    series y = 80 + 10*income + u
    ols y const income
    scalar c1L = $coeff(const) - critical(t,$df,.025)*$stderr(const)
    scalar c1R = $coeff(const) + critical(t,$df,.025)*$stderr(const)
    scalar c2L = $coeff(income) - critical(t,$df,.025)*$stderr(income)
    scalar c2R = $coeff(income) + critical(t,$df,.025)*$stderr(income)
    # Compute the coverage pr...
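The truncated coverage computation could be completed along these lines (a hypothetical sketch, not necessarily the author’s exact code): in a progressive loop, printing a 0/1 indicator scalar accumulates its mean across replications, which here is the empirical coverage rate.

```
    # inside the progressive loop: 1 if the interval covers the
    # true value, 0 otherwise (true values 80 and 10 from the DGP)
    scalar p1 = (c1L < 80 && 80 < c1R)
    scalar p2 = (c2L < 10 && 10 < c2R)
    print p1 p2
endloop
```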


Further Inference in the Multiple Regression Model

In this chapter several extensions of the multiple linear regression model are considered. First, we test joint hypotheses about parameters in a model and then learn how to impose linear restrictions on the parameters. A condition called collinearity is also explored.

6.1 F-test

An F-statistic can be used to test multiple hypotheses in a linear regression model. In linear regression there are several different ways to derive and compute this statistic, but each yields the same result. The one used here compares the sum of squared errors (SSE) in a regression model estimated under the null hypothesis (H0) to the SSE of a model under the alternative (H1). If the sums of squared errors from the two models are similar, then there is not enough evidence to reject the restrictions...
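The SSE comparison can be sketched in gretl as follows (the andy.gdt dataset and the hypothesis that advertising has no effect are assumptions for illustration):

```
open "@gretldir\data\poe\andy.gdt"
series a2 = advert^2
ols sales const price advert a2     # unrestricted model
scalar sseu = $ess
scalar dfu  = $df
ols sales const price               # restricted model (H0 imposed)
scalar sser = $ess
scalar J = 2                        # number of restrictions
scalar F = ((sser - sseu)/J) / (sseu/dfu)
pvalue F J dfu F                    # p-value from the F(J, dfu) distribution
```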


Heteroskedasticity in the Linear Probability Model

In chapter 7 we introduced the linear probability model. It was shown that the indicator variable, y_i, is heteroskedastic. That is,

var(y_i) = π_i(1 − π_i)    (8.10)

where π_i is the probability that the dependent variable is equal to 1 (the choice is made). The estimated variance is

vâr(y_i) = π̂_i(1 − π̂_i)    (8.11)

This can be used to perform feasible GLS. The cola marketing data coke.gdt is the basis for this example. The dependent variable, coke, takes the value of 1 if the individual purchases Coca-Cola and 0 if not. The decision to purchase Coca-Cola depends on the ratio of the Coca-Cola price to the Pepsi price, and whether displays for Coca-Cola or Pepsi were present...
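One way the FGLS step might look (a sketch; the regressor names pratio, disp_coke, and disp_pepsi are assumptions about the coke.gdt variable list):

```
open "@gretldir\data\poe\coke.gdt"
ols coke const pratio disp_coke disp_pepsi   # linear probability model
series p = $yhat
# guard against nonpositive estimated variances before weighting
smpl p > 0.01 && p < 0.99 --restrict
series w = 1/(p*(1-p))                       # weights = 1/vâr(y_i)
wls w coke const pratio disp_coke disp_pepsi
```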
