The simple linear regression models of Chapter 2 and the multiple regression model of Chapter 5 can be generalized in other ways. For instance, there is no guarantee that the random variables of these models (either the $y_i$ or the $e_i$) have the same inherent variability. That is, some observations may have a larger or smaller variance than others. This condition is known as heteroskedasticity. The general linear regression model is shown in equation (8.1) below.
$$y_i = \beta_1 + \beta_2 x_{i2} + \cdots + \beta_K x_{iK} + e_i, \qquad i = 1, 2, \ldots, N \tag{8.1}$$
where $y_i$ is the dependent variable, $x_{ik}$ is the $i$th observation on the $k$th independent variable, $k = 2, 3, \ldots, K$, $e_i$ is the random error, and $\beta_1, \beta_2, \ldots, \beta_K$ are the parameters you want to estimate. Just as in the simple linear regression model, the $e_i$ have an average value of zero for each value of the independent variables and are uncorrelated with one another. The difference in this model is that the variance of $e_i$ now depends on $i$, i.e., on the observation to which it belongs. Indexing the variance with the $i$ subscript is just a way of indicating that observations may have different amounts of variability associated with them. The error assumptions can be summarized as $e_i \mid x_{i2}, x_{i3}, \ldots, x_{iK} \sim N(0, \sigma_i^2)$, independently across observations.
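To make the idea of an observation-specific variance concrete, the following Python sketch simulates a two-variable version of the model. It is only an illustration: the parameter values, the sample size, and the variance function (standard deviation of the error proportional to the regressor) are assumptions chosen for the example, not something taken from the text.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for reproducibility

N = 1000
beta1, beta2 = 1.0, 2.0              # hypothetical parameter values
x = rng.uniform(1.0, 10.0, size=N)   # a single regressor, for simplicity

# Assumed variance function: the standard deviation of e_i is proportional to x_i,
# so var(e_i) = sigma_i^2 = (0.5 * x_i)^2 differs from observation to observation.
sigma_i = 0.5 * x
e = rng.normal(loc=0.0, scale=sigma_i)   # independent draws, mean zero
y = beta1 + beta2 * x + e

# Compare the spread of the errors for small and large values of x.
low = x < np.median(x)
var_low = e[low].var(ddof=1)
var_high = e[~low].var(ddof=1)
print(f"error variance, low x:  {var_low:.2f}")
print(f"error variance, high x: {var_high:.2f}")
```

Under this assumed variance function, the errors attached to large values of the regressor are visibly more dispersed than those attached to small values, which is exactly what the $i$ subscript on the variance allows for.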
The intercept and slopes, $\beta_1, \beta_2, \ldots, \beta_K$, are consistently estimated by least squares even if the data are heteroskedastic. Unfortunately, the usual estimators of the least squares standard errors are inconsistent, so tests based on them are invalid. In this chapter, several ways to detect heteroskedasticity are considered. Statistically valid ways of estimating the parameters of equation (8.1) and testing hypotheses about the $\beta$s when the data are heteroskedastic are also explored.
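As a rough illustration of why the usual standard errors cannot be trusted, the Python sketch below fits least squares to simulated heteroskedastic data and computes two sets of standard errors: the conventional ones, and White's heteroskedasticity-consistent (HC0) ones. The choice of HC0, the simulated data, and all variable names are assumptions made for this example; it is not presented as the procedure the chapter goes on to develop.

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # arbitrary seed

N = 2000
x = rng.uniform(1.0, 10.0, size=N)
e = rng.normal(0.0, 0.5 * x)          # assumed: sd of e_i grows with x_i
y = 1.0 + 2.0 * x + e                 # hypothetical beta1 = 1, beta2 = 2

X = np.column_stack([np.ones(N), x])  # regressor matrix with an intercept column
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                 # least squares estimates
resid = y - X @ b

# Conventional covariance estimate s^2 (X'X)^{-1}; relies on a common error variance.
K = X.shape[1]
s2 = resid @ resid / (N - K)
se_usual = np.sqrt(np.diag(s2 * XtX_inv))

# White (HC0) estimate: (X'X)^{-1} X' diag(resid_i^2) X (X'X)^{-1}.
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("estimates:       ", b)
print("usual std. err.: ", se_usual)
print("robust std. err.:", se_robust)
```

The coefficient estimates are the same in both cases, since only the covariance estimator changes. The two sets of standard errors differ, and only the robust ones remain consistent under the variance pattern assumed in this simulation.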