# Heteroskedasticity

William E. Griffiths*

1 Introduction

A random variable $y$ is said to be heteroskedastic if its variance can be different for different observations. Conversely, it is said to be homoskedastic if its variance is constant for all observations. In econometrics, heteroskedasticity is most commonly studied in the context of the general linear model

$$y_i = x_i'\beta + e_i \qquad (4.1)$$

where $x_i$ is a $K$-dimensional vector of observations on a set of explanatory variables, $\beta$ is a $K$-dimensional vector of coefficients which we wish to estimate, and $y_i$ denotes the $i$th observation ($i = 1, 2, \ldots, N$) on a dependent variable. In a heteroskedastic model the error term $e_i$ is assumed to have zero mean and variance $\sigma_i^2$, the $i$ subscript on $\sigma_i^2$ reflecting possibly different variances for each observation. Conditional on $x_i'\beta$, the dependent variable $y_i$ has mean $x_i'\beta$ and variance $\sigma_i^2$. Thus, in the heteroskedastic general linear model, the mean and variance of the random variable $y$ can both change over observations.
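The model in (4.1) can be illustrated with a small simulation. The following is a minimal sketch, not part of the original exposition: the sample size, the coefficient values, the single regressor named `income`, and the choice of an error standard deviation proportional to that regressor are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 500
income = rng.uniform(1.0, 10.0, size=N)     # hypothetical explanatory variable
X = np.column_stack([np.ones(N), income])   # row i holds x_i' (intercept, income)
beta = np.array([1.0, 0.5])                 # assumed "true" coefficients

# Assumed variance structure: sigma_i grows with income, so the
# variance sigma_i^2 differs from observation to observation.
sigma = 0.3 * income
e = rng.normal(0.0, sigma)                  # zero-mean errors with std dev sigma_i
y = X @ beta + e                            # y_i = x_i' beta + e_i

# Fit by ordinary least squares and compare the residual spread
# in the lower and upper halves of the income distribution.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

low = income < np.median(income)
var_low = resid[low].var()
var_high = resid[~low].var()
print("residual variance, low income: ", var_low)
print("residual variance, high income:", var_high)
```

Under these assumptions the residual variance in the high-income half should be visibly larger than in the low-income half, which is the pattern that distinguishes a heteroskedastic from a homoskedastic error term.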

Heteroskedasticity can arise empirically, through theoretical considerations, and from model misspecification. Empirically, heteroskedasticity is often encountered when using cross-sectional data on a number of microeconomic units such as firms or households. A common example is the estimation of household expenditure functions. Expenditure on a commodity is more easily explained by conventional variables for households with low incomes than it is for households with high incomes. The lower predictive ability of the model for high incomes can be captured by specifying a variance $\sigma_i^2$ which is larger when income is larger. Data on firms invariably involve observations on economic units of varying sizes. Larger firms are likely to be more diverse and flexible with respect to the way in which values for $y_i$ are determined. This additional diversity is captured through an error term with a larger variance. Theoretical considerations, such as randomness in behavior, can also lead to heteroskedasticity. Brown and Walker (1989, 1995) give examples of how it arises naturally in demand and production models. Moreover, as chapter 19 in this volume (by Swamy and Tavlas) illustrates, heteroskedasticity exists in all models with random coefficients. Misspecifications such as incorrect functional form, omitted variables, and structural change are other reasons that a model may exhibit heteroskedasticity.
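The link between random coefficients and heteroskedasticity can be seen in a short derivation. This is a sketch under assumptions not stated in this chapter: the coefficient vector for observation $i$ is taken to be $\beta_i = \beta + u_i$, where $u_i$ has zero mean and covariance matrix $\Delta$ and is uncorrelated with a homoskedastic error $e_i$ of variance $\sigma^2$. The symbols $u_i$, $\Delta$, and $\sigma^2$ are introduced here only for illustration.

$$y_i = x_i'\beta_i + e_i = x_i'\beta + (x_i'u_i + e_i), \qquad \operatorname{var}(x_i'u_i + e_i \mid x_i) = x_i'\Delta x_i + \sigma^2$$

Because the variance of the composite error depends on $x_i$, it generally differs across observations even though $e_i$ itself is homoskedastic.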

In this chapter we give the fundamentals of sampling theory and Bayesian estimation, and sampling theory hypothesis testing, for a linear model with heteroskedasticity. For sampling theory estimation it is convenient first to describe estimation for a known error covariance matrix and then to extend it to an unknown error covariance matrix. No attempt is made to give specific details of developments beyond what we consider to be the fundamentals. However, references to such developments and how they build on the fundamentals are provided. Autoregressive conditional heteroskedasticity (ARCH), which is popular for modeling volatility in time series, is considered elsewhere in this volume and is not discussed in this chapter.