Econometric Models for Credit Underwriting and Default

Traditional credit risk models in the mortgage market employ a parametric approach to estimate the probability of default by regression. These are the classical binary choice models: the probit and the logit.

The idea is that we have a regression model:

$$y_i^* = x_i'\beta + \varepsilon_i \quad (1)$$

where

$y_i^*$ — a latent variable, which is not observed,

$x_i$ — a vector of independent variables,

$\beta$ — a vector of constant coefficients,

$\varepsilon_i$ — the error term.

We observe a dichotomous variable yi defined by

$$y_i = \begin{cases} 1, & \text{if } y_i^* > 0,\\ 0, & \text{otherwise.} \end{cases} \quad (2)$$

In other words, $y_i$ is a dichotomous indicator taking the value 1 or 0. In default modeling, $y_i = 1$ if the borrower has defaulted and 0 otherwise. In the process of credit underwriting, $y_i^*$ would instead be defined as the propensity to receive approval from a credit organization.

The probit and logit models differ in the specification of the distributional form of the error term $\varepsilon_i$ in (1). If it follows a standard normal distribution, we have a probit model; if it follows a logistic distribution, we have a logit model.
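As a concrete illustration, the latent-variable model (1)–(2) can be simulated and the probit estimated by maximum likelihood. The sketch below is not from the original text; the variable names and data-generating values are illustrative assumptions, using only NumPy and SciPy:

```python
# Illustrative sketch: simulate y* = x'beta + eps with standard normal errors
# and recover beta with a hand-rolled probit MLE. All values are assumptions.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 20000
x = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + regressor
beta_true = np.array([-0.5, 1.0])

# Latent index plus a standard normal error: the probit data-generating process
y_star = x @ beta_true + rng.normal(size=n)
y = (y_star > 0).astype(float)  # we observe only the sign of y*

def neg_loglik(beta):
    # Pr(y = 1 | x) = Phi(x'beta); clip probabilities to avoid log(0)
    p = np.clip(norm.cdf(x @ beta), 1e-12, 1 - 1e-12)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

beta_hat = minimize(neg_loglik, x0=np.zeros(2), method="BFGS").x
print(beta_hat)  # close to (-0.5, 1.0)
```

Replacing the normal CDF with the logistic CDF in `neg_loglik` would give the logit version of the same estimator.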

The problem of disproportionate sampling occurs in credit risk modeling: the number of defaulted borrowers is typically much smaller than the number of non-defaulted ones. However, if we use the logit or the probit model, or even the linear probability model, the estimated coefficients are not affected by the unequal sampling rates for the two groups; only the constant term is affected (Maddala 1992).
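This property can be checked by simulation. In the sketch below (the sampling rate and parameter values are assumptions for illustration), all defaulted borrowers are kept but only 10% of non-defaulted ones; for the logit, the slope is essentially unchanged while the intercept shifts by approximately $\log(1/0.1)$:

```python
# Illustrative sketch of choice-based (disproportionate) sampling in a logit.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(1)
n = 200000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-3.0, 1.0])  # rare "default" event
y = (rng.uniform(size=n) < expit(x @ beta_true)).astype(float)

def fit_logit(X, d):
    def nll(b):
        p = np.clip(expit(X @ b), 1e-12, 1 - 1e-12)
        return -np.sum(d * np.log(p) + (1 - d) * np.log(1 - p))
    return minimize(nll, np.zeros(X.shape[1]), method="BFGS").x

# Keep all defaults but only 10% of non-defaults
keep = (y == 1) | (rng.uniform(size=n) < 0.1)
b_full = fit_logit(x, y)
b_sub = fit_logit(x[keep], y[keep])
print(b_full, b_sub)
# Slopes agree; the subsample intercept is shifted by about log(1/0.1)
```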

Sometimes, the sample is limited by censoring or truncation. This occurs when we observe the independent variables for the entire sample, but for some observations we have only limited information about the dependent variable. Assume the censored variable $y_i$ is defined as

$$y_i = \begin{cases} y_i^*, & \text{if } y_i^* > 0,\\ 0, & \text{if } y_i^* \le 0. \end{cases}$$

The error term in (1) is normally distributed, $\varepsilon_i \sim N(0, \sigma^2)$. This model is known as the tobit model (Tobin's probit) or the censored normal regression model (Tobin 1958). Tobin (1958) studied household expenditures on durable goods: consumers maximize utility by purchasing durable goods under the constraint that total expenditures do not exceed income (Long 1997). A similar case is expenditure on credit products, such as a mortgage.
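The tobit likelihood combines a density term for uncensored observations with a probability term for censored ones. A minimal estimation sketch, under assumed illustrative parameter values (not from the original text):

```python
# Illustrative tobit MLE: log-likelihood mixes the normal density (uncensored)
# with the normal CDF of the censoring event (censored observations).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 10000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma_true = np.array([0.5, 1.0]), 1.0

y_star = x @ beta_true + sigma_true * rng.normal(size=n)
y = np.maximum(y_star, 0.0)   # censored at zero: y = y* if y* > 0, else 0
cens = (y_star <= 0)

def neg_loglik(theta):
    beta, log_sigma = theta[:2], theta[2]
    sigma = np.exp(log_sigma)  # parameterize so that sigma > 0
    xb = x @ beta
    # uncensored: log density; censored: log Pr(y* <= 0) = log Phi(-x'beta/sigma)
    ll_unc = norm.logpdf((y - xb) / sigma) - log_sigma
    ll_cen = norm.logcdf(-xb / sigma)
    return -np.sum(np.where(cens, ll_cen, ll_unc))

theta_hat = minimize(neg_loglik, np.zeros(3), method="BFGS").x
print(theta_hat[:2], np.exp(theta_hat[2]))  # near (0.5, 1.0) and sigma near 1.0
```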

In the case of the truncated regression model, we have no data on either $y_i^*$ or $x_i$ for some observations, because no observations are drawn if $y_i^*$ is below or above a certain threshold $\tau$ (Maddala 1992).

However, all of the above-listed models suffer from sample selection bias, leading to biased estimates. The first source is a simultaneity bias: it arises when default modeling does not take the underwriting process into consideration, even though the decision to approve or decline a credit application is based on that process. Moreover, truncation or partial observability also causes this bias; we face this issue when information about denied applicants is absent. The magnitude of the bias therefore depends on the degree of correlation between the two processes, the default process and the credit underwriting process. In addition, the completeness of the data collected on defaulted and non-defaulted borrowers, including credit history, affects the bias (Ross 2000).

For a long time, the lack of publicly available mortgage data obstructed studies on credit risk modeling for the mortgage market. For this reason, earlier works used small datasets from private sources. Modeling credit risk in the mortgage market with a correction for sample selection bias was complicated by the lack of information about both groups of applicants, approved and declined. Furthermore, to solve the issue of sample selection bias, we require factors that affect only the credit underwriting process, but not the probability of default.

The pioneering works of Heckman (1976, 1979) propose the self-selection model, which has an endogenous discrete variable. It generalizes the tobit and truncated regression models by explicitly modeling the mechanism that selects observations as censored or uncensored (Long 1997).

The aim is to estimate the model:

$$y_i = x_i'\beta + \varepsilon_i \quad (7)$$

Assume that $y_i$ is observed not when $y_i^*$ exceeds a particular threshold $\tau$, as in the tobit and truncated regression models, but based on the value of a second latent variable $z_i^*$:

$$z_i^* = w_i'\alpha + u_i \quad (8)$$

where

$z_i^*$ — a second latent variable, which is not observed,

$w_i$ — a vector of independent variables, which can have variables in common with $x_i$,

$\alpha$ — a vector of constant coefficients,

$u_i$ — the error term.

For now, we assume that $y_i$ is observed only when the unobserved latent variable $z_i^*$ exceeds a particular threshold $\tau$; in the simple case, $\tau = 0$. Otherwise, $y_i$ is unobserved.

$$y_i = \begin{cases} x_i'\beta + \varepsilon_i, & \text{if } z_i^* > 0,\\ \text{unobserved}, & \text{if } z_i^* \le 0. \end{cases} \quad (9)$$

$$z_i = \begin{cases} 1, & \text{if } z_i^* > 0,\\ 0, & \text{otherwise.} \end{cases} \quad (10)$$

Actually, we do not always observe $y_i$. What we always observe is the dichotomous variable $z_i$, which takes the value 1 if $z_i^*$ exceeds a particular threshold (put $\tau = 0$) and 0 otherwise.

As a result, we have the basic selection equation (10) and the basic outcome equation (9). In the literature this is called the Heckman model, the sample selection model, tobit type II, or the heckit model. In the special case when $y_i = z_i$, we have the tobit model. Typically, we also make the following assumptions about the distribution of, and relationship between, the error terms in Eqs. (9) and (10):

$$\varepsilon_i \sim N(0, \sigma^2) \quad (11)$$

$$u_i \sim N(0, 1) \quad (12)$$

$$\operatorname{corr}(\varepsilon_i, u_i) = \rho_{\varepsilon u} \quad (13)$$

In other words, Eqs. (11)–(13) state that we assume a bivariate normal distribution with zero means and correlation $\rho_{\varepsilon u}$. The sample selection bias problem arises when estimating $\beta$ in Eq. (7) if the error terms $\varepsilon_i$ and $u_i$ are correlated. The conditional mean equals

$$E(y_i \mid y_i \text{ is observed}) = E(y_i \mid z_i^* > 0) = E(y_i \mid z_i = 1) = x_i'\beta + E(\varepsilon_i \mid u_i > -w_i'\alpha) \quad (14)$$

If the error terms $\varepsilon_i$ and $u_i$ are independent, then the last term in (14) simplifies to $E[\varepsilon_i] = 0$, and OLS regression of $y$ on $x$ in (7) gives consistent estimates of $\beta$. However, any correlation between the two errors means that we need to evaluate $E[\varepsilon_i \mid u_i > -w_i'\alpha]$. Using derivations similar to those for the tobit model, Greene (2003) noted that

$$E[\varepsilon_i \mid u_i > -w_i'\alpha] = \rho_{\varepsilon u}\,\sigma\,\lambda_i \quad (15)$$

$$\lambda_i = \frac{\phi(w_i'\alpha)}{\Phi(w_i'\alpha)} \quad (16)$$

where

$\lambda_i$ — Heckman's $\lambda$, or the inverse Mills ratio. Thus, the conditional mean in the Heckman model is:

$$E(y_i \mid z_i = 1) = x_i'\beta + \rho_{\varepsilon u}\,\sigma\,\lambda_i \quad (17)$$
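For intuition, Heckman's $\lambda$ is straightforward to compute numerically. The sketch below (an illustration, not from the original text) shows that the inverse Mills ratio is large where selection is unlikely and near zero where selection is almost certain:

```python
# Illustrative computation of the inverse Mills ratio lambda(a) = phi(a)/Phi(a).
import numpy as np
from scipy.stats import norm

def inverse_mills(a):
    # Heckman's lambda evaluated at the selection index a = w'alpha
    return norm.pdf(a) / norm.cdf(a)

# Large when the selection probability Phi(a) is small; near zero when
# selection is almost certain
vals = inverse_mills(np.array([-2.0, 0.0, 2.0]))
print(vals)  # decreasing: roughly (2.37, 0.80, 0.06)
```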

The most widely used way to estimate the Heckman model (7)–(13) is Heckman's two-step procedure. It involves first estimating the probit model in Eq. (10) and computing Heckman's $\lambda$:

$$\hat{\lambda}_i = \frac{\phi(w_i'\hat{\alpha})}{\Phi(w_i'\hat{\alpha})} \quad (18)$$

where

$\phi$ — the standard normal density function,

$\Phi$ — the standard normal cumulative distribution function,

and then estimating the regression of $y$ on $x$ and $\hat{\lambda}$. The coefficient on $\hat{\lambda}$ indicates whether there is sample selection bias. The resulting estimators are consistent and asymptotically normal. Alternatively, an MLE version is used to estimate the Heckman model. This procedure is less robust than the two-step procedure and is sometimes difficult to get to converge, but it is more efficient (Wooldridge 2002).
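The two-step procedure can be sketched end-to-end. In the simulation below, all names and parameter values are assumptions for illustration; the third column of `w` plays the role of a variable that enters the selection equation only:

```python
# Illustrative Heckman two-step: probit for selection, then OLS with lambda.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 20000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
# w = x plus an extra variable that affects selection but not the outcome
w = np.column_stack([x, rng.normal(size=n)])
alpha_true = np.array([0.2, 0.5, 1.0])
beta_true, rho = np.array([1.0, 1.5]), 0.7

u = rng.normal(size=n)
eps = rho * u + np.sqrt(1 - rho**2) * rng.normal(size=n)  # corr(eps, u) = rho
selected = (w @ alpha_true + u > 0)   # selection rule (z = 1)
y = x @ beta_true + eps               # outcome, observed only when selected

def fit_probit(X, d):
    def nll(b):
        p = np.clip(norm.cdf(X @ b), 1e-12, 1 - 1e-12)
        return -np.sum(d * np.log(p) + (1 - d) * np.log(1 - p))
    return minimize(nll, np.zeros(X.shape[1]), method="BFGS").x

# Step 1: probit of selection on w, then Heckman's lambda for selected rows
alpha_hat = fit_probit(w, selected.astype(float))
lam = norm.pdf(w[selected] @ alpha_hat) / norm.cdf(w[selected] @ alpha_hat)

# Step 2: OLS of y on x and lambda over the selected sample
X2 = np.column_stack([x[selected], lam])
coef = np.linalg.lstsq(X2, y[selected], rcond=None)[0]
naive = np.linalg.lstsq(x[selected], y[selected], rcond=None)[0]
print(coef)   # first two entries near (1.0, 1.5); the last estimates rho * sigma
print(naive)  # biased by the omitted selection term
```

The coefficient on `lam` estimates $\rho_{\varepsilon u}\sigma$, so a value clearly different from zero signals sample selection bias.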

Technically, the Heckman model is identified even when the same independent variables appear in both the selection equation (10) and the outcome equation (9). However, to avoid issues with identification (multicollinearity and imprecise estimates), we nearly always want at least one independent variable that appears in the selection equation but not in the outcome equation (i.e., a variable that affects selection, but not the outcome).

An example of using the Heckman model is the case where we model various parameters of the credit contract but observe them only for clients who received approval. The selection equation is then the decision of the underwriting process, and the outcome equation is a parameter of the credit contract, such as the loan-to-value ratio, the loan amount, the maturity, or the contract rate.

The Heckman model is useful when the outcome equation involves a continuous dependent variable. However, when the outcome equation involves a dichotomous dependent variable, a bivariate probit model with sample selection (BVP with sample selection), also called a double probit model, is used.

$$y_{1i}^* = x_{1i}'\beta_1 + \varepsilon_{1i} \quad (19)$$

$$y_{2i}^* = x_{2i}'\beta_2 + \varepsilon_{2i} \quad (20)$$

$$y_{1i} = \begin{cases} 1, & \text{if } y_{1i}^* > 0,\\ 0, & \text{if } y_{1i}^* \le 0. \end{cases} \quad (21)$$

$$y_{2i} = \begin{cases} 1, & \text{if } y_{2i}^* > 0,\\ 0, & \text{if } y_{2i}^* \le 0. \end{cases} \quad (22)$$

In a bivariate probit model, we have two separate probit models with correlated disturbances $\varepsilon_{1i}$ and $\varepsilon_{2i}$. We typically assume that the errors follow a standard bivariate normal distribution with correlation $\rho$:

$$E(\varepsilon_{1i}) = E(\varepsilon_{2i}) = 0 \quad (23)$$

$$\operatorname{Var}(\varepsilon_{1i}) = \operatorname{Var}(\varepsilon_{2i}) = 1 \quad (24)$$

$$\operatorname{corr}(\varepsilon_{1i}, \varepsilon_{2i}) = \rho \quad (25)$$

If the errors of the two probit models are independent of one another, i.e., $\rho = 0$, then we can simply estimate the two probit models separately; in this case, it is possible to obtain consistent results. However, when $\rho \neq 0$, it is more efficient to estimate the equations jointly. To do that, we are interested in the joint probability of $y_1$ and $y_2$ in (21) and (22).

$$\Pr(y_{1i} = 1) = \Pr(\varepsilon_{1i} > -x_{1i}'\beta_1) \quad (26)$$

$$\Pr(y_{2i} = 1) = \Pr(\varepsilon_{2i} > -x_{2i}'\beta_2) \quad (27)$$

If two random variables are independent, then their joint probability is just the product of their marginal probabilities. The problem in our situation is that the two probabilities are not independent; we need to calculate joint probabilities for non-independent events. For this reason, we need to assume some joint distribution of y1 and y2. Usually, we use a bivariate normal distribution.

To estimate the bivariate probit model (19)–(25), MLE is used. It can sometimes be difficult to get the model to converge. A likelihood ratio test is used to test the hypothesis that the bivariate model fits the data better than two separate probits.
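The construction of the joint likelihood can be sketched with SciPy's bivariate normal CDF. This is an illustration under assumed parameter values, not the original estimation; rather than running a full MLE, which can be slow and hard to converge, it evaluates the log-likelihood at the true correlation and at $\rho = 0$:

```python
# Illustrative bivariate probit log-likelihood built from joint probabilities
# Pr(y1, y2) = Phi_2(q1 * x1'b1, q2 * x2'b2; q1*q2*rho), with q = 2y - 1.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(4)
n, rho = 300, 0.6
x1 = np.column_stack([np.ones(n), rng.normal(size=n)])
x2 = np.column_stack([np.ones(n), rng.normal(size=n)])
b1, b2 = np.array([0.3, 1.0]), np.array([-0.2, 0.8])

# Draw correlated standard normal errors and the two binary outcomes
e = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
y1 = (x1 @ b1 + e[:, 0] > 0).astype(int)
y2 = (x2 @ b2 + e[:, 1] > 0).astype(int)

def loglik(beta1, beta2, r):
    q1, q2 = 2 * y1 - 1, 2 * y2 - 1
    pts = np.column_stack([q1 * (x1 @ beta1), q2 * (x2 @ beta2)])
    rs = q1 * q2 * r
    p = np.array([multivariate_normal(cov=[[1, ri], [ri, 1]]).cdf(pt)
                  for pt, ri in zip(pts, rs)])
    return np.sum(np.log(np.clip(p, 1e-12, None)))

# The joint likelihood at the true correlation beats the independence fit
print(loglik(b1, b2, rho), loglik(b1, b2, 0.0))
```

A full MLE would maximize `loglik` over $\beta_1$, $\beta_2$, and $\rho$ jointly; the comparison printed here is the raw material of the likelihood ratio test mentioned above.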

By adding two more conditions

$$y_{1i}^* = \begin{cases} x_{1i}'\beta_1 + \varepsilon_{1i}, & \text{if } y_{2i} = 1,\\ \text{unobserved}, & \text{otherwise,} \end{cases} \quad (28)$$

$$y_{2i} \text{ is observed for all cases,} \quad (29)$$

we obtain the BVP model with sample selection, which is an extended version of the classical Heckman model. Similarly, Heckman's two-step procedure is used to estimate this model.

In the first step, a simple binary probit or logit model of mortgage approval (the participation/selection equation (20)) is estimated using the full sample of approved and declined applicants. The obtained estimates are then used to calculate Heckman's $\lambda$ (18) for every observation in the full sample.

The second step estimates a simple binary probit or logit model for the probability of default (the outcome equation (19)) using the calculated Heckman's $\lambda$ as an additional regressor. This two-step approach corrects for sample selection bias and provides consistent estimates of the probabilities of default. The magnitude of the bias depends both on the correlation between the error terms of the two equations and on Heckman's $\lambda$.
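The two-step procedure described above can be sketched as follows. All parameter values are illustrative assumptions; $\rho$ is set negative so that approved borrowers carry lower unobserved default risk, as one would expect from underwriting:

```python
# Illustrative two-step estimation of the BVP model with sample selection:
# approval probit on all applications, then a default probit on approved
# loans augmented with Heckman's lambda.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(5)
n = 40000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
w = np.column_stack([x, rng.normal(size=n)])   # extra underwriting-only variable
alpha = np.array([0.0, 0.4, 1.0])              # approval (selection) equation
beta = np.array([-0.5, 1.0])                   # default (outcome) equation
rho = -0.5                                     # approved borrowers: lower risk

u = rng.normal(size=n)
eps = rho * u + np.sqrt(1 - rho**2) * rng.normal(size=n)
approved = (w @ alpha + u > 0)
default = (x @ beta + eps > 0).astype(float)   # observed only for approved loans

def fit_probit(X, d):
    def nll(b):
        p = np.clip(norm.cdf(X @ b), 1e-12, 1 - 1e-12)
        return -np.sum(d * np.log(p) + (1 - d) * np.log(1 - p))
    return minimize(nll, np.zeros(X.shape[1]), method="BFGS").x

# Step 1: approval probit on the full sample of applications, then lambda
a_hat = fit_probit(w, approved.astype(float))
lam = norm.pdf(w[approved] @ a_hat) / norm.cdf(w[approved] @ a_hat)

# Step 2: default probit on approved loans, without and with the correction
b_naive = fit_probit(x[approved], default[approved])
b_corr = fit_probit(np.column_stack([x[approved], lam]), default[approved])
print(b_naive, b_corr)  # the coefficient on lambda picks up the selection effect
```

With $\rho < 0$, the coefficient on `lam` comes out negative, reflecting that the underwriting process screens out applicants with high unobserved default risk.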