## Econometrics | The plug-in solution in the omitted variable bias

April 13, 2023

Sometimes omitted variable bias occurs because a key variable that affects $Y$ is not available. For example, consider a model where the monthly salary of an individual is associated with whether the person is male or female (sex) and the years each individual has spent in education (education). Both these factors can be quantified easily and included in the model. However, if we also assume that the salary level can be affected by the socio-economic environment in which each person was raised, then it is difficult to find a variable that captures that aspect:
$$(\text { salary_level })=\beta_1+\beta_2(\text { sex })+\beta_3(\text { education })+\beta_4(\text { background })$$
Not including the background variable in this model may lead to biased and inconsistent estimates of $\beta_2$ and $\beta_3$. Our major interest, however, is to obtain appropriate estimates for those two slope coefficients. We do not care that much about $\beta_1$, and we can never hope for a consistent estimator of $\beta_4$, since background is unobserved. Therefore, a way to resolve this problem and obtain appropriate slope coefficients is to include a proxy variable for the omitted variable, such as, in this example, the family income (fm_inc) of each individual. In this case, of course, fm_inc does not have to be the same as background, but we need fm_inc to be correlated with the unobserved variable background.
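The effect of omitting background, and of plugging in a proxy for it, can be seen in a small simulation. This is only a sketch: the data-generating process, every numeric value and the helper `ols` are assumptions made for illustration, and the noise linking background to fm_inc is drawn independently of sex and education, which is what lets the proxy work here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process (all numbers are illustrative).
fm_inc = rng.normal(size=n)                                        # observed proxy
background = 1.0 + 0.8 * fm_inc + rng.normal(scale=0.5, size=n)    # unobserved
sex = rng.integers(0, 2, size=n).astype(float)                     # 0/1 dummy
education = 12.0 + 1.5 * fm_inc + rng.normal(size=n)               # correlated with background
salary = 1.0 + 0.5 * sex + 0.3 * education + 2.0 * background + rng.normal(size=n)


def ols(y, *regressors):
    """Return OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_omitted = ols(salary, sex, education)          # background left out
b_proxy = ols(salary, sex, education, fm_inc)    # fm_inc plugged in as proxy

print("true education slope: 0.3")
print("background omitted  :", round(b_omitted[2], 3))
print("fm_inc as proxy     :", round(b_proxy[2], 3))
```

In this setup the education slope is pushed well away from its true value when background is omitted, and moves back close to it once fm_inc is included.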
To illustrate this in more detail, consider the following model:
$$Y=\beta_1+\beta_2 X_2+\beta_3 X_3+\beta_4 X_4^*+u$$

where $X_2$ and $X_3$ are variables that are observed (such as sex and education), while $X_4^*$ is unobserved (such as background), but we have a variable $X_4$ that is a 'good' proxy variable for $X_4^*$ (such as fm_inc).

For $X_4$ we require at least some relationship to $X_4^*$; for example, a simple linear form such as:
$$X_4^*=\gamma_1+\gamma_2 X_4+e$$
where an error $e$ should be included because $X_4^*$ and $X_4$ are not exactly related. Obviously, if $\gamma_2=0$ then the variable $X_4$ is not an appropriate proxy for $X_4^*$, while in general we include proxies that have a positive correlation, so $\gamma_2>0$. The coefficient $\gamma_1$ is included in order to allow $X_4^*$ and $X_4$ to be measured on different scales, and in principle they can be related either positively or negatively.
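Substituting the linear proxy relation $X_4^*=\gamma_1+\gamma_2 X_4+e$ into the model makes the plug-in step explicit. The following is a sketch of the standard substitution; it assumes that $u$ and $e$ are both uncorrelated with $X_2$, $X_3$ and $X_4$, which the text above does not state:
$$Y=\beta_1+\beta_2 X_2+\beta_3 X_3+\beta_4\left(\gamma_1+\gamma_2 X_4+e\right)+u=\left(\beta_1+\beta_4 \gamma_1\right)+\beta_2 X_2+\beta_3 X_3+\beta_4 \gamma_2 X_4+\left(u+\beta_4 e\right)$$
Under that assumption, regressing $Y$ on $X_2$, $X_3$ and $X_4$ yields consistent estimates of $\beta_2$ and $\beta_3$. The estimated intercept and the coefficient on $X_4$ correspond to $\beta_1+\beta_4\gamma_1$ and $\beta_4\gamma_2$, not to $\beta_1$ and $\beta_4$ themselves.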

## Econometrics | Various functional forms

A different situation where specification errors may be found occurs when an incorrect functional form is used. The most obvious case relates to the basic assumption of having an equation that can be represented by a linear relationship. If this is not true, then a linear estimating equation might be adopted while the real population relationship is non-linear.
For example, if the true regression equation is:
$$Y=A X_2^\beta X_3^\gamma e^u$$
and we estimate the linear form given by:
$$Y=a+\beta X_2+\gamma X_3+u$$
then the parameters $\beta$ and $\gamma$ in the non-linear model represent elasticities, while $\beta$ (and $\gamma$) in the linear model show an estimate of the change in $Y$ after a one-unit change in $X_2$ (and $X_3$). Therefore, the $\beta$ and $\gamma$ estimated from the linear model are clearly incorrect estimators of the true population parameters.
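Taking logarithms of the non-linear equation gives $\ln Y=\ln A+\beta \ln X_2+\gamma \ln X_3+u$, which is linear in the parameters and can be estimated by OLS. The sketch below compares that log-log regression with the misspecified linear one on simulated data; the parameter values, sample size and the helper `ols` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical true parameters of Y = A * X2^beta * X3^gamma * exp(u).
A, beta, gamma = 2.0, 0.7, -0.4
x2 = rng.uniform(1.0, 10.0, size=n)
x3 = rng.uniform(1.0, 10.0, size=n)
u = rng.normal(scale=0.1, size=n)
y = A * x2**beta * x3**gamma * np.exp(u)


def ols(y, *regressors):
    """Return OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_linear = ols(y, x2, x3)                           # misspecified linear form
b_loglog = ols(np.log(y), np.log(x2), np.log(x3))   # log-log form

print("true elasticities :", beta, gamma)
print("log-log slopes    :", b_loglog[1:].round(3))
print("linear slopes     :", b_linear[1:].round(3))
```

The log-log slopes estimate the elasticities directly, whereas the linear slopes measure an average change in $Y$ per unit change in each regressor and so are not comparable to $\beta$ and $\gamma$.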

One way to detect incorrect functional forms is to visually inspect the pattern of the residuals. If a systematic pattern is observed in the residuals we may suspect the possibility of misspecification. However, it is also useful to know the various possible non-linear functional forms that might have to be estimated, together with the properties regarding marginal effects and elasticities. Table 8.1 presents a summary of the forms and features of the various alternative models.
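The residual inspection described above is normally done with a plot. As a rough numerical stand-in, the sketch below fits a straight line to data generated from a non-linear relation and averages the residuals within bins of the regressor. The data-generating process and the choice of five bins are assumptions for illustration, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000

# Hypothetical non-linear relation, deliberately fitted with a straight line.
x = rng.uniform(1.0, 10.0, size=n)
y = 3.0 * x**2 * np.exp(rng.normal(scale=0.1, size=n))

X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# Mean residual in five equal-count bins, ordered by x.
order = np.argsort(x)
bin_means = np.array([chunk.mean() for chunk in np.array_split(resid[order], 5)])
print(bin_means.round(2))
```

With a correctly specified model the bin means should all be close to zero. Here they are positive at both ends and negative in the middle, the kind of systematic pattern that suggests an incorrect functional form.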

