## Maximum Likelihood and Generalized Least Squares

Up to this point, we have assumed that the errors adhering to regression models are independently distributed with constant variance. This is a strong assumption, which is often untenable in practice. In this chapter, we consider estimation techniques that allow it to be relaxed. These are generalized least squares, or GLS, and generalized nonlinear least squares, or GNLS, on the one hand, and various applications of the method of maximum likelihood on the other. We treat GLS and ML together because, when ML is applied to regression models with normal errors, the estimators that result are very closely related to GLS estimators.

The plan of the chapter is as follows. First of all, in Section 9.2, we relax the assumption that the error terms are independently distributed with constant variance. ML estimation of regression models without those assumptions turns out to be conceptually straightforward and to be closely related to the method of GNLS. In Section 9.3, we discuss the geometry of GLS and consider an important special case in which OLS and GLS estimates are identical. In Section 9.4, we show how a version of the Gauss-Newton regression may be used with models estimated by GNLS. In Section 9.5, we show how GNLS is related to feasible GNLS and discuss a number of fundamental results about both GNLS and feasible GNLS. The relationship between GNLS and ML is then treated in Section 9.6. In Sections 9.7 through 9.9, we consider multivariate nonlinear regression models. Although such models may often seem very complicated, primarily because of the notational complexities of allowing for several jointly dependent variables, we show that they are actually quite straightforward to estimate by means of GNLS or ML. Finally, in Section 9.10, we discuss models for dealing with panel data and other data sets that combine time series and cross sections. In this chapter, we do not discuss one important topic in applied work, namely, the estimation of regression models with serial correlation. The enormous literature on this subject will be the topic of Chapter 10.

## Generalized Least Squares

In this section, we will consider the class of models
$$\boldsymbol{y}=\boldsymbol{x}(\boldsymbol{\beta})+\boldsymbol{u}, \quad \boldsymbol{u} \sim N(\mathbf{0}, \boldsymbol{\Omega}),$$
where $\boldsymbol{\Omega}$, an $n \times n$ positive definite matrix, is the covariance matrix of the vector of error terms $\boldsymbol{u}$. The normality assumption can of course be relaxed, but we retain it for now since we want to use the method of maximum likelihood. In some applications the matrix $\boldsymbol{\Omega}$ may be known. In others it may be known only up to a multiplicative constant, which implies that we can write $\boldsymbol{\Omega}=\sigma^2 \boldsymbol{\Delta}$, with $\boldsymbol{\Delta}$ a known $n \times n$ matrix and $\sigma^2$ an unknown positive scalar. In most applications, only the structure of $\boldsymbol{\Omega}$ will be known; one might know, for example, that it arises from a particular pattern of heteroskedasticity or serial correlation and hence depends on a certain number of parameters in a certain way. We will consider all three cases.
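
To make the three cases concrete, the following is a minimal Python sketch that builds example covariance matrices with NumPy. The function names, the exponential variance function, and the AR(1) pattern are illustrative assumptions chosen here, not specifications taken from the text; they merely show what "known up to scale" and "known structure depending on a few parameters" can look like.

```python
import numpy as np


def omega_known_up_to_scale(delta, sigma2):
    """Case 2: Omega = sigma^2 * Delta, with Delta known and sigma^2 > 0."""
    return sigma2 * np.asarray(delta, dtype=float)


def omega_heteroskedastic(z, gamma):
    """Case 3 (assumed example): diagonal Omega with variances exp(gamma * z_t)."""
    return np.diag(np.exp(gamma * np.asarray(z, dtype=float)))


def omega_ar1(n, rho, sigma2):
    """Case 3 (assumed example): stationary AR(1) errors with |rho| < 1.

    Element (t, s) equals sigma2 * rho**|t - s| / (1 - rho**2).
    """
    idx = np.arange(n)
    lags = np.abs(idx[:, None] - idx[None, :])
    return sigma2 * rho ** lags / (1.0 - rho ** 2)


def is_positive_definite(omega):
    """Check symmetry, then attempt a Cholesky factorization."""
    omega = np.asarray(omega, dtype=float)
    if not np.allclose(omega, omega.T):
        return False
    try:
        np.linalg.cholesky(omega)
        return True
    except np.linalg.LinAlgError:
        return False
```

In the first case, $\boldsymbol{\Omega}$ would simply be supplied as data. In the third case, the parameters (`gamma` or `rho` in this sketch) would have to be estimated along with $\boldsymbol{\beta}$.
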
The density of the vector $\boldsymbol{u}$ is the multivariate normal density
$$f(\boldsymbol{u})=(2 \pi)^{-n / 2}|\boldsymbol{\Omega}|^{-1 / 2} \exp \left(-\frac{1}{2} \boldsymbol{u}^{\top} \boldsymbol{\Omega}^{-1} \boldsymbol{u}\right) . \tag{9.02}$$
In order to pass from the density of the vector of error terms $\boldsymbol{u}$ to that of the vector of dependent variables $\boldsymbol{y}$, we must first replace $\boldsymbol{u}$ by $\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta})$ in (9.02) and then multiply by the absolute value of the determinant of the Jacobian matrix associated with the transformation that expresses $\boldsymbol{u}$ in terms of $\boldsymbol{y}$. This use of a Jacobian factor is analogous to what we did in Section 8.10 with scalar random variables; for details, see Appendix B. In this case, the Jacobian matrix is the identity matrix, and so the determinant is unity. Hence the likelihood function is
$$L^n(\boldsymbol{y}, \boldsymbol{\beta}, \boldsymbol{\Omega})=(2 \pi)^{-n / 2}|\boldsymbol{\Omega}|^{-1 / 2} \exp \left(-\frac{1}{2}(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta}))^{\top} \boldsymbol{\Omega}^{-1}(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta}))\right),$$
and the loglikelihood function is
$$\ell^n(\boldsymbol{y}, \boldsymbol{\beta}, \boldsymbol{\Omega})=-\frac{n}{2} \log (2 \pi)-\frac{1}{2} \log |\boldsymbol{\Omega}|-\frac{1}{2}(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta}))^{\top} \boldsymbol{\Omega}^{-1}(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta})) . \tag{9.03}$$
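
As a numerical illustration, the loglikelihood above can be evaluated without forming $\boldsymbol{\Omega}^{-1}$ explicitly. The sketch below assumes NumPy and uses a Cholesky factorization $\boldsymbol{\Omega}=\boldsymbol{C}\boldsymbol{C}^{\top}$; the function name `gaussian_loglik` and the argument `fitted` (standing for the vector $\boldsymbol{x}(\boldsymbol{\beta})$ at some chosen $\boldsymbol{\beta}$) are hypothetical names introduced here.

```python
import numpy as np


def gaussian_loglik(y, fitted, omega):
    """Evaluate the normal loglikelihood at fitted = x(beta) for a given Omega."""
    y = np.asarray(y, dtype=float)
    resid = y - np.asarray(fitted, dtype=float)
    n = y.shape[0]
    # Omega = C C^T with C lower triangular (requires Omega positive definite).
    chol = np.linalg.cholesky(np.asarray(omega, dtype=float))
    # log|Omega| = 2 * sum of the logs of the diagonal elements of C.
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    # Solve C v = resid, so that v'v = resid' Omega^{-1} resid.
    v = np.linalg.solve(chol, resid)
    return -0.5 * n * np.log(2.0 * np.pi) - 0.5 * logdet - 0.5 * (v @ v)
```

The three terms in the return statement correspond one for one to the three terms of the loglikelihood. Only the last depends on $\boldsymbol{\beta}$, through the residual vector.
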
If the matrix $\boldsymbol{\Omega}$ is known, it is clear that this function can be maximized by minimizing the generalized sum of squared residuals
$$\operatorname{SSR}(\boldsymbol{\beta} \mid \boldsymbol{\Omega})=(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta}))^{\top} \boldsymbol{\Omega}^{-1}(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta})) .$$
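
For intuition, consider the linear special case $\boldsymbol{x}(\boldsymbol{\beta})=\boldsymbol{X}\boldsymbol{\beta}$, which is an assumption made only for this sketch. Writing $\boldsymbol{\Omega}=\boldsymbol{C}\boldsymbol{C}^{\top}$, the generalized sum of squared residuals equals the ordinary sum of squared residuals from regressing $\boldsymbol{C}^{-1}\boldsymbol{y}$ on $\boldsymbol{C}^{-1}\boldsymbol{X}$, so that ordinary least squares on the transformed data yields the minimizer. The function name `gls_linear` is hypothetical.

```python
import numpy as np


def gls_linear(X, y, omega):
    """Minimize (y - X b)' Omega^{-1} (y - X b), assuming x(beta) = X beta.

    Returns the minimizing coefficient vector and the minimized generalized SSR.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Omega = C C^T with C lower triangular.
    chol = np.linalg.cholesky(np.asarray(omega, dtype=float))
    # Transform the data: X* = C^{-1} X and y* = C^{-1} y.
    X_star = np.linalg.solve(chol, X)
    y_star = np.linalg.solve(chol, y)
    # OLS on the transformed data minimizes the generalized SSR.
    beta_hat, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    resid_star = y_star - X_star @ beta_hat
    return beta_hat, float(resid_star @ resid_star)
```

In this linear case the result coincides with the closed form $(\boldsymbol{X}^{\top}\boldsymbol{\Omega}^{-1}\boldsymbol{X})^{-1}\boldsymbol{X}^{\top}\boldsymbol{\Omega}^{-1}\boldsymbol{y}$. For a genuinely nonlinear $\boldsymbol{x}(\boldsymbol{\beta})$, the same transformation applies, but the minimization would have to be carried out by an iterative nonlinear least squares routine.
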


