## Maximum Likelihood and GNLS

A second approach, widely used in place of feasible GLS when $\boldsymbol{\Omega}$ is assumed to equal $\boldsymbol{\Omega}(\boldsymbol{\alpha})$ with $\boldsymbol{\alpha}$ unknown, is the method of maximum likelihood. To use it, we must make some assumption about the distribution of the error terms (in practice, almost always an assumption of normality). This allows us to write down the appropriate loglikelihood function as a function of the $q$-vector $\boldsymbol{\alpha}$ and the $k$-vector $\boldsymbol{\beta}$.

Consider the class of models
$$\boldsymbol{y}=\boldsymbol{x}(\boldsymbol{\beta})+\boldsymbol{u}, \quad \boldsymbol{u} \sim N(\mathbf{0}, \boldsymbol{\Omega}(\boldsymbol{\alpha})) .$$
By modifying the loglikelihood function (9.03) slightly, we find that the loglikelihood function corresponding to (9.31) is
$$\begin{aligned} \ell^n(\boldsymbol{y}, \boldsymbol{\beta}, \boldsymbol{\alpha})=&-\frac{n}{2} \log (2 \pi)-\frac{1}{2} \log |\boldsymbol{\Omega}(\boldsymbol{\alpha})| \\ &-\frac{1}{2}(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta}))^{\top} \boldsymbol{\Omega}^{-1}(\boldsymbol{\alpha})(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta})) . \end{aligned}$$
There will be two sets of first-order conditions, one for $\boldsymbol{\alpha}$ and one for $\boldsymbol{\beta}$. The latter will be similar to the first-order conditions (9.05) for GNLS:
$$\boldsymbol{X}^{\top}(\hat{\boldsymbol{\beta}}) \boldsymbol{\Omega}^{-1}(\hat{\boldsymbol{\alpha}})(\boldsymbol{y}-\boldsymbol{x}(\hat{\boldsymbol{\beta}}))=\mathbf{0} .$$
The former will be rather complicated and will depend on precisely how $\boldsymbol{\Omega}$ is related to $\boldsymbol{\alpha}$. For a more detailed treatment, see Magnus (1978).
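The loglikelihood above can be evaluated numerically along the following lines. This is only a minimal sketch: the function names are hypothetical, and the AR(1) form chosen for $\boldsymbol{\Omega}(\boldsymbol{\alpha})$, with $\boldsymbol{\alpha}=(\rho, \sigma^2)$, is an illustrative assumption rather than anything specified in the text.

```python
import numpy as np


def gaussian_loglik(y, fitted, omega):
    """Loglikelihood of y = x(beta) + u with u ~ N(0, Omega).

    `fitted` holds x(beta) evaluated at the candidate beta, and
    `omega` holds Omega(alpha) evaluated at the candidate alpha.
    """
    n = y.shape[0]
    resid = y - fitted
    # slogdet avoids overflow/underflow in log|Omega| for large n.
    sign, logdet = np.linalg.slogdet(omega)
    if sign <= 0:
        raise ValueError("Omega must be positive definite")
    # Solve Omega z = resid rather than forming the explicit inverse.
    quad = resid @ np.linalg.solve(omega, resid)
    return -0.5 * n * np.log(2.0 * np.pi) - 0.5 * logdet - 0.5 * quad


def ar1_omega(n, rho, sigma2):
    """Hypothetical Omega(alpha): covariance of stationary AR(1) errors."""
    idx = np.arange(n)
    lags = np.abs(idx[:, None] - idx[None, :])
    return sigma2 / (1.0 - rho**2) * rho**lags
```

In an actual estimation, a function of this kind would be passed (with its sign reversed) to a numerical optimizer over $\boldsymbol{\beta}$ and $\boldsymbol{\alpha}$ jointly.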

In Section 8.10, we saw that the information matrix for $\boldsymbol{\beta}$ and $\sigma$ in a nonlinear regression model with covariance matrix $\sigma^2 \mathbf{I}$ is block-diagonal between $\boldsymbol{\beta}$ and $\sigma$. An analogous result turns out to be true for the model (9.31) as well: The information matrix is block-diagonal between $\boldsymbol{\beta}$ and $\boldsymbol{\alpha}$. This means that, asymptotically, the vectors $n^{1 / 2}\left(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}_0\right)$ and $n^{1 / 2}\left(\hat{\boldsymbol{\alpha}}-\boldsymbol{\alpha}_0\right)$ are independent. Thus the fact that $\hat{\boldsymbol{\alpha}}$ is estimated jointly with $\hat{\boldsymbol{\beta}}$ can be ignored, and the ML estimator $\hat{\boldsymbol{\beta}}$ will have the same asymptotic properties as the GNLS estimator computed with $\boldsymbol{\Omega}$ known and as the feasible GNLS estimator $\breve{\boldsymbol{\beta}}$.
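A heuristic sketch of why the block-diagonality holds, assuming that $\boldsymbol{x}(\boldsymbol{\beta})$ and $\boldsymbol{\Omega}(\boldsymbol{\alpha})$ are nonstochastic (or conditioned upon) and writing $\alpha_j$ for a typical element of $\boldsymbol{\alpha}$: differentiating the loglikelihood first with respect to $\boldsymbol{\beta}$ and then with respect to $\alpha_j$ gives

$$\frac{\partial \ell^n}{\partial \boldsymbol{\beta}}=\boldsymbol{X}^{\top}(\boldsymbol{\beta}) \boldsymbol{\Omega}^{-1}(\boldsymbol{\alpha})(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta})), \qquad \frac{\partial^2 \ell^n}{\partial \boldsymbol{\beta}\, \partial \alpha_j}=-\boldsymbol{X}^{\top}(\boldsymbol{\beta}) \boldsymbol{\Omega}^{-1}(\boldsymbol{\alpha}) \frac{\partial \boldsymbol{\Omega}(\boldsymbol{\alpha})}{\partial \alpha_j} \boldsymbol{\Omega}^{-1}(\boldsymbol{\alpha})(\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta})) .$$

At the true parameter values, $\boldsymbol{y}-\boldsymbol{x}(\boldsymbol{\beta}_0)=\boldsymbol{u}$ with $E(\boldsymbol{u})=\mathbf{0}$, so the cross-derivative is linear in $\boldsymbol{u}$ and has expectation zero. The $(\boldsymbol{\beta}, \boldsymbol{\alpha})$ block of the information matrix therefore vanishes.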

The above argument does not require that the error terms $u_t$ actually be normally distributed. All that we require is that the vectors $n^{1 / 2}\left(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}_0\right)$ and $n^{1 / 2}\left(\hat{\boldsymbol{\alpha}}-\boldsymbol{\alpha}_0\right)$ be asymptotically independent and $O(1)$ under whatever DGP actually generated the data. It can be shown that this is in fact the case under fairly general conditions, similar to the conditions detailed in Chapter 5 for least squares to be consistent and asymptotically normal; see White (1982) and Gouriéroux, Monfort, and Trognon (1984) for fundamental results in this area. As we saw in Section 8.1, when the method of maximum likelihood is applied to a data set for which the DGP was not in fact a special case of the model being estimated, the resulting estimator is called a quasi-ML, or QML, estimator. In practice, of course, almost all the ML estimators we use are actually QML estimators, since some of the assumptions of our models are almost always wrong.

## GLS Estimation of Multivariate Regression Models

In practice, multivariate regression models are usually estimated either by feasible GLS or by maximum likelihood, assuming normality. Except in very rare circumstances, it makes no sense to assume that $u_{t i}$ is independent of $u_{t j}$ for $i \neq j$, as we have already seen in the case of both seemingly unrelated regressions and demand systems. Depending on whether we intend to use ML or feasible GNLS, we may or may not want to assume that the vector of error terms $\boldsymbol{U}_t$ is normally distributed. We will in either case make the assumption that
$$\boldsymbol{U}_t \sim \operatorname{IID}(\mathbf{0}, \boldsymbol{\Sigma}),$$
where $\boldsymbol{\Sigma}$ is a (usually unknown) $m \times m$ covariance matrix, sometimes referred to as the contemporaneous covariance matrix. Thus we are assuming that $u_{t i}$ is correlated with $u_{t j}$ but not with $u_{s j}$ for $s \neq t$. This is of course a strong assumption, which should be tested; we will discuss one test that may sometimes be appropriate below. Under these assumptions, the generalized sum of squared residuals for the model (9.43) is
$$\sum_{t=1}^n\left(\boldsymbol{Y}_t-\boldsymbol{\xi}_t(\boldsymbol{\beta})\right) \boldsymbol{\Sigma}^{-1}\left(\boldsymbol{Y}_t-\boldsymbol{\xi}_t(\boldsymbol{\beta})\right)^{\top} .$$
Let us suppose initially that $\boldsymbol{\Sigma}$ is known. Then $\boldsymbol{\Sigma}$ may be used to transform the multivariate model (9.40) into a univariate one. Suppose that $\boldsymbol{\psi}$ is an $m \times m$ matrix (usually triangular) such that
$$\boldsymbol{\psi} \boldsymbol{\psi}^{\top}=\boldsymbol{\Sigma}^{-1} .$$
If we postmultiply each term in (9.43) by $\boldsymbol{\psi}$, we obtain the regression
$$\boldsymbol{Y}_t \boldsymbol{\psi}=\boldsymbol{\xi}_t(\boldsymbol{\beta}) \boldsymbol{\psi}+\boldsymbol{U}_t \boldsymbol{\psi} .$$
The $1 \times m$ error vector $\boldsymbol{U}_t \boldsymbol{\psi}$ has covariance matrix
$$E\left(\boldsymbol{\psi}^{\top} \boldsymbol{U}_t^{\top} \boldsymbol{U}_t \boldsymbol{\psi}\right)=\boldsymbol{\psi}^{\top} \boldsymbol{\Sigma} \boldsymbol{\psi}=\mathbf{I}_m .$$
As written, (9.47) has only one observation, and all terms are $1 \times m$ vectors. In order to run this regression, we must somehow convert these $1 \times m$ vectors into $n m \times 1$ vectors for all observations together. There is more than one way to do this.
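The transformation by $\boldsymbol{\psi}$ can be checked numerically. The sketch below is illustrative only: the matrix `sigma` and the residual array `resid` are made-up stand-ins for a known $\boldsymbol{\Sigma}$ and for the rows $\boldsymbol{Y}_t-\boldsymbol{\xi}_t(\boldsymbol{\beta})$, and the row-by-row stacking used here is just one of the possible conversions to an $nm \times 1$ vector, not necessarily the one the text goes on to use.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 3

# Hypothetical known contemporaneous covariance matrix Sigma (m x m),
# constructed so that it is symmetric and positive definite.
a = rng.standard_normal((m, m))
sigma = a @ a.T + m * np.eye(m)
sigma_inv = np.linalg.inv(sigma)

# Lower-triangular psi satisfying psi @ psi.T == inv(Sigma).
psi = np.linalg.cholesky(sigma_inv)

# Stand-in residuals Y_t - xi_t(beta): one 1 x m row per observation.
resid = rng.standard_normal((n, m))

# Generalized sum of squared residuals, summed over t.
gssr = np.einsum("ti,ij,tj->", resid, sigma_inv, resid)

# Postmultiply every row by psi, then stack the rows into an nm-vector.
stacked = (resid @ psi).reshape(n * m)
ssr = stacked @ stacked

# The transformed errors have identity covariance: psi' Sigma psi = I_m.
identity_check = psi.T @ sigma @ psi
```

Because the ordinary sum of squares of the stacked vector (`ssr`) coincides with the generalized sum of squared residuals (`gssr`), applying least squares to the transformed, stacked data minimizes the same criterion as GLS on the original multivariate model.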


