

## Prediction

The estimated mean function can be used to obtain values of the response for given values of the predictor. The two important variants of this problem are prediction and estimation of fitted values. Since prediction is more important, we discuss it first.

In prediction we have a new case, possibly a future value, not one used to estimate parameters, with observed value of the predictor $x_*$. We would like to know the value $y_*$, the corresponding response, but it has not yet been observed. If we assume that the data used to estimate the mean function are relevant to the new case, then the model fitted to the observed data can be used to predict for the new case. In the heights example, we would probably be willing to apply the fitted mean function to mother-daughter pairs alive in England at the end of the nineteenth century. Whether the prediction would be reasonable for mother-daughter pairs in other countries or in other time periods is much less clear. In Forbes's problem, we would probably be willing to apply the results for altitudes in the range he studied. Given this additional assumption, a point prediction of $y_*$, say $\tilde{y}_*$, is just
$$\tilde{y}_*=\hat{\beta}_0+\hat{\beta}_1 x_*$$
$\tilde{y}_*$ predicts the as yet unobserved $y_*$. Assuming the model is correct, the true value of $y_*$ is
$$y_*=\beta_0+\beta_1 x_*+e_*$$
where $e_*$ is the random error attached to the future value, presumably with variance $\sigma^2$. Thus, even if $\beta_0$ and $\beta_1$ were known exactly, predictions would not match true values perfectly, but would be off by a random amount with standard deviation $\sigma$. In the more usual case where the coefficients are estimated, the prediction error variability will have a second component that arises from the uncertainty in the estimates of the coefficients. Combining these two sources of variation and using Appendix A.4,
$$\operatorname{Var}\left(\tilde{y}_* \mid x_*\right)=\sigma^2+\sigma^2\left(\frac{1}{n}+\frac{\left(x_*-\bar{x}\right)^2}{\operatorname{SXX}}\right)$$
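The point prediction and its variance can be sketched numerically. The snippet below is a minimal illustration in plain Python, not code from the text. It assumes that $\sigma^2$ is replaced by the usual estimate $\hat{\sigma}^2 = \mathrm{RSS}/(n-2)$, and the function names (`fit_simple_ols`, `predict`) and the toy data are hypothetical, chosen only for this example.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = b0 + b1*x; returns the pieces needed for prediction."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # Assumption: sigma^2 is estimated by RSS / (n - 2).
    sigma2_hat = rss / (n - 2)
    return b0, b1, sigma2_hat, xbar, sxx


def predict(x_star, x, y):
    """Return the point prediction at x_star and its estimated standard error."""
    b0, b1, sigma2_hat, xbar, sxx = fit_simple_ols(x, y)
    n = len(x)
    y_tilde = b0 + b1 * x_star
    # First term: variance of the new error e_*.
    # Second term: uncertainty in the estimated coefficients.
    var_pred = sigma2_hat + sigma2_hat * (1 / n + (x_star - xbar) ** 2 / sxx)
    return y_tilde, math.sqrt(var_pred)


# Made-up toy data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

y_at_mean, se_at_mean = predict(3.0, x, y)   # new case at the sample mean of x
y_far, se_far = predict(10.0, x, y)          # new case far outside the observed range
print(y_at_mean, se_at_mean)
print(y_far, se_far)
```

Because of the $(x_*-\bar{x})^2$ term, the standard error is smallest when the new predictor value equals $\bar{x}$ and grows as the new case moves away from the observed data, which is consistent with the caution above about applying the fit outside the range studied.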

## The Coefficient of Determination, $R^2$

Ignoring all possible predictors, the best prediction of a response $y$ would simply be the sample average $\bar{y}$ of the values of the response observed in the data. The total sum of squares SYY $=\Sigma\left(y_i-\bar{y}\right)^2$ is the observed total variation of the response, ignoring any and all predictors. The total sum of squares is the sum of squared deviations from the horizontal line illustrated in Figure 2.4.
When we include a predictor, the unexplained variation is given by RSS, the sum of squared deviations from the fitted line, as shown in Figure 2.4. The difference between these sums of squares is called the sum of squares due to regression, SSreg, defined by
$$\text { SSreg }=\text { SYY }- \text { RSS } \tag{2.18}$$
We can get a computing formula for SSreg by substituting for RSS from (2.8),
$$\text { SSreg }=\text { SYY }-\left(\text { SYY }-\frac{(\mathrm{SXY})^2}{\mathrm{SXX}}\right)=\frac{(\mathrm{SXY})^2}{\mathrm{SXX}}$$
If both sides of (2.18) are divided by SYY, we get
$$\frac{\text { SSreg }}{\text { SYY }}=1-\frac{\text { RSS }}{\text { SYY }} \tag{2.20}$$
The left-hand side of (2.20) is the proportion of observed variability in the response explained by regression on the predictor. The right-hand side is one minus the proportion of variability that remains unexplained. This concept of dividing up the total variability according to whether or not it is explained is of sufficient importance that a special name is given to it. We define $R^2$, the coefficient of determination, to be
$$R^2=\frac{\text { SSreg }}{\mathrm{SYY}}=1-\frac{\mathrm{RSS}}{\mathrm{SYY}}$$
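As a rough numerical check of this decomposition, the sketch below computes SYY, RSS, and SSreg for a simple regression and confirms that SSreg agrees with the computing formula $(\mathrm{SXY})^2/\mathrm{SXX}$. It is an illustrative plain-Python example, not code from the text; the function name `r_squared` and the toy data are hypothetical.

```python
def r_squared(x, y):
    """Return the sums of squares and R^2 for the least-squares line of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)          # total variation, ignoring x
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained variation
    ssreg = syy - rss                                 # variation explained by regression
    return {
        "SYY": syy,
        "RSS": rss,
        "SSreg": ssreg,
        "SSreg_formula": sxy ** 2 / sxx,              # computing formula for SSreg
        "R2": ssreg / syy,
    }


# Made-up toy data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

result = r_squared(x, y)
print(result)
```

Both routes to SSreg should match up to floating-point rounding, and $R^2$ always lies between 0 and 1, reaching 1 only when every point falls exactly on the fitted line.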


