# Linear Regression Analysis | STAT2220

## The Wald t Test

Investigators often hope to examine $\beta_k$ in order to determine the importance of the predictor $x_k$ in the model; however, $\beta_k$ is the coefficient for $x_k$ given that the other predictors are in the model. Hence $\beta_k$ depends strongly on the other predictors in the model. Suppose that the model has an intercept: $x_1 \equiv 1$. The predictor $x_k$ is highly correlated with the other predictors if the OLS regression of $x_k$ on $x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_p$ has a high coefficient of determination $R_k^2$. If this is the case, then often $x_k$ is not needed in the model given that the other predictors are in the model. If at least one $R_k^2$ is high for $k \geq 2$, then there is multicollinearity among the predictors.

As an example, suppose that $Y=$ height, $x_1 \equiv 1, x_2=$ left leg length, and $x_3=$ right leg length. Then $x_2$ should not be needed given $x_3$ is in the model and $\beta_2=0$ is reasonable. Similarly $\beta_3=0$ is reasonable. On the other hand, if the model only contains $x_1$ and $x_2$, then $x_2$ is extremely important with $\beta_2$ near 2. If the model contains $x_1, x_2, x_3, x_4=$ height at shoulder, $x_5=$ right arm length, $x_6=$ head length, and $x_7=$ length of back, then $R_i^2$ may be high for each $i \geq 2$. Hence $x_i$ is not needed in the MLR model for $Y$ given that the other predictors are in the model.
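The quantity $R_k^2$ can be checked numerically by regressing one predictor on the others. The sketch below does this for the leg length example, assuming NumPy is available. The data are simulated purely for illustration (the means, spreads, and sample size are assumptions, not values from the text), and `r_squared` is a hypothetical helper name.

```python
import numpy as np

def r_squared(y, X):
    """Coefficient of determination from the OLS regression of y on the columns of X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sse = resid @ resid                      # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)        # total sum of squares
    return 1.0 - sse / sst

# Simulated (hypothetical) data: both leg lengths track one underlying length.
rng = np.random.default_rng(0)
n = 100
leg = rng.normal(80.0, 5.0, n)
ones = np.ones(n)                            # x_1, the intercept column
left_leg = leg + rng.normal(0.0, 0.2, n)     # x_2
right_leg = leg + rng.normal(0.0, 0.2, n)    # x_3

# R_2^2: regress x_2 on the remaining predictors x_1 and x_3.
r2_left = r_squared(left_leg, np.column_stack([ones, right_leg]))
print(f"R_2^2 = {r2_left:.4f}")
```

A value of $R_2^2$ close to 1 here signals that left leg length adds little once right leg length is in the model, which is the multicollinearity described above.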

Definition 2.23. The $100(1-\delta) \%$ CI for $\beta_k$ is $\hat{\beta}_k \pm t_{n-p, 1-\delta / 2} \operatorname{se}\left(\hat{\beta}_k\right)$. If the degrees of freedom $d=n-p \geq 30$, the $\mathrm{N}(0,1)$ cutoff $z_{1-\delta / 2}$ may be used.

Know how to do the 4 step Wald $t$-test of hypotheses.
i) State the hypotheses Ho: $\beta_k=0$ Ha: $\beta_k \neq 0$.
ii) Find the test statistic $t_{o, k}=\hat{\beta}_k / \operatorname{se}\left(\hat{\beta}_k\right)$ or obtain it from output.
iii) Find pval from output or use the $t$-table: pval =
$$2 P\left(t_{n-p}<-\left|t_{o, k}\right|\right)=2 P\left(t_{n-p}>\left|t_{o, k}\right|\right).$$
Use the normal table or the $d=Z$ line in the $t$-table if the degrees of freedom $d=n-p \geq 30$. Again pval is the estimated p-value.
iv) State whether you reject Ho or fail to reject Ho and give a nontechnical sentence restating your conclusion in terms of the story problem.
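The four steps can be sketched in code as follows, assuming NumPy is available. Several things here are assumptions rather than part of the text: the function name `wald_test`, the simulated data, and the standard OLS formula $\operatorname{se}(\hat{\beta}_k)=\sqrt{MSE\,[(\boldsymbol{X}^T\boldsymbol{X})^{-1}]_{kk}}$. To stay within the Python standard library, the sketch uses only the $\mathrm{N}(0,1)$ cutoff, so it refuses to run unless $n-p \geq 30$; a $t$ distribution would be needed for smaller $d$.

```python
import numpy as np
from statistics import NormalDist

def wald_test(X, y, k, delta=0.05):
    """Wald test of Ho: beta_k = 0 with the N(0,1) cutoff (requires n - p >= 30).

    k is the 0-based column index of the predictor in X.
    """
    n, p = X.shape
    if n - p < 30:
        raise ValueError("normal cutoff assumes n - p >= 30; use the t table otherwise")
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y                 # OLS estimator
    resid = y - X @ beta_hat
    mse = resid @ resid / (n - p)                # estimate of the error variance
    se = np.sqrt(mse * np.diag(xtx_inv))         # standard errors of beta_hat
    t_stat = float(beta_hat[k] / se[k])          # step ii)
    z = NormalDist()
    pval = 2.0 * (1.0 - z.cdf(abs(t_stat)))      # step iii), normal approximation
    cutoff = z.inv_cdf(1.0 - delta / 2.0)
    ci = (float(beta_hat[k] - cutoff * se[k]), float(beta_hat[k] + cutoff * se[k]))
    return {"estimate": float(beta_hat[k]), "se": float(se[k]),
            "t": t_stat, "pval": pval, "ci": ci, "reject": pval < delta}

# Simulated (hypothetical) data: Y depends on x_2 but not on x_3.
rng = np.random.default_rng(1)
n = 100
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x2, x3])
y = 1.0 + 2.0 * x2 + rng.normal(size=n)

result = wald_test(X, y, k=1)                    # test Ho: beta_2 = 0
print(result)
```

Step iv), the nontechnical sentence, still has to be written by hand: the `reject` flag only records whether pval fell below $\delta$.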

## Two Important Special Cases

When studying a statistical model, it is often useful to try to understand the model that contains a constant but no nontrivial predictors, then try to understand the model with a constant and one nontrivial predictor, then the model with a constant and two nontrivial predictors, and then the general model with many predictors. In this text, most of the models are such that $Y$ is independent of $\boldsymbol{x}$ given $\boldsymbol{x}^T \boldsymbol{\beta}$, written
$$Y \perp\!\!\!\perp \boldsymbol{x} \mid \boldsymbol{x}^T \boldsymbol{\beta}.$$
Then $w_i=\boldsymbol{x}_i^T \hat{\boldsymbol{\beta}}$ is a scalar, and trying to understand the model in terms of $\boldsymbol{x}_i^T \hat{\boldsymbol{\beta}}$ is about as easy as trying to understand the model in terms of one nontrivial predictor. In particular, the response plot of $\boldsymbol{x}_i^T \hat{\boldsymbol{\beta}}$ versus $Y_i$ is essential.

For MLR, the two main benefits of studying the MLR model with one nontrivial predictor $X$ are that the data can be plotted in a scatterplot of $X_i$ versus $Y_i$ and that the OLS estimators can be computed by hand with the aid of a calculator if $n$ is small.
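As a rough illustration of the hand computation, the sketch below fits the model with a constant and one nontrivial predictor using only sums, much as one would with a calculator. The closed forms used, slope $=\sum(X_i-\bar{X})(Y_i-\bar{Y}) / \sum(X_i-\bar{X})^2$ and intercept $=\bar{Y}-\text{slope}\cdot\bar{X}$, are the standard OLS formulas and are not stated in the passage above. The function name `simple_ols` and the data are made up for the example.

```python
def simple_ols(x, y):
    """OLS intercept and slope for Y = beta_1 + beta_2 * X + e, using sums only."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Small made-up data set (n = 5), the kind that can be checked by hand.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
intercept, slope = simple_ols(x, y)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```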

The location model
$$Y_i=\mu+e_i, \quad i=1, \ldots, n$$
is a special case of the multiple linear regression model where $p=1, \boldsymbol{X}=\mathbf{1}$, and $\boldsymbol{\beta}=\beta_1=\mu$. This model contains a constant but no nontrivial predictors.
In the location model, $\hat{\boldsymbol{\beta}}_{O L S}=\hat{\beta}_1=\hat{\mu}=\bar{Y}$. To see this, notice that
$$Q_{O L S}(\eta)=\sum_{i=1}^n\left(Y_i-\eta\right)^2 \text { and } \frac{d Q_{O L S}(\eta)}{d \eta}=-2 \sum_{i=1}^n\left(Y_i-\eta\right).$$
Setting the derivative equal to 0 and calling the solution $\hat{\mu}$ gives $\sum_{i=1}^n Y_i=n \hat{\mu}$ or $\hat{\mu}=\bar{Y}$. The second derivative
$$\frac{d^2 Q_{O L S}(\eta)}{d \eta^2}=2 n>0,$$
hence $\hat{\mu}$ is the global minimizer.
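The derivation can be checked numerically on a small made-up sample. In this sketch the function names `q_ols` and `dq_ols` and the data are illustrative assumptions; the formulas are the criterion $Q_{OLS}(\eta)$ and its first derivative from the derivation above.

```python
def q_ols(eta, y):
    """OLS criterion Q_OLS(eta) = sum of (Y_i - eta)^2."""
    return sum((yi - eta) ** 2 for yi in y)

def dq_ols(eta, y):
    """First derivative dQ_OLS/d eta = -2 * sum of (Y_i - eta)."""
    return -2.0 * sum(yi - eta for yi in y)

# Made-up sample for illustration.
y = [2.0, 4.0, 4.0, 5.0, 10.0]
mu_hat = sum(y) / len(y)          # sample mean, the claimed minimizer

print("mu_hat        =", mu_hat)
print("dQ at mu_hat  =", dq_ols(mu_hat, y))
print("Q at mu_hat   =", q_ols(mu_hat, y))
print("Q at mu_hat+1 =", q_ols(mu_hat + 1.0, y))
```

The derivative vanishes at the sample mean, and moving $\eta$ away from $\bar{Y}$ in either direction increases $Q_{OLS}$, as the positive second derivative $2n$ implies.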
