## THE MULTIPLE LINEAR REGRESSION MODEL

The general multiple linear regression model with response $Y$ and regressors $X_1, \ldots, X_p$ will have the form
$$\mathrm{E}(Y \mid X)=\beta_0+\beta_1 X_1+\cdots+\beta_p X_p \tag{3.3}$$
The symbol $X$ in $\mathrm{E}(Y \mid X)$ means that we are conditioning on all the regressors on the right side of the equation. When we are conditioning on specific values for the predictors $x_1, \ldots, x_p$ that we will collectively call $\mathbf{x}$, we write
$$\mathrm{E}(Y \mid X=\mathbf{x})=\beta_0+\beta_1 x_1+\cdots+\beta_p x_p$$
As in Chapter 2, the $\beta$s are unknown parameters to be estimated. When $p=1$, $X$ has only one element, and we get the simple regression problem discussed in Chapter 2. When $p=2$, the mean function (3.3) corresponds to a plane in three dimensions. When $p>2$, the fitted mean function is a hyperplane, the generalization of a $p$-dimensional plane in a $(p+1)$-dimensional space. We cannot draw a general $p$-dimensional plane in our three-dimensional world.
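
The mean function can be illustrated numerically. The following is a minimal sketch, assuming Python with NumPy. The helper name `mean_function`, the parameter values, and the simulated data are all hypothetical. Ordinary least squares appears only as one familiar way of estimating the $\beta$s; the text above does not prescribe an estimator.

```python
import numpy as np

def mean_function(beta, x):
    """Evaluate E(Y | X = x) = beta_0 + beta_1 x_1 + ... + beta_p x_p.

    beta has length p + 1 (intercept first); x has length p.
    """
    beta = np.asarray(beta, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta.shape[0] != x.shape[0] + 1:
        raise ValueError("beta must have one more element than x")
    return beta[0] + beta[1:] @ x

# Hypothetical parameter values with p = 2, so the mean function is a plane.
beta = [1.0, 2.0, -0.5]
value = mean_function(beta, [3.0, 4.0])  # 1 + 2*3 - 0.5*4

# Simulate data around this plane and estimate the betas by least squares
# (an assumption of this sketch, not something the section specifies).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = beta[0] + X @ np.array(beta[1:]) + rng.normal(scale=0.1, size=n)
design = np.column_stack([np.ones(n), X])  # leading column of ones for beta_0
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
```

With small simulated noise, `beta_hat` should land close to the hypothetical `beta`, which is the sense in which the parameters are "estimated" from data.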

## PREDICTORS AND REGRESSORS

Regression problems start with a collection of potential predictors. Some of these may be continuous measurements, like the height or weight of an object. Some may be discrete but ordered, like a doctor’s rating of overall health of a patient on a nine-point scale. Other potential predictors can be categorical, like eye color or an indicator of whether a particular unit received a treatment.

All these types of potential predictors can be useful in multiple linear regression.

From the pool of potential predictors, we create a set of regressors, the $X$-variables that appear in (3.3). The regressors might include:

**The intercept.** Suppose we define $\mathbf{1}$ to be a regressor that is always equal to 1. The mean function (3.3) can be rewritten as
$$\mathrm{E}(Y \mid X)=\beta_0 \mathbf{1}+\beta_1 X_1+\cdots+\beta_p X_p$$
Mean functions without an intercept would not have this regressor included. In most computer programs, an intercept is included unless it is specifically suppressed.

**Predictors.** The simplest type of regressor is equal to a predictor, for example, the variable mheight in the heights data or fertility in the UN data.

**Transformations of predictors.** Sometimes the original predictors need to be transformed in some way to make (3.3) hold to a reasonable approximation. This was the case in the UN data, in which ppgdp was used in log scale. The willingness to replace predictors by transformations of them greatly expands the range of problems that can be summarized with a linear regression model.

**Polynomials.** Problems with curved mean functions can sometimes be accommodated in the multiple linear regression model by including polynomial regressors in the predictor variables. For example, we might include as regressors both a predictor $X_1$ and its square $X_1^2$ to fit a quadratic polynomial in that predictor. Complex polynomial surfaces in several predictors can be useful in some problems, as will be discussed in Section 5.3.

**Interactions and other combinations of predictors.** Combining several predictors is often useful. An example of this is using body mass index, given by weight in kilograms divided by height in meters squared, in place of both height and weight, or using a total test score in place of the separate scores from each of several parts. Products of regressors, called interactions, are often included in a mean function along with the base regressors to allow for joint effects.

**Dummy variables and factors.** A categorical predictor with two or more levels is called a factor. Factors are included in multiple linear regression using dummy variables, which are typically regressors that have only two values, often 0 and 1, indicating which category is present for a particular observation. We will see in Chapter 5 that a categorical predictor with two categories can be represented by one dummy variable, while a categorical predictor with many categories can require several dummy variables.

**Regression splines.** Polynomials represent the effect of a predictor by using a sum of regressors, like $\beta_1 x+\beta_2 x^2+\beta_3 x^3$. We can view this as a linear combination of basis functions, given in the polynomial case by the functions $\left\{x, x^2, x^3\right\}$. Using splines is similar to fitting a polynomial, except we use different basis functions that can have useful properties under some circumstances. We return to the use of splines in Section 5.4.

**Principal components.** In some problems we may have a large number of predictors that are thought to be related. For example, we could have predictors that correspond to the amount of a particular drug that is present in repeated samples on the same subject. Suppose $X_1, \ldots, X_m$ are $m$ such predictors. For clarity, we may wish to replace these $m$ predictors by a single regressor $Z=\sum a_j X_j$, where $Z$ summarizes the information in the multiple indicators as fully as possible. One way to do this is to set all the $a_j=1 / m$, and then $Z$ is just the average of the $X_j$. Alternatively, the $a_j$ can be chosen to satisfy some criterion, such as maximizing the variance of $Z$ subject to a normalization like $\sum a_j^2=1$.
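
Several of the regressor types listed above can be built directly from raw predictors. The sketch below assumes Python with NumPy. The function names `build_regressors` and `combine_predictors`, the predictor names, and all data values are hypothetical. Two conventions here are assumptions of the sketch rather than statements from the text: a factor with $k$ levels gets $k-1$ dummy variables with the first sorted level as baseline, and the weights $a_j$ that maximize the variance of $Z$ are constrained to unit length.

```python
import numpy as np

def build_regressors(height_m, weight_kg, income, treated, eye_color):
    """Return (names, matrix) of regressors built from five predictors."""
    height_m = np.asarray(height_m, dtype=float)
    weight_kg = np.asarray(weight_kg, dtype=float)
    income = np.asarray(income, dtype=float)
    treated = np.asarray(treated, dtype=float)
    eye_color = np.asarray(eye_color)

    columns = {
        "intercept": np.ones_like(height_m),      # regressor always equal to 1
        "height": height_m,                       # predictor used as is
        "log_income": np.log(income),             # transformation of a predictor
        "height_sq": height_m ** 2,               # polynomial regressor
        "bmi": weight_kg / height_m ** 2,         # combination of two predictors
        "treated": treated,                       # two-level factor: one dummy
        "treated_x_height": treated * height_m,   # interaction (product)
    }
    # Assumed convention: k levels -> k - 1 dummies, first sorted level is baseline.
    levels = sorted(set(eye_color.tolist()))
    for level in levels[1:]:
        columns[f"eye_{level}"] = (eye_color == level).astype(float)

    names = list(columns)
    return names, np.column_stack([columns[k] for k in names])

def combine_predictors(X):
    """Return the equal-weight average, the first principal component, and its weights."""
    X = np.asarray(X, dtype=float)
    m = X.shape[1]
    z_mean = X @ np.full(m, 1.0 / m)              # a_j = 1/m gives the row average
    centered = X - X.mean(axis=0)                 # centering shifts Z by a constant only
    # Rows of vt are unit-length weight vectors; the first one maximizes
    # the variance of Z among all unit-length choices of the a_j.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    a = vt[0]
    return z_mean, centered @ a, a

names, R = build_regressors(
    height_m=[1.60, 1.75, 1.82, 1.68],
    weight_kg=[55.0, 72.0, 90.0, 61.0],
    income=[1000.0, 20000.0, 5000.0, 300.0],
    treated=[0, 1, 1, 0],
    eye_color=["brown", "blue", "green", "brown"],
)

rng = np.random.default_rng(1)
repeated = rng.normal(size=(50, 3)) * np.array([1.0, 2.0, 3.0])  # made-up repeated measurements
z_mean, z_pc, weights = combine_predictors(repeated)
```

Five predictors produce nine regressors in this example, which illustrates why the number of regressors $p$ need not equal the number of predictors. Spline bases are omitted here because their construction depends on choices the section defers to Section 5.4.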


