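R is a programming language for statistical computing and graphics, supported by the R Core Team and the R Foundation for Statistical Computing. R was created by the statisticians Ross Ihaka and Robert Gentleman and is used by data miners and statisticians for data analysis and for developing statistical software. Users have created packages that extend the functionality of the R language.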

## Covariance

As always, visualizations are great (necessary, even), but on most occasions we are going to quantify these correlations and summarize them with numbers.
The simplest widely used measure of correlation is the covariance. For each pair of values from the two variables, the differences from their respective means are taken, and those differences are multiplied. If both differences are positive (that is, both values are above their respective means), the product is positive. If both values are below their respective means, the product is still positive, because the product of two negative numbers is positive. Only when one value is above its mean and the other is below its mean is the product negative. The covariance is the sum of these products divided by $n-1$:
$$\operatorname{cov}_{x y}=\frac{\sum(x-\bar{x})(y-\bar{y})}{(n-1)}$$

Remember, in sample statistics we divide by the degrees of freedom and not the sample size. Note also that, because the formula pairs each value of $x$ with a value of $y$, the covariance is only defined for two vectors that have the same length.

We can find the covariance between two variables in R using the cov function. Let’s find the covariance between the heights and weights in the built-in women dataset:

    > cov(women$weight, women$height)
    [1] 69
    > # the order in which we pass the two columns
    > # as arguments doesn't matter
    > cov(women$height, women$weight)
    [1] 69

The covariance is positive, which denotes a positive relationship between the two variables.
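The value returned by cov can be reproduced directly from the formula. The following is a minimal sketch, in which the names x, y, dx, dy, and manual_cov are illustrative and not part of the original text:

```r
# Built-in dataset: heights (inches) and weights (pounds) of 15 women
x <- women$height
y <- women$weight

# Deviations of each value from its own mean
dx <- x - mean(x)
dy <- y - mean(y)

# Sum the pairwise products and divide by the degrees of freedom (n - 1)
manual_cov <- sum(dx * dy) / (length(x) - 1)

manual_cov                        # hand-computed covariance
all.equal(manual_cov, cov(x, y))  # compare with the built-in cov()
```

The hand-computed value should agree with the 69 reported by cov above.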

The covariance, by itself, is difficult to interpret. It is especially difficult to interpret in this case, because the measurements use different scales: inches and pounds. It is also heavily dependent on the variability in each variable.

Consider what happens when we take the covariance of the weights in pounds and the heights in centimeters.

    > # there are 2.54 centimeters in each inch
    > # changing the units to centimeters increases
    > # the variability within the height variable
    > cov(women$height * 2.54, women$weight)
    [1] 175.26

Semantically speaking, the relationship hasn’t changed, so why should the covariance?
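The size of the change follows from the formula: multiplying one variable by a constant multiplies each of its deviations from the mean, and therefore the covariance, by that same constant. A quick check, assuming the illustrative names height_cm, cov_in, and cov_cm:

```r
# Convert heights from inches to centimeters (2.54 cm per inch)
height_cm <- women$height * 2.54

cov_in <- cov(women$height, women$weight)  # inches and pounds
cov_cm <- cov(height_cm, women$weight)     # centimeters and pounds

# The ratio of the two covariances equals the conversion factor
cov_cm / cov_in

# The standard deviation of height grows by the same factor
sd(height_cm) / sd(women$height)
```

Both ratios should come out as 2.54 (up to floating-point rounding), which shows that the covariance reflects the units of measurement as well as the strength of the relationship.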

## Correlation coefficients

A solution to this quirk of covariance is to use Pearson’s correlation coefficient instead. Outside its colloquial context, when the word correlation is uttered, especially by analysts, statisticians, or scientists, it usually refers to Pearson’s correlation.
Pearson’s correlation coefficient differs from covariance in that, instead of using the sum of the products of the deviations from the mean in the numerator, it uses the sum of the products of the number of standard deviations away from the mean. These numbers of standard deviations from the mean are called z-scores. If a value has a z-score of $1.5$, it is $1.5$ standard deviations above the mean; if a value has a z-score of $-2$, it is $2$ standard deviations below the mean.

Pearson’s correlation coefficient is usually denoted by $r$ and its equation is given as follows:
$$r=\frac{\sum(x-\bar{x})(y-\bar{y})}{(n-1) s_x s_y}$$
which is the covariance divided by the product of the two variables’ standard deviations.
An important consequence of using standardized z-scores instead of the magnitude of the distance from the mean is that changing the variability in one variable does not change the correlation coefficient. Now you can meaningfully compare values using two different scales or even two different distributions. The correlation between weight and height in inches and the correlation between weight and height in centimeters will now be identical, because multiplying by $2.54$ does not change the z-score of any height.

    > cor(women$height, women$weight)
    [1] 0.9954948
    > cor(women$height * 2.54, women$weight)
    [1] 0.9954948
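The z-score description of Pearson’s correlation can also be checked numerically. In this sketch, z_score is a hypothetical helper written for illustration, and zx, zy, and manual_r are likewise illustrative names:

```r
# Hypothetical helper: convert a vector to z-scores
# (number of sample standard deviations from the mean)
z_score <- function(v) (v - mean(v)) / sd(v)

zx <- z_score(women$height)
zy <- z_score(women$weight)

# r is the sum of the products of the z-scores, divided by n - 1
manual_r <- sum(zx * zy) / (length(zx) - 1)
all.equal(manual_r, cor(women$height, women$weight))

# Equivalently: the covariance divided by the product of the standard deviations
cov(women$height, women$weight) / (sd(women$height) * sd(women$weight))

# Rescaling height to centimeters leaves its z-scores unchanged
all.equal(zx, z_score(women$height * 2.54))
```

Because the z-scores are unaffected by the change of units, any quantity built from them, including the correlation coefficient, is unaffected too.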
Another important and helpful consequence of this standardization is that the measure of correlation will always range from $-1$ to $1$. A Pearson correlation coefficient of $1$ denotes a perfectly positive (linear) relationship, an $r$ of $-1$ denotes a perfectly negative (linear) relationship, and an $r$ of $0$ denotes no (linear) relationship.
Why the linear qualification in parentheses, though?
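One way to see why the qualification matters is to compare exactly linear data with data that follow a curve. The vector x below is made up for illustration and does not come from the women dataset:

```r
# Illustrative data, symmetric around zero
x <- -5:5

cor(x, 2 * x + 1)   # points on an increasing straight line
cor(x, -3 * x)      # points on a decreasing straight line
cor(x, x^2)         # y is fully determined by x, but not linearly
```

The first two calls give an $r$ of $1$ and $-1$ (up to floating-point rounding). The third gives an $r$ of essentially $0$ even though y depends entirely on x, because Pearson’s $r$ only measures how well a straight line describes the relationship.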
