## PCA with Robustness to Missing Entries

Recall from Section 2.1.2 that in the PCA problem, we are given $N$ data points $\mathcal{X} \doteq\left\{\boldsymbol{x}_{j} \in \mathbb{R}^{D}\right\}_{j=1}^{N}$ drawn (approximately) from a $d$-dimensional affine subspace $S \doteq\{\boldsymbol{x}=\boldsymbol{\mu}+U \boldsymbol{y}\}$, where $\boldsymbol{\mu} \in \mathbb{R}^{D}$ is an arbitrary point in $S$, $U \in \mathbb{R}^{D \times d}$ is a basis for $S$, and $\mathcal{Y}=\left\{\boldsymbol{y}_{j} \in \mathbb{R}^{d}\right\}_{j=1}^{N}$ are the principal components.

In this section, we consider the PCA problem in the case that some of the given data points are incomplete. A data point $\boldsymbol{x}=\left[x_{1}, x_{2}, \ldots, x_{D}\right]^{\top}$ is said to be incomplete when some of its entries are missing or unspecified. For instance, if the $i$th entry $x_{i}$ of $\boldsymbol{x}$ is missing, then $\boldsymbol{x}$ is known only up to a line in $\mathbb{R}^{D}$, i.e.,
$$\begin{aligned} \boldsymbol{x} \in L & \doteq\left\{\left[x_{1}, \ldots, x_{i-1}, x_{i}, x_{i+1}, \ldots, x_{D}\right]^{\top},\ x_{i} \in \mathbb{R}\right\} \\ &=\left\{\boldsymbol{x}_{-i}+x_{i} \boldsymbol{e}_{i},\ x_{i} \in \mathbb{R}\right\}, \end{aligned}$$
where $\boldsymbol{x}_{-i}=\left[x_{1}, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_{D}\right]^{\top} \in \mathbb{R}^{D}$ is the vector $\boldsymbol{x}$ with its $i$th entry zeroed out and $\boldsymbol{e}_{i}=[0, \ldots, 0,1,0, \ldots, 0]^{\top} \in \mathbb{R}^{D}$ is the $i$th basis vector. More generally, if the point $\boldsymbol{x}$ has $M$ missing entries, without loss of generality we can partition it as $\left[\begin{array}{l}\boldsymbol{x}_{U} \\ \boldsymbol{x}_{O}\end{array}\right]$, where $\boldsymbol{x}_{U} \in \mathbb{R}^{M}$ denotes the unobserved entries and $\boldsymbol{x}_{O} \in \mathbb{R}^{D-M}$ denotes the observed entries. Thus, $\boldsymbol{x}$ is known only up to the following $M$-dimensional affine subspace:
$$\boldsymbol{x} \in L \doteq\left\{\left[\begin{array}{c} \mathbf{0} \\ \boldsymbol{x}_{O} \end{array}\right]+\left[\begin{array}{c} I_{M} \\ 0 \end{array}\right] \boldsymbol{x}_{U},\ \boldsymbol{x}_{U} \in \mathbb{R}^{M}\right\}.$$

### Incomplete PCA When the Subspace Is Known

Let us first consider the simplest case, in which the subspace $S$ is known. Then we know that the point $\boldsymbol{x}$ belongs to both $L$ and $S$. Therefore, given the parameters $\boldsymbol{\mu}$ and $U$ of the subspace $S$, we can compute the principal components $\boldsymbol{y}$ and the missing entries $\boldsymbol{x}_{U}$ by intersecting $L$ and $S$. In the case of one missing entry (illustrated in Figure 3.1), the intersection point can be computed from
$$\boldsymbol{x}=\boldsymbol{x}_{-i}+x_{i} \boldsymbol{e}_{i}=\boldsymbol{\mu}+U \boldsymbol{y} \quad \Longrightarrow \quad \left[\begin{array}{cc}U & -\boldsymbol{e}_{i}\end{array}\right]\left[\begin{array}{c}\boldsymbol{y} \\ x_{i}\end{array}\right]=\boldsymbol{x}_{-i}-\boldsymbol{\mu}.$$
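
The linear system for the single-missing-entry case can be sketched numerically as follows. This is only an illustrative sketch: the function name `complete_one_missing` and the toy subspace are assumptions made for the example, and the system is solved in the least-squares sense with NumPy, which coincides with the exact intersection when the matrix $[U \;\; -\boldsymbol{e}_i]$ has full column rank and the point really lies in $S$.

```python
import numpy as np

def complete_one_missing(x_minus_i, i, mu, U):
    """Solve [U, -e_i] [y; x_i] = x_{-i} - mu for y and the missing entry x_i.

    Hypothetical helper for illustration. Assumes [U, -e_i] has full
    column rank, so that the line L meets the subspace S in one point.
    """
    D, d = U.shape
    e_i = np.zeros(D)
    e_i[i] = 1.0
    A = np.hstack([U, -e_i[:, None]])          # D x (d + 1) system matrix
    sol, *_ = np.linalg.lstsq(A, x_minus_i - mu, rcond=None)
    return sol[:d], sol[d]                     # principal components y, entry x_i

# Toy example: a line (d = 1) in R^3.
mu = np.array([1.0, 0.0, -1.0])
U = np.array([[1.0], [2.0], [2.0]]) / 3.0      # unit-norm basis vector
x_true = mu + U @ np.array([3.0])              # the point [2, 2, 1]

i = 1                                          # pretend entry 1 is missing
x_minus_i = x_true.copy()
x_minus_i[i] = 0.0                             # zero out the missing entry

y_hat, x_i_hat = complete_one_missing(x_minus_i, i, mu, U)
print(y_hat, x_i_hat)
```

In this toy case the recovered values match the ones used to build the point ($y = 3$, $x_i = 2$). If $\boldsymbol{e}_i$ lies in the span of $U$, the system is rank deficient and the missing entry is not uniquely determined.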

## Incomplete PCA by Mean and Covariance Completion

Recall from Section 2.1.2 that the optimization problem associated with geometric PCA is
$$\min_{\boldsymbol{\mu}, U,\left\{\boldsymbol{y}_{j}\right\}} \sum_{j=1}^{N}\left\|\boldsymbol{x}_{j}-\boldsymbol{\mu}-U \boldsymbol{y}_{j}\right\|^{2} \quad \text{s.t.} \quad U^{\top} U=I_{d} \ \text{ and } \ \sum_{j=1}^{N} \boldsymbol{y}_{j}=\mathbf{0}.$$
We already know that the solution to this problem can be obtained from the mean and covariance of the data points,
$$\hat{\boldsymbol{\mu}}_{N}=\frac{1}{N} \sum_{j=1}^{N} \boldsymbol{x}_{j} \quad \text{and} \quad \hat{\Sigma}_{N}=\frac{1}{N} \sum_{j=1}^{N}\left(\boldsymbol{x}_{j}-\hat{\boldsymbol{\mu}}_{N}\right)\left(\boldsymbol{x}_{j}-\hat{\boldsymbol{\mu}}_{N}\right)^{\top}, \tag{3.11}$$
respectively. Specifically, $\boldsymbol{\mu}$ is given by the sample mean $\hat{\boldsymbol{\mu}}_{N}$, $U$ is given by the top $d$ eigenvectors of the covariance matrix $\hat{\Sigma}_{N}$, and $\boldsymbol{y}_{j}=U^{\top}\left(\boldsymbol{x}_{j}-\boldsymbol{\mu}\right)$. Alternatively, an optimal solution can be found from the rank-$d$ SVD of the mean-subtracted data matrix $\left[\boldsymbol{x}_{1}-\hat{\boldsymbol{\mu}}_{N}, \ldots, \boldsymbol{x}_{N}-\hat{\boldsymbol{\mu}}_{N}\right]$, as shown in Theorem 2.3.
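
For complete data, the two routes described above (eigenvectors of the sample covariance, or the SVD of the mean-subtracted data matrix) can be compared with a short NumPy sketch. The function names and the synthetic data are assumptions made for illustration. Because a basis is determined only up to sign and rotation within the subspace, the sketch compares the projectors $UU^{\top}$ rather than the bases themselves.

```python
import numpy as np

def pca_from_covariance(X, d):
    """PCA of a D x N matrix X (columns are data points) via the sample covariance."""
    mu = X.mean(axis=1)                        # sample mean
    Xc = X - mu[:, None]                       # mean-subtracted data
    Sigma = Xc @ Xc.T / X.shape[1]             # sample covariance with 1/N scaling
    _, evecs = np.linalg.eigh(Sigma)           # eigenvalues come in ascending order
    U = evecs[:, ::-1][:, :d]                  # eigenvectors of the d largest eigenvalues
    Y = U.T @ Xc                               # y_j = U^T (x_j - mu)
    return mu, U, Y

def pca_from_svd(X, d):
    """Same estimate from the rank-d SVD of the mean-subtracted data matrix."""
    mu = X.mean(axis=1)
    Uf, s, Vt = np.linalg.svd(X - mu[:, None], full_matrices=False)
    U = Uf[:, :d]                              # top-d left singular vectors
    Y = s[:d, None] * Vt[:d, :]                # equals U^T (X - mu)
    return mu, U, Y

# Synthetic data (an assumption for this example): points near a 2-D affine subspace of R^5.
rng = np.random.default_rng(0)
D, d, N = 5, 2, 200
U_true, _ = np.linalg.qr(rng.standard_normal((D, d)))
Y_true = np.array([[3.0], [1.0]]) * rng.standard_normal((d, N))
X = rng.standard_normal((D, 1)) + U_true @ Y_true + 0.01 * rng.standard_normal((D, N))

mu1, U1, Y1 = pca_from_covariance(X, d)
mu2, U2, Y2 = pca_from_svd(X, d)
P1, P2 = U1 @ U1.T, U2 @ U2.T                  # projectors onto the estimated subspaces
print(np.abs(P1 - P2).max())
```

The SVD route avoids forming the $D \times D$ covariance matrix explicitly, which is usually the numerically safer choice; both satisfy the constraints $U^{\top}U = I_d$ and $\sum_j \boldsymbol{y}_j = \mathbf{0}$ up to rounding.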

When some entries of each $\boldsymbol{x}_{j}$ are missing, we cannot directly compute $\hat{\boldsymbol{\mu}}_{N}$ or $\hat{\Sigma}_{N}$ as in (3.11). A straightforward method for dealing with missing entries was introduced in (Jolliffe 2002): compute the sample mean and covariance using only the known entries of the data matrix $X=\left[\boldsymbol{x}_{1}, \ldots, \boldsymbol{x}_{N}\right]$. Specifically, the entries of the incomplete mean and covariance can be computed as
$$\hat{\mu}_{i}=\frac{\sum_{j=1}^{N} w_{i j} x_{i j}}{\sum_{j=1}^{N} w_{i j}} \quad \text{and} \quad \hat{\sigma}_{i k}=\frac{\sum_{j=1}^{N} w_{i j} w_{k j}\left(x_{i j}-\hat{\mu}_{i}\right)\left(x_{k j}-\hat{\mu}_{k}\right)}{\sum_{j=1}^{N} w_{i j} w_{k j}}, \tag{3.12}$$
where $i, k=1, \ldots, D$, $x_{ij}$ is the $i$th entry of $\boldsymbol{x}_{j}$, and $w_{ij}=1$ if $x_{ij}$ is known and $w_{ij}=0$ otherwise. However, as discussed in (Jolliffe 2002), this simple approach has several disadvantages. First, the estimated covariance matrix need not be positive semidefinite. Second, these estimates are not obtained by optimizing any statistically or geometrically meaningful objective function (least squares, maximum likelihood, etc.). Nonetheless, the estimates $\hat{\boldsymbol{\mu}}_{N}$ and $\hat{\Sigma}_{N}$ obtained from the naive approach in (3.12) may be used to initialize the methods discussed in the next two sections, which are iterative in nature. For example, we may initialize the columns of $U$ as the eigenvectors of $\hat{\Sigma}_{N}$ associated with its $d$ largest eigenvalues. Then, given $\hat{\boldsymbol{\mu}}_{N}$ and $\hat{U}$, we can complete each missing entry as described in (3.6).
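
A minimal NumPy sketch of the naive estimates above, assuming that missing entries are marked by a 0/1 mask `W` with the same shape as the data matrix (the function name `incomplete_mean_cov` and the toy data are hypothetical). The toy data are chosen so that the resulting covariance estimate has a negative eigenvalue, which illustrates the first disadvantage mentioned above.

```python
import numpy as np

def incomplete_mean_cov(X, W):
    """Entrywise mean and covariance from observed entries only.

    X : D x N data matrix; values at missing positions are ignored.
    W : D x N mask, 1 where x_ij is observed and 0 where it is missing.
    Assumes every row is observed at least once and every pair of rows is
    jointly observed at least once; otherwise a division by zero occurs.
    """
    W = W.astype(float)
    Xz = np.where(W > 0, X, 0.0)               # replace missing values by zero
    mu = Xz.sum(axis=1) / W.sum(axis=1)        # mean over observed entries of each row
    Xc = W * (Xz - mu[:, None])                # centered data, zero where missing
    Sigma = (Xc @ Xc.T) / (W @ W.T)            # divide by the number of co-observed columns
    return mu, Sigma

# Toy data: the second coordinate is observed in the first two columns only.
X = np.array([[-1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
              [-1.0, 1.0, np.nan, np.nan, np.nan, np.nan]])
W = ~np.isnan(X)

mu_hat, Sigma_hat = incomplete_mean_cov(X, W)
eigvals = np.linalg.eigvalsh(Sigma_hat)        # ascending eigenvalues
print(mu_hat)
print(Sigma_hat)
print(eigvals)
```

Here $\hat{\sigma}_{11}=1/3$, $\hat{\sigma}_{22}=1$, and $\hat{\sigma}_{12}=1$, so the determinant is $-2/3$ and the estimate is indefinite. Each entry is normalized by a different number of co-observed points, which is why positive semidefiniteness can be lost.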

