October 10, 2022

One learns early on when dealing with matrices that in general they do not commute; that is, typically $A B \neq B A$. Sometimes, though, one does encounter commuting matrices, for example when they are matrix representations of partial derivatives with respect to different variables on a vector space of “nice” functions. It is of interest to relate such commuting matrices to one another. We focus on the case when one of the matrices is nonderogatory.

We call a matrix nonderogatory if it has only a single Jordan block associated with each eigenvalue. The following result is easily proven.

Proposition 4.6.1 Let $A \in \mathbb{F}^{n \times n}$. The following are equivalent.
(i) $A$ is nonderogatory.
(ii) $w_1(A, \lambda)=\operatorname{dim} \operatorname{Ker}\left(A-\lambda I_n\right)=1$ for every eigenvalue $\lambda$ of $A$.
(iii) $m_A(t)=p_A(t)$.
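Condition (ii) is the easiest of the three to test by hand or by machine. The following is a minimal sketch, assuming the eigenvalues are already known and rational so that exact arithmetic can be used; the helper names `rank`, `geometric_multiplicity`, and `is_nonderogatory` are illustrative and not part of the text.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def geometric_multiplicity(A, lam):
    """dim Ker(A - lam*I), computed as n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

def is_nonderogatory(A, eigenvalues):
    """Condition (ii): every eigenvalue has a one-dimensional eigenspace."""
    return all(geometric_multiplicity(A, lam) == 1 for lam in eigenvalues)

J = [[1, 1], [0, 1]]   # a single 2x2 Jordan block at eigenvalue 1
I2 = [[1, 0], [0, 1]]  # two 1x1 Jordan blocks at eigenvalue 1
print(is_nonderogatory(J, [1]))   # True
print(is_nonderogatory(I2, [1]))  # False
```

The identity matrix fails the test because its eigenspace for the eigenvalue $1$ is two-dimensional.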
We say that matrices $A$ and $B$ commute if $A B=B A$. The main result of this section is the following.
Theorem 4.6.2 Let $A \in \mathbb{F}^{n \times n}$ be nonderogatory with $p_A(\lambda)=\prod_{i=1}^m\left(\lambda-\lambda_i\right)^{n_i}$ with $\lambda_1, \ldots, \lambda_m \in \mathbb{F}$ all different. Then $B \in \mathbb{F}^{n \times n}$ commutes with $A$ if and only if there exists a polynomial $p(X) \in \mathbb{F}[X]$ so that $B=p(A)$. In that case, one can always choose $p(X)$ to have degree $\leq n-1$.
When $A$ is not nonderogatory, there is no guarantee that commuting matrices have to be of the form $p(A)$, as the following example shows.
Example 4.6.3 Let $\mathbb{F}=\mathbb{R}$, $A=\left(\begin{array}{ll}1 & 0 \\ 0 & 1\end{array}\right)$, and $B=\left(\begin{array}{ll}1 & 2 \\ 0 & 3\end{array}\right)$. Clearly $A B=B A$. If $p(X)$ is some polynomial, then $p(A)=\left(\begin{array}{cc}p(1) & 0 \\ 0 & p(1)\end{array}\right)$, which never equals $B$.
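Both Theorem 4.6.2 and Example 4.6.3 can be checked on small matrices by solving the linear system $B=c_0 I_n+c_1 A+\cdots+c_{n-1} A^{n-1}$ for the coefficients $c_k$. The sketch below is one way to do this in exact rational arithmetic; the function name `polynomial_in` and the test matrices other than those of Example 4.6.3 are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def polynomial_in(A, B):
    """Return [c_0, ..., c_{n-1}] with B = sum_k c_k A^k, or None if
    no polynomial of degree <= n-1 in A equals B."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    columns = []  # flattened I, A, ..., A^{n-1}
    for _ in range(n):
        columns.append([x for row in power for x in row])
        power = matmul(power, A)
    target = [Fraction(x) for row in B for x in row]
    # One equation per matrix entry: n*n equations in n unknowns.
    rows = [[col[e] for col in columns] + [target[e]] for e in range(n * n)]
    pivots, r = [], 0
    for c in range(n):  # Gauss-Jordan elimination
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # A leftover row reading 0 = nonzero means the system is inconsistent.
    if any(row[-1] != 0 for row in rows[r:]):
        return None
    coeffs = [Fraction(0)] * n  # free unknowns are set to zero
    for i, c in enumerate(pivots):
        coeffs[c] = rows[i][-1]
    return coeffs

# Nonderogatory A: the commuting matrix B equals I + 2A.
print(polynomial_in([[1, 1], [0, 1]], [[3, 2], [0, 3]]) == [1, 2])  # True
# Example 4.6.3: A = I is not nonderogatory, and B is not a polynomial in A.
print(polynomial_in([[1, 0], [0, 1]], [[1, 2], [0, 3]]))            # None
```

Restricting the search to degree at most $n-1$ loses nothing: by the Cayley-Hamilton theorem every power $A^k$ with $k \geq n$ is a linear combination of $I_n, A, \ldots, A^{n-1}$.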
We will need the following result.

One matrix function that is of particular interest is the resolvent. The resolvent of a matrix $A \in \mathbb{C}^{n \times n}$ is the function
$$R(\lambda):=\left(\lambda I_n-A\right)^{-1},$$
which is well-defined on $\mathbb{C} \backslash \sigma(A)$, where $\sigma(A)=\{z \in \mathbb{C}: z \text{ is an eigenvalue of } A\}$ is the spectrum of $A$. We have the following observation.

Proposition 4.9.1 Let $A \in \mathbb{C}^{n \times n}$ with minimal polynomial $m_A(t)=\prod_{j=1}^m\left(t-\lambda_j\right)^{n_j}$, and let $P_{j k}, j=1, \ldots, m, k=0, \ldots, n_j-1$, be as in Theorem 4.8.4. Then
$$R(\lambda)=\left(\lambda I_n-A\right)^{-1}=\sum_{j=1}^m \sum_{k=0}^{n_j-1} \frac{k !}{\left(\lambda-\lambda_j\right)^{k+1}} P_{j k} .$$
Proof. Fix $\lambda \in \mathbb{C} \backslash \sigma(A)$, and define $g(z)=\frac{1}{\lambda-z}$, which is well-defined and $k$ times differentiable for every $k \in \mathbb{N}$ on the domain $\mathbb{C} \backslash\{\lambda\}$. Notice that $g(A)=\left(\lambda I_n-A\right)^{-1}=R(\lambda)$. Also observe that
$$g^{\prime}(t)=\frac{1}{(\lambda-t)^2}, \quad g^{\prime \prime}(t)=\frac{2}{(\lambda-t)^3}, \quad \ldots, \quad g^{(k)}(t)=\frac{k !}{(\lambda-t)^{k+1}} .$$
Thus, by Theorem 4.8.4,
$$R(\lambda)=g(A)=\sum_{j=1}^m \sum_{k=0}^{n_j-1} g^{(k)}\left(\lambda_j\right) P_{j k}=\sum_{j=1}^m \sum_{k=0}^{n_j-1} \frac{k !}{\left(\lambda-\lambda_j\right)^{k+1}} P_{j k} .$$
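The formula can be sanity-checked on a single Jordan block $A=\lambda_1 I_n+N$, where $N$ is the nilpotent shift. The sketch below assumes that in this case $P_{1 k}=N^k / k !$, which is what the standard formula $f(A)=\sum_k f^{(k)}(\lambda_1) N^k / k !$ for a Jordan block suggests; Theorem 4.8.4 is not reproduced here, so this identification is an assumption. Under it the resolvent reduces to $\sum_{k=0}^{n-1} N^k /\left(\lambda-\lambda_1\right)^{k+1}$. The function names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(size):
    return [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]

def jordan_block(eig, size):
    """eig on the diagonal, ones on the superdiagonal."""
    return [[Fraction(eig) if i == j else Fraction(int(j == i + 1))
             for j in range(size)] for i in range(size)]

def resolvent_jordan_block(eig, size, lam):
    """Sum over k of N^k / (lam - eig)^(k+1), assuming P_{1k} = N^k / k!."""
    N = [[Fraction(int(j == i + 1)) for j in range(size)] for i in range(size)]
    R = [[Fraction(0)] * size for _ in range(size)]
    power = identity(size)  # holds N^k, starting with N^0
    for k in range(size):
        scale = 1 / Fraction(lam - eig) ** (k + 1)
        R = [[R[i][j] + scale * power[i][j] for j in range(size)]
             for i in range(size)]
        power = matmul(power, N)
    return R

eig, size, lam = 2, 3, 5  # lam must not be an eigenvalue of A
A = jordan_block(eig, size)
R = resolvent_jordan_block(eig, size, lam)
shifted = [[(lam if i == j else 0) - A[i][j] for j in range(size)]
           for i in range(size)]
# (lam*I - A) R should be the identity if the formula is right.
print(matmul(shifted, R) == identity(size))  # True
```

Because the arithmetic is exact, the comparison with the identity matrix is an equality test rather than a tolerance check.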
If we make use of a fundamental complex analysis result, Cauchy’s integral formula, we can develop an integral formula for $f(A)$ that is used, for instance, in analyzing differential operators. Let us start by stating Cauchy’s result. A function $f$ of a complex variable is called analytic on an open set $D \subseteq \mathbb{C}$ if $f$ is continuously differentiable at every point $z \in D$.

