Characteristic Polynomial

NoteQuestion

How do we find the eigenvalues of a matrix?

We will work with a matrix \(\displaystyle \mathbf{A}\) of shape \(\displaystyle n\times n\).

Necessary condition

If \(\mathbf{x}\) is an eigenvector of a matrix \(\mathbf{A}\) with eigenvalue \(\lambda\), then we have:

\[ \mathbf{Ax} =\lambda \mathbf{x} \]

Let us rearrange this a bit. Observe that if \(\mathbf{I}\) is the identity matrix, then:

\[ \mathbf{Ix} =\mathbf{x} \]

Using this fact, we can express the eigenvalue equation as:

\[ \begin{aligned} \mathbf{Ax} & =\lambda \mathbf{Ix}\\ & \\ \mathbf{Ax} -\lambda \mathbf{Ix} & =\mathbf{0}\\ & \\ (\mathbf{A} -\lambda \mathbf{I} )\mathbf{x} & =0 \end{aligned} \]

If \((\lambda ,\mathbf{x} )\) is an eigenpair, then \((\mathbf{A} -\lambda \mathbf{I} )\mathbf{x} =0\). Another way of expressing this is:

ImportantNecessary condition

\((\mathbf{A} -\lambda \mathbf{I} )\mathbf{x} =0\) is a necessary condition for \(\mathbf{x}\) to be an eigenvector of \(\mathbf{A}\) with eigenvalue \(\lambda\).

Sufficient condition

Let us try the other direction. What if \((\mathbf{A} -\lambda \mathbf{I} )\mathbf{x} =\mathbf{0}\)? We have:

\[ \begin{aligned} (\mathbf{A} -\lambda \mathbf{I} )\mathbf{x} & =\mathbf{0}\\ & \\ \mathbf{Ax} -\lambda \mathbf{Ix} & =\mathbf{0}\\ & \\ \mathbf{Ax} & =\lambda \mathbf{x} \end{aligned} \]

Thus we see that \(\mathbf{x}\) is an eigenvector of \(\mathbf{A}\) with eigenvalue \(\lambda\).

ImportantSufficient condition

\((\mathbf{A} -\lambda \mathbf{I} )\mathbf{x} =\mathbf{0}\) is a sufficient condition for \(\mathbf{x}\) to be an eigenvector of \(\mathbf{A}\) with eigenvalue \(\lambda\).

Nullspace argument

Thus we see that \((\lambda ,\mathbf{x} )\) is an eigenpair of \(\mathbf{A}\) if and only if \((\mathbf{A} -\lambda \mathbf{I} )\mathbf{x} =\mathbf{0}\). From this, we deduce that \(\mathbf{x}\) has to be in the nullspace of \(\mathbf{A} -\lambda \mathbf{I}\). Since \(\displaystyle \mathbf{x}\) is an eigenvector, \(\displaystyle \mathbf{x} \neq 0\). Thus the nullspace of \(\mathbf{A} -\lambda \mathbf{I}\) is non-trivial, meaning, it has at least one-nonzero vector in it, namely \(\displaystyle \mathbf{x}\). Putting it differently, \(\displaystyle (\mathbf{A} -\lambda \mathbf{I})\mathbf{x} =\mathbf{0}\) has at least one non-trivial solution. It follows that \(\displaystyle \mathbf{A} -\lambda \mathbf{I}\) cannot be invertible. Had \(\displaystyle \mathbf{A} -\lambda \mathbf{I}\) been invertible, then it would have had only one solution, which is the zero vector. But we have just demonstrated that there are non-zero solutions. Finally, if \(\displaystyle \mathbf{A} -\lambda \mathbf{I}\) is not invertible, then its determinant is zero. So, we have the following result:

ImportantNullspace argument

\(\lambda\) is an eigenvalue of \(\mathbf{A}\) if and only if \(|\mathbf{A} -\lambda \mathbf{I} |=0\)

Notice how we end up with a result that doesn’t involve the eigenvector and only the matrix and its eigenvalues. We also note the following way of expressing this:

NoteRemark

The eigenspace corresponding to an eigenvalue \(\lambda\) is the nullspace of \(\mathbf{A} -\lambda \mathbf{I}\). That is \(\displaystyle E( \lambda ) =\mathcal{N}(\mathbf{A} -\lambda \mathbf{I})\).

The Polynomial

At this stage, let us take up an example:

\[ \mathbf{A} =\begin{bmatrix} 3 & 1\\ 0 & 2 \end{bmatrix} \]

For \(\lambda\) to be an eigenvalue of the matrix \(\mathbf{A}\), \(|\mathbf{A} -\lambda \mathbf{I} |=0\). This translates to:

\[ \begin{vmatrix} 3-\lambda & 1\\ 0 & 2-\lambda \end{vmatrix} =0 \]

Expanding the determinant, we get:

\[ \begin{aligned} (3-\lambda )(2-\lambda )=0 \end{aligned} \]

The expression on the LHS is a polynomial of degree \(2\) and is called the characteristic polynomial of \(\mathbf{A}\). We see that \(\lambda =2,3\) are the eigenvalues of the matrix \(\mathbf{A}\). Let us take up a slightly more complex example involving a \(\displaystyle 3\times 3\) matrix.

\[ \mathbf{A} =\begin{bmatrix} 1 & 3 & 3\\ 3 & 1 & -1\\ 0 & 0 & 2 \end{bmatrix} \]

We now find the characteristic polynomial:

\[ \begin{aligned} |\mathbf{A} -\lambda \mathbf{I} | & =\begin{vmatrix} 1-\lambda & 3 & 3\\ 3 & 1-\lambda & -1\\ 0 & 0 & 2-\lambda \end{vmatrix}\\ & \\ & =( 1-\lambda )[( 1-\lambda )( 2-\lambda )] -3[ 3( 2-\lambda )]\\ & \\ & =( 2-\lambda )\left( 1+\lambda ^{2} -2\lambda -9\right)\\ & \\ & =( 2-\lambda )\left( \lambda ^{2} -2\lambda -8\right)\\ & \\ & =( 2-\lambda )( \lambda +2)( \lambda -4) \end{aligned} \]

The characteristic polynomial has degree \(\displaystyle 3\) in this case and its roots are \(\displaystyle -2,2,5\). Therefore, the eigenvalues of \(\displaystyle \mathbf{A}\) are \(\displaystyle -2,2,4\).

Properties

In general, for an \(n\times n\) matrix \(\mathbf{A}\), the characteristic polynomial has degree \(n\). The roots of the characteristic polynomial are the eigenvalues of the matrix \(\mathbf{A}\). There are two properties of the polynomial that are worth knowing:

  • the sum of the eigenvalues of the matrix is equal to its trace
  • the product of the eigenvalues of the matrix is equal to its determinant

The second property is easy to see. Assume that the polynomial splits into \(n\) factors corresponding to the \(n\) eigenvalues \(\lambda _{1} ,\cdots ,\lambda _{n}\), then:

\[ |\mathbf{A} -\lambda \mathbf{I} |=(\lambda _{1} -\lambda )\cdots (\lambda _{n} -\lambda ) \]

If we set \(\lambda =0\), then:

\[ |\mathbf{A} |=\lambda _{1} \cdots \lambda _{n} \]

The property concerning the trace requires a bit more work and the proof of the same will be discussed in the appendix.

Summary

To find the eigenvalues of a matrix \(\mathbf{A}\), we first compute the characteristic polynomial \(|\mathbf{A} -\lambda \mathbf{I} |\). The roots of this polynomial are the eigenvalues of the matrix \(\mathbf{A}\). There are two important properties:

  • sum of the eigenvalues is equal to the trace of the matrix

  • product of the eigenvalues is equal to the determinant of the matrix