
Section 1.4 Invariant Subspaces

Subsection 1.4.1 Invariant Subspaces

Definition 1.4.1. Invariant Subspace.

Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation and \(W\) is a subspace of \(V\text{.}\) Suppose further that \(\lteval{T}{\vect{w}}\in W\) for every \(\vect{w}\in W\text{.}\) Then \(W\) is an invariant subspace of \(V\) relative to \(T\text{.}\)

We do not have any special notation for an invariant subspace, so it is important to recognize that an invariant subspace is always relative to both a superspace (\(V\)) and a linear transformation (\(T\)), which will sometimes not be mentioned, yet will be clear from the context. Note also that the linear transformation involved must have the same domain and codomain; the definition would not make much sense if our outputs were not of the same type as our inputs.

As is our habit, we begin with an example that demonstrates the existence of invariant subspaces, while leaving other questions unanswered for the moment. We will return later to understand how this example was constructed, but for now, just understand how we check the existence of the invariant subspaces.

Consider the linear transformation \(\ltdefn{T}{\complex{4}}{\complex{4}}\) defined by \(\lteval{T}{\vect{x}}=A\vect{x}\) where \(A\) is given by

\begin{equation*} A=\begin{bmatrix} -8 & 6 & -15 & 9 \\ -8 & 14 & -10 & 18 \\ 1 & 1 & 3 & 0 \\ 3 & -8 & 2 & -11 \end{bmatrix} \end{equation*}

Define (with zero motivation),

\begin{align*} \vect{w}_1&=\colvector{-7\\-2\\3\\0} & \vect{w}_2&=\colvector{-1\\-2\\0\\1} \end{align*}

and set \(W=\spn{\set{\vect{w}_1,\,\vect{w}_2}}\text{.}\) We verify that \(W\) is an invariant subspace of \(\complex{4}\) with respect to \(T\text{.}\) By the definition of \(W\text{,}\) any vector chosen from \(W\) can be written as a linear combination of \(\vect{w}_1\) and \(\vect{w}_2\text{.}\) Suppose that \(\vect{w}\in W\text{,}\) and then check the details of the following verification,

\begin{align*} \lteval{T}{\vect{w}}&=\lteval{T}{a_1\vect{w}_1+a_2\vect{w}_2}&&\\ &=a_1\lteval{T}{\vect{w}_1}+a_2\lteval{T}{\vect{w}_2}&&\\ &=a_1\colvector{-1\\-2\\0\\1}+a_2\colvector{5\\-2\\-3\\2}\\ &=a_1\vect{w}_2+a_2\left((-1)\vect{w}_1+2\vect{w}_2\right)\\ &=(-a_2)\vect{w}_1+(a_1+2a_2)\vect{w}_2\in W \end{align*}

So, by Definition 1.4.1, \(W\) is an invariant subspace of \(\complex{4}\) relative to \(T\text{.}\)
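This membership check is easy to automate. Below is a minimal Sage sketch of the verification (the names A, w1, w2, W are our own choices for this illustration, and we work over the rationals as a stand-in for \(\complex{4}\) since every entry is rational); the identical check works for the subspace \(X\) constructed next.

A = matrix(QQ, [[-8, 6, -15, 9], [-8, 14, -10, 18], [1, 1, 3, 0], [3, -8, 2, -11]])
w1 = vector(QQ, [-7, -2, 3, 0])
w2 = vector(QQ, [-1, -2, 0, 1])
W = (QQ^4).span_of_basis([w1, w2])            # the subspace W, with basis {w1, w2}
all(A*w in W for w in [w1, w2, 3*w1 - 5*w2])  # True: images of elements of W stay in W
W.coordinate_vector(A*w2)                     # (-1, 2), the coefficients found above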

In an entirely similar manner we construct another invariant subspace, again relative to \(T\text{.}\) With zero motivation, define

\begin{align*} \vect{x}_1&=\colvector{-3\\-1\\1\\0} & \vect{x}_2&=\colvector{0\\-1\\0\\1} \end{align*}

and set \(X=\spn{\set{\vect{x}_1,\,\vect{x}_2}}\text{.}\) We verify that \(X\) is an invariant subspace of \(\complex{4}\) with respect to \(T\text{.}\) By the definition of \(X\text{,}\) any vector chosen from \(X\) can be written as a linear combination of \(\vect{x}_1\) and \(\vect{x}_2\text{.}\) Suppose that \(\vect{x}\in X\text{,}\) and then check the details of the following verification,

\begin{align*} \lteval{T}{\vect{x}}&=\lteval{T}{b_1\vect{x}_1+b_2\vect{x}_2}&&\\ &=b_1\lteval{T}{\vect{x}_1}+b_2\lteval{T}{\vect{x}_2}&&\\ &=b_1\colvector{3\\0\\-1\\1}+b_2\colvector{3\\4\\-1\\-3}\\ &=b_1\left((-1)\vect{x}_1+\vect{x}_2\right)+b_2\left((-1)\vect{x}_1+(-3)\vect{x}_2\right)\\ &=(-b_1-b_2)\vect{x}_1+(b_1-3b_2)\vect{x}_2\in X \end{align*}

So, by Definition 1.4.1, \(X\) is an invariant subspace of \(\complex{4}\) relative to \(T\text{.}\)

There is a bit of magic in each of these verifications where the two outputs of \(T\) happen to equal linear combinations of the two inputs. But this is the essential nature of an invariant subspace. We'll have a peek under the hood later in Example 3.1.8, and it will not look so magical after all.

Verify that \(B=\set{\vect{w}_1,\,\vect{w}_2,\,\vect{x}_1,\,\vect{x}_2}\) is linearly independent, and hence a basis of \(\complex{4}\text{.}\) Splitting this basis in half, Theorem 1.2.3 tells us that \(\complex{4}=W\ds X\text{.}\) To see exactly why a decomposition of a vector space into a direct sum of invariant subspaces might be interesting, work Checkpoint 1.4.3 now.

Construct a matrix representation of the linear transformation \(T\) of Example 1.4.2 relative to the basis formed as the union of the bases of the two invariant subspaces, \(\matrixrep{T}{B}{B}\text{.}\) Comment on your observations, perhaps after computing a few powers of the matrix representation (which represent repeated compositions of \(T\) with itself). Hmmmmmm.

Solution

Our basis is

\begin{equation*} B=\set{\vect{w}_1,\,\vect{w}_2,\,\vect{x}_1,\,\vect{x}_2} =\set{ \colvector{-7\\-2\\3\\0},\, \colvector{-1\\-2\\0\\1},\, \colvector{-3\\-1\\1\\0},\, \colvector{0\\-1\\0\\1} } \end{equation*}

Now we perform the necessary computations for the matrix representation of \(T\) relative to \(B\text{.}\)

\begin{align*} \vectrep{B}{\lteval{T}{\vect{w}_1}} &=\vectrep{B}{\colvector{-1\\-2\\0\\1}} =\vectrep{B}{(0)\vect{w}_1+(1)\vect{w}_2} =\colvector{0\\1\\0\\0}\\ \vectrep{B}{\lteval{T}{\vect{w}_2}} &=\vectrep{B}{\colvector{5\\-2\\-3\\2}} =\vectrep{B}{(-1)\vect{w}_1+(2)\vect{w}_2} =\colvector{-1\\2\\0\\0}\\ \vectrep{B}{\lteval{T}{\vect{x}_1}} &=\vectrep{B}{\colvector{3\\0\\-1\\1}} =\vectrep{B}{(-1)\vect{x}_1+(1)\vect{x}_2} =\colvector{0\\0\\-1\\1}\\ \vectrep{B}{\lteval{T}{\vect{x}_2}} &=\vectrep{B}{\colvector{3\\4\\-1\\-3}} =\vectrep{B}{(-1)\vect{x}_1+(-3)\vect{x}_2} =\colvector{0\\0\\-1\\-3} \end{align*}

Applying Definition MR, we have

\begin{equation*} \matrixrep{T}{B}{B}= \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 2 & 0 & 0 \\ 0 & 0 & -1 & -1 \\ 0 & 0 & 1 & -3 \end{bmatrix} \end{equation*}

The interesting feature of this representation is the two \(2\times 2\) blocks on the diagonal that arise from the decomposition of \(\complex{4}\) into a direct sum of invariant subspaces. Or maybe the interesting feature of this matrix is the two \(2\times 2\) submatrices in the “other” corners that are all zero. You can decide.
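The computations in this solution are easy to reproduce by machine. Here is a minimal Sage sketch (our own variable names; the rationals stand in for the complex entries) that confirms \(B\) is a basis and produces the block-diagonal representation as a change of basis.

A = matrix(QQ, [[-8, 6, -15, 9], [-8, 14, -10, 18], [1, 1, 3, 0], [3, -8, 2, -11]])
w1, w2 = vector(QQ, [-7, -2, 3, 0]), vector(QQ, [-1, -2, 0, 1])
x1, x2 = vector(QQ, [-3, -1, 1, 0]), vector(QQ, [0, -1, 0, 1])
S = column_matrix(QQ, [w1, w2, x1, x2])  # columns are the vectors of the basis B
S.is_invertible()                        # True, so B is linearly independent, hence a basis
M = S.inverse() * A * S                  # the representation of T relative to B
M, M^3                                   # block diagonal, and powers remain block diagonal

The two \(2\times 2\) diagonal blocks of M are exactly the representations of the restrictions of \(T\) to \(W\) and to \(X\text{,}\) a connection made explicit in Subsection 1.4.2.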

Prove that the subspaces \(U, V\subseteq\complex{5}\) are invariant with respect to the linear transformation \(\ltdefn{R}{\complex{5}}{\complex{5}}\) defined by \(\lteval{R}{\vect{x}}=B\vect{x}\text{.}\)

\begin{equation*} B=\begin{bmatrix} 4 & 47 & 3 & -46 & 20 \\ 10 & 61 & 8 & -56 & 10 \\ -10 & -69 & -7 & 67 & -20 \\ 11 & 70 & 9 & -64 & 12 \\ 3 & 19 & 3 & -16 & 1\end{bmatrix} \end{equation*}
\begin{align*} U&=\spn{\set{ \colvector{1\\0\\-1\\0\\0},\, \colvector{0\\1\\-1\\1\\0} }} & V&=\spn{\set{ \colvector{1\\1\\-1\\1\\0},\, \colvector{3\\3\\-2\\4\\2},\, \colvector{-2\\3\\-2\\3\\1} }} \end{align*}

Prove that the union of the bases of \(U\) and \(V\) is a basis of \(\complex{5}\text{,}\) and then provide a matrix representation of \(R\) relative to this basis.
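If you would like to check your work with Sage before writing the proofs, a sketch along the following lines will do (the names Bmat, U, V are ours; we again use rational entries as a stand-in for \(\complex{5}\)).

Bmat = matrix(QQ, [[4, 47, 3, -46, 20], [10, 61, 8, -56, 10], [-10, -69, -7, 67, -20], [11, 70, 9, -64, 12], [3, 19, 3, -16, 1]])
U = (QQ^5).span([vector(QQ, [1, 0, -1, 0, 0]), vector(QQ, [0, 1, -1, 1, 0])])
V = (QQ^5).span([vector(QQ, [1, 1, -1, 1, 0]), vector(QQ, [3, 3, -2, 4, 2]), vector(QQ, [-2, 3, -2, 3, 1])])
all(Bmat*u in U for u in U.basis())  # True: R maps U back into U
all(Bmat*v in V for v in V.basis())  # True: R maps V back into V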

Example 1.4.2 and Checkpoint 1.4.4 are a bit mysterious at this stage. Do we know any other examples of invariant subspaces? Yes, as it turns out, we have already seen quite a few. We will give some specific examples, and for more general situations, describe broad classes of invariant subspaces by theorems. First up are eigenspaces.

Theorem 1.4.5. Eigenspaces are Invariant Subspaces.

Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation and \(W\) is a subspace of the eigenspace \(\eigenspace{T}{\lambda}\) for some eigenvalue \(\lambda\) of \(T\text{.}\) Then \(W\) is an invariant subspace of \(V\) relative to \(T\text{.}\)

Proof. Choose \(\vect{w}\in W\text{.}\) Then \(\vect{w}\) lies in the eigenspace \(\eigenspace{T}{\lambda}\text{,}\) and \(W\) is closed under scalar multiplication, so

\begin{equation*} \lteval{T}{\vect{w}}=\lambda\vect{w}\in W. \end{equation*}

So by Definition 1.4.1, \(W\) is an invariant subspace of \(V\) relative to \(T\text{.}\)

Theorem 1.4.5 is general enough to determine that an entire eigenspace is an invariant subspace, or that simply the span of a single eigenvector is an invariant subspace. It is not always the case that any subspace of an invariant subspace is again an invariant subspace, but eigenspaces do have this property. Here is an example of the theorem, which also allows us to very quickly build several invariant subspaces.

Define the linear transformation \(\ltdefn{S}{M_{22}}{M_{22}}\) by

\begin{equation*} \lteval{S}{\begin{bmatrix}a&b\\c&d\end{bmatrix}} = \begin{bmatrix} -2a + 19b - 33c + 21d & -3a + 16b - 24c + 15d \\ -2a + 9b - 13c + 9d & -a + 4b - 6c + 5d \end{bmatrix} \end{equation*}

Build a matrix representation of \(S\) relative to the standard basis (Definition MR) and compute eigenvalues and eigenspaces of \(S\) with the computational techniques of Chapter E in concert with Theorem EER. Then

\begin{align*} \eigenspace{S}{1}&=\spn{\set{\begin{bmatrix}4&3\\2&1\end{bmatrix}}} & \eigenspace{S}{2}&=\spn{\set{ \begin{bmatrix} 6& 3\\1&0\end{bmatrix},\, \begin{bmatrix}-9&-3\\0&1\end{bmatrix} }} \end{align*}

So by Theorem 1.4.5, both \(\eigenspace{S}{1}\) and \(\eigenspace{S}{2}\) are invariant subspaces of \(M_{22}\) relative to \(S\text{.}\)
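If you would like to see the suggested computation carried out, here is one way to do it in Sage. We coordinatize a \(2\times 2\) matrix by listing its entries row by row in a column vector of size 4, relative to the standard basis; the representation of \(S\) is then just the coefficient matrix of the defining formula above (the name MS is our own).

MS = matrix(QQ, [[-2, 19, -33, 21],
                 [-3, 16, -24, 15],
                 [-2,  9, -13,  9],
                 [-1,  4,  -6,  5]])  # representation of S relative to the standard basis
MS.eigenvalues()                      # the eigenvalues 1 and 2
MS.eigenspaces_right()                # eigenspaces; they span the same subspaces as E_S(1), E_S(2)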

However, Theorem 1.4.5 provides even more invariant subspaces. Since \(\eigenspace{S}{1}\) has dimension 1, it has no interesting subspaces; however, \(\eigenspace{S}{2}\) has dimension 2 and has a plethora of subspaces. For example, set

\begin{equation*} \vect{u}= 2\begin{bmatrix} 6& 3\\1&0\end{bmatrix}+ 3\begin{bmatrix}-9&-3\\0&1\end{bmatrix} = \begin{bmatrix}-15&-3\\2&3\end{bmatrix} \end{equation*}

and define \(U=\spn{\set{\vect{u}}}\text{.}\) Then since \(U\) is a subspace of \(\eigenspace{S}{2}\text{,}\) Theorem 1.4.5 says that \(U\) is an invariant subspace of \(M_{22}\) (or we could check this claim directly based simply on the fact that \(\vect{u}\) is an eigenvector of \(S\)).

For every linear transformation there are some obvious, trivial invariant subspaces. Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation. Then simply because \(T\) is a function, the subspace \(V\) is an invariant subspace of \(T\text{.}\) In only a minor twist on this theme, the range of \(T\text{,}\) \(\rng{T}\text{,}\) is an invariant subspace of \(T\) by Definition RLT. Finally, Theorem LTTZZ provides the justification for claiming that \(\set{\zerovector}\) is an invariant subspace of \(T\text{.}\)

That the trivial subspace is always an invariant subspace is a special case of the next theorem. But first, work the following straightforward exercise before reading the next theorem. We'll wait.

Prove that the kernel of a linear transformation (Definition KLT), \(\krn{T}\text{,}\) is an invariant subspace.

Theorem 1.4.8. Kernels of Powers are Invariant Subspaces.

Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation and \(k\) is a non-negative integer. Then \(\krn{T^k}\) is an invariant subspace of \(V\) relative to \(T\text{.}\)

Proof. Suppose that \(\vect{z}\in\krn{T^k}\text{.}\) Then

\begin{equation*} \lteval{T^k}{\lteval{T}{\vect{z}}}=\lteval{T^{k+1}}{\vect{z}}=\lteval{T}{\lteval{T^k}{\vect{z}}}=\lteval{T}{\zerovector}=\zerovector. \end{equation*}

So by Definition KLT, we see that \(\lteval{T}{\vect{z}}\in\krn{T^k}\text{.}\) Thus \(\krn{T^k}\) is an invariant subspace of \(V\) relative to \(T\) by Definition 1.4.1.

Two special cases of Theorem 1.4.8 occur when we choose \(k=0\) and \(k=1\text{.}\) The next example is unusual, but a good illustration.

Consider the \(10\times 10\) matrix \(A\) below as defining a linear transformation \(\ltdefn{T}{\complex{10}}{\complex{10}}\text{.}\) We also provide a Sage version of the matrix below.

\begin{equation*} A=\begin{bmatrix} -1 & -9 & -24 & -16 & -40 & -36 & 72 & 66 & 21 & 59 \\ 19 & 4 & 7 & -18 & 2 & -12 & 67 & 69 & 16 & -35 \\ -1 & 1 & 2 & 5 & 5 & 6 & -17 & -17 & -5 & -8 \\ 2 & -2 & -7 & -8 & -13 & -13 & 32 & 28 & 11 & 14 \\ -11 & -2 & -1 & 12 & 6 & 11 & -50 & -44 & -16 & 13 \\ 4 & -1 & -5 & -14 & -16 & -16 & 55 & 43 & 24 & 17 \\ -14 & 1 & 7 & 20 & 19 & 26 & -82 & -79 & -21 & -1 \\ 12 & 0 & -4 & -17 & -14 & -20 & 68 & 64 & 19 & -2 \\ 10 & -2 & -9 & -16 & -20 & -24 & 68 & 65 & 17 & 9 \\ -1 & -2 & -5 & -3 & -8 & -7 & 13 & 12 & 4 & 14 \end{bmatrix} \end{equation*}
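Here is the Sage version of the matrix, entered over the rationals.

A = matrix(QQ, [
    [ -1,  -9, -24, -16, -40, -36,  72,  66,  21,  59],
    [ 19,   4,   7, -18,   2, -12,  67,  69,  16, -35],
    [ -1,   1,   2,   5,   5,   6, -17, -17,  -5,  -8],
    [  2,  -2,  -7,  -8, -13, -13,  32,  28,  11,  14],
    [-11,  -2,  -1,  12,   6,  11, -50, -44, -16,  13],
    [  4,  -1,  -5, -14, -16, -16,  55,  43,  24,  17],
    [-14,   1,   7,  20,  19,  26, -82, -79, -21,  -1],
    [ 12,   0,  -4, -17, -14, -20,  68,  64,  19,  -2],
    [ 10,  -2,  -9, -16, -20, -24,  68,  65,  17,   9],
    [ -1,  -2,  -5,  -3,  -8,  -7,  13,  12,   4,  14]])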

The matrix \(A\) has rank \(9\) and so \(T\) has a nontrivial kernel. But it gets better. \(T\) has been constructed specially so that the nullity of \(T^k\) is exactly \(k\text{,}\) for \(0\leq k\leq 10\text{.}\) This is an extremely unusual situation, but it is a good illustration of the general behavior of kernels of powers of linear transformations. We compute the invariant subspace \(\krn{T^5}\text{;}\) you can practice with others.

We work with the matrix, recalling that null spaces and column spaces of matrices correspond to kernels and ranges of linear transformations once we understand matrix representations of linear transformations (Section MR).

\begin{align*} A^5&=\begin{bmatrix} 37 & 24 & 65 & -35 & 77 & 32 & 80 & 98 & 23 & -125 \\ 19 & 11 & 37 & -21 & 46 & 19 & 29 & 49 & 6 & -70 \\ -7 & -5 & -15 & 8 & -18 & -8 & -15 & -19 & -6 & 26 \\ 14 & 9 & 27 & -15 & 33 & 14 & 26 & 37 & 7 & -50 \\ -8 & -7 & -25 & 14 & -33 & -16 & -10 & -23 & -5 & 37 \\ 12 & 11 & 35 & -19 & 45 & 22 & 22 & 35 & 11 & -52 \\ -27 & -18 & -56 & 31 & -69 & -30 & -49 & -72 & -15 & 100 \\ 20 & 14 & 45 & -25 & 56 & 25 & 35 & 54 & 12 & -77 \\ 24 & 16 & 49 & -27 & 60 & 26 & 45 & 64 & 14 & -88 \\ 8 & 5 & 13 & -7 & 15 & 6 & 18 & 21 & 5 & -26 \end{bmatrix}\\ \krn{T^5}&=\nsp{A^5}=\spn{\set{ \colvector{1\\-1\\0\\0\\-1\\2\\0\\0\\0\\0},\, \colvector{-1\\-3\\-3\\-2\\2\\0\\1\\0\\0\\0},\, \colvector{-2\\-1\\0\\0\\0\\0\\0\\1\\0\\0},\, \colvector{1\\-1\\-4\\-2\\2\\0\\0\\0\\1\\0},\, \colvector{5\\-3\\2\\2\\1\\0\\0\\0\\0\\2} }} \end{align*}

As an illustration of \(\krn{T^5}\) as a subspace invariant under \(T\text{,}\) we form a linear combination of the five basis vectors (named \(\vect{z}_i\text{,}\) in order), which will then be an element of the invariant subspace. We apply \(T\text{,}\) so the output should again be in the subspace, which we verify by giving an expression for the output as a linear combination of the basis vectors for the subspace.

\begin{align*} \vect{z}&=3\vect{z}_1 - \vect{z}_2 + 3\vect{z}_3 + 2\vect{z}_4 - 2\vect{z}_5=\colvector{-10\\1\\-9\\-6\\-3\\6\\-1\\3\\2\\-4}\\ \lteval{T}{\vect{z}}&=A\vect{z}=\colvector{149\\93\\-28\\68\\-73\\94\\-136\\110\\116\\28}\\ &=47\vect{z}_1 - 136\vect{z}_2 + 110\vect{z}_3 +116\vect{z}_4 +14\vect{z}_5 \end{align*}
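These computations are easy to check with the Sage version of the matrix above (this sketch assumes A has already been entered as in that cell). Sage's echelonized basis for the kernel will differ from the basis displayed above, but it spans the same subspace, so the membership tests still apply.

K5 = (A^5).right_kernel()    # the invariant subspace ker(T^5)
K5.dimension()               # 5, matching the claim that the nullity of T^5 is 5
z = vector(QQ, [-10, 1, -9, -6, -3, 6, -1, 3, 2, -4])  # the element of ker(T^5) built above
z in K5                      # True
A*z in K5                    # True: applying T keeps us inside the invariant subspace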

Reprise Example 1.4.9 using the same linear transformation. Use a different power (not \(k=0,1,5,9,10\) on your first attempt), form a vector in the kernel of your chosen power, then apply \(T\) to it. Your output should be in the kernel. (Check this with Sage, using the Python in operator.) Thus, you should be able to write the output as a linear combination of the basis vectors. Rinse, repeat.

Subsection 1.4.2 Restrictions of Linear Transformations

A primary reason for our interest in invariant subspaces is that they provide us with another method for creating new linear transformations from old ones.

Definition 1.4.11. Linear Transformation Restriction.

Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation, and \(U\) is an invariant subspace of \(V\) relative to \(T\text{.}\) Define the restriction of \(T\) to \(U\) by

\begin{align*} &\ltdefn{\restrict{T}{U}}{U}{U} &\lteval{\restrict{T}{U}}{\vect{u}}&=\lteval{T}{\vect{u}} \end{align*}

It might appear that this definition has not accomplished anything, as \(\restrict{T}{U}\) would appear to take on exactly the same values as \(T\text{.}\) And this is true. However, \(\restrict{T}{U}\) differs from \(T\) in the choice of domain and codomain. We tend to give little attention to the domain and codomain of functions, while their defining rules get the spotlight. But the restriction of a linear transformation is all about the choice of domain and codomain. We are applying the same rule, but only to the vectors of a smaller subspace of the original domain. Notice the importance of only using this construction with an invariant subspace, since otherwise we cannot be assured that the outputs of the function are even contained in the codomain. This observation should be the key step in the proof of a theorem saying that \(\restrict{T}{U}\) is also a linear transformation, but we leave that as an exercise.

In Checkpoint 1.4.4 you verified that the subspaces \(U, V\subseteq\complex{5}\) are invariant with respect to the linear transformation \(\ltdefn{R}{\complex{5}}{\complex{5}}\) defined by \(\lteval{R}{\vect{x}}=B\vect{x}\text{.}\)

\begin{equation*} B=\begin{bmatrix} 4 & 47 & 3 & -46 & 20 \\ 10 & 61 & 8 & -56 & 10 \\ -10 & -69 & -7 & 67 & -20 \\ 11 & 70 & 9 & -64 & 12 \\ 3 & 19 & 3 & -16 & 1\end{bmatrix} \end{equation*}
\begin{align*} U&=\spn{\set{ \colvector{1\\0\\-1\\0\\0},\, \colvector{0\\1\\-1\\1\\0} }} & V&=\spn{\set{ \colvector{1\\1\\-1\\1\\0},\, \colvector{3\\3\\-2\\4\\2},\, \colvector{-2\\3\\-2\\3\\1} }} \end{align*}

It is a simple matter to define two new linear transformations, \(\restrict{R}{U}, \restrict{R}{V}\text{,}\)

\begin{align*} &\ltdefn{\restrict{R}{U}}{U}{U} & \lteval{\restrict{R}{U}}{\vect{x}}&=B\vect{x}\\ &\ltdefn{\restrict{R}{V}}{V}{V} & \lteval{\restrict{R}{V}}{\vect{x}}&=B\vect{x} \end{align*}

It might not look like we have accomplished much. Worse, it might appear that \(R=\restrict{R}{U}=\restrict{R}{V}\) since each is described by the same rule for computing the image of an input. The difference is that the domains are all different: \(\complex{5},U,V\text{.}\) Since \(U\) and \(V\) are invariant subspaces, we can then use these subspaces for the codomains of the restrictions.

We will frequently need the matrix representations of linear transformation restrictions, so let's compute those now for this example. Let

\begin{align*} C&=\set{\vect{u}_1,\,\vect{u}_2} & D&=\set{\vect{v}_1,\,\vect{v}_2,\,\vect{v}_3} \end{align*}

be the bases for \(U\) and \(V\text{,}\) respectively.

For \(U\)

\begin{align*} \vectrep{C}{\lteval{\restrict{R}{U}}{\vect{u}_1}} &=\vectrep{C}{\colvector{1\\2\\-3\\2\\0}} =\vectrep{C}{1\vect{u}_1+2\vect{u}_2} =\colvector{1\\2}\\ \vectrep{C}{\lteval{\restrict{R}{U}}{\vect{u}_2}} &=\vectrep{C}{\colvector{-2\\-3\\5\\-3\\0}} =\vectrep{C}{-2\vect{u}_1-3\vect{u}_2} =\colvector{-2\\-3} \end{align*}

Applying Definition MR, we have

\begin{equation*} \matrixrep{\restrict{R}{U}}{C}{C}= \begin{bmatrix} 1 & -2\\ 2 & -3 \end{bmatrix} \end{equation*}

For \(V\)

\begin{align*} \vectrep{D}{\lteval{\restrict{R}{V}}{\vect{v}_1}} &=\vectrep{D}{\colvector{2\\7\\-5\\8\\3}} =\vectrep{D}{\vect{v}_1+\vect{v}_2+\vect{v}_3} =\colvector{1\\1\\1}\\ \vectrep{D}{\lteval{\restrict{R}{V}}{\vect{v}_2}} &=\vectrep{D}{\colvector{3\\-7\\5\\-7\\-2}} =\vectrep{D}{-\vect{v}_1-2\vect{v}_3} =\colvector{-1\\0\\-2}\\ \vectrep{D}{\lteval{\restrict{R}{V}}{\vect{v}_3}} &=\vectrep{D}{\colvector{9\\-11\\8\\-10\\-2}} =\vectrep{D}{-2\vect{v}_1+\vect{v}_2-4\vect{v}_3} =\colvector{-2\\1\\-4} \end{align*}

Applying Definition MR, we have

\begin{equation*} \matrixrep{\restrict{R}{V}}{D}{D}= \begin{bmatrix} 1 & -1 & -2\\ 1 & 0 & 1\\ 1 & -2 & -4 \end{bmatrix} \end{equation*}

It is no accident that these two square matrices are the diagonal blocks of the matrix representation you built for \(R\) relative to the basis \(C\cup D\) in Checkpoint 1.4.4.
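These restriction representations can also be reproduced in Sage. The sketch below (our own names; rational entries again) coordinatizes the images of the basis vectors relative to the stated bases of \(U\) and \(V\text{,}\) exactly as in the hand computations, rather than relying on Sage's echelonized bases.

Bmat = matrix(QQ, [[4, 47, 3, -46, 20], [10, 61, 8, -56, 10], [-10, -69, -7, 67, -20], [11, 70, 9, -64, 12], [3, 19, 3, -16, 1]])
U = (QQ^5).span_of_basis([vector(QQ, [1, 0, -1, 0, 0]), vector(QQ, [0, 1, -1, 1, 0])])
V = (QQ^5).span_of_basis([vector(QQ, [1, 1, -1, 1, 0]), vector(QQ, [3, 3, -2, 4, 2]), vector(QQ, [-2, 3, -2, 3, 1])])
column_matrix(QQ, [U.coordinate_vector(Bmat*u) for u in U.basis()])  # the 2x2 representation above
column_matrix(QQ, [V.coordinate_vector(Bmat*v) for v in V.basis()])  # the 3x3 representation above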

The key observation of the previous example is worth stating very clearly: A basis derived from a direct sum decomposition into subspaces that are invariant relative to a linear transformation will provide a matrix representation of the linear transformation with a block diagonal form.

Diagonalizing a linear transformation is the most extreme example of decomposing a vector space into invariant subspaces. When a linear transformation is diagonalizable, then there is a basis composed of eigenvectors (Theorem DC). Each of these basis vectors can be used individually as the lone element of a basis for an invariant subspace (Theorem 1.4.5). So the domain decomposes into a direct sum of one-dimensional invariant subspaces via Theorem 1.2.3. The corresponding matrix representation is then block diagonal with all the blocks of size 1, i.e. the matrix is diagonal. Section [provisional cross-reference: section-jordan-canonical-form] is devoted to generalizing this extreme situation when there are not enough eigenvectors available to make such a complete decomposition and arrive at such an elegant matrix representation.

You can find more examples of invariant subspaces, linear transformation restrictions, and matrix representations in Section 3.1, [provisional cross-reference: section-nilpotent-linear-transformations], and [provisional cross-reference: section-jordan-canonical-form].