
Section 1.2 Direct Sums

Subsection 1.2.1 Direct Sums

Some of the more advanced ideas in linear algebra are closely related to decomposing (Proof Technique DC) vector spaces into direct sums of subspaces. A direct sum is a shorthand way to describe the relationship between a vector space and two or more of its subspaces. As we will use it, it is not a way to construct new vector spaces from others.

Definition 1.2.1. Direct Sum.

Suppose that \(V\) is a vector space with two subspaces \(U\) and \(W\) such that for every \(\vect{v}\in V\text{,}\)

  1. There exist vectors \(\vect{u}\in U\text{,}\) \(\vect{w}\in W\) such that \(\vect{v}=\vect{u}+\vect{w}\text{.}\)
  2. If \(\vect{v}=\vect{u}_1+\vect{w}_1\) and \(\vect{v}=\vect{u}_2+\vect{w}_2\) where \(\vect{u}_1,\,\vect{u}_2\in U\text{,}\) \(\vect{w}_1,\,\vect{w}_2\in W\) then \(\vect{u}_1=\vect{u}_2\) and \(\vect{w}_1=\vect{w}_2\text{.}\)

Then \(V\) is the direct sum of \(U\) and \(W\) and we write \(V=U\ds W\text{.}\)

Informally, when we say \(V\) is the direct sum of the subspaces \(U\) and \(W\text{,}\) we are saying that each vector of \(V\) can always be expressed as the sum of a vector from \(U\) and a vector from \(W\text{,}\) and this expression can only be accomplished in one way (i.e. uniquely). This statement should begin to feel something like our definitions of nonsingular matrices (Definition NM) and linear independence (Definition LI). It should not be hard to imagine the natural extension of this definition to the case of more than two subspaces. Could you provide a careful definition of \(V=U_1\ds U_2\ds U_3\ds \dots\ds U_m\) (Exercise PD.M50)?
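As a sketch only, with the careful version left to the exercise, such a definition would require that every \(\vect{v}\in V\) can be written as

\begin{equation*} \vect{v}=\vect{u}_1+\vect{u}_2+\vect{u}_3+\dots+\vect{u}_m\text{,}\qquad \vect{u}_i\in U_i\text{,}\ 1\leq i\leq m \end{equation*}

and that any two such expressions for the same vector agree term by term.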

In \(\complex{3}\text{,}\) define

\begin{align*} \vect{v}_1&=\colvector{3\\2\\5} & \vect{v}_2&=\colvector{-1\\2\\1} & \vect{v}_3&=\colvector{2\\1\\-2} \end{align*}

Then \(\complex{3}=\spn{\set{\vect{v}_1,\,\vect{v}_2}}\ds\spn{\set{\vect{v}_3}}\text{.}\) This statement derives from the fact that \(B=\set{\vect{v}_1,\,\vect{v}_2,\,\vect{v}_3}\) is a basis for \(\complex{3}\text{.}\) The spanning property of \(B\) yields the decomposition of any vector into a sum of vectors from the two subspaces, and the linear independence of \(B\) yields the uniqueness of the decomposition. We will illustrate these claims with a numerical example.

Choose \(\vect{v}=\colvector{10\\1\\6}\text{.}\) Then

\begin{equation*} \vect{v}=2\vect{v}_1+(-2)\vect{v}_2+1\vect{v}_3 =\left(2\vect{v}_1+(-2)\vect{v}_2\right)+\left(1\vect{v}_3\right) \end{equation*}

where we have added parentheses for emphasis. Obviously \(1\vect{v}_3\in\spn{\set{\vect{v}_3}}\text{,}\) while \(2\vect{v}_1+(-2)\vect{v}_2\in\spn{\set{\vect{v}_1,\,\vect{v}_2}}\text{.}\) Theorem VRRB provides the uniqueness of the scalars in these linear combinations.
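The uniqueness is easy to verify computationally. Here is a minimal numerical sketch in Python (assuming the NumPy library; the arrays are the vectors of this example): we solve a linear system for the unique scalars of Theorem VRRB and then split the linear combination into its two pieces.

    import numpy as np

    # The basis vectors and the target vector from the example above.
    v1 = np.array([3.0, 2.0, 5.0])
    v2 = np.array([-1.0, 2.0, 1.0])
    v3 = np.array([2.0, 1.0, -2.0])
    v = np.array([10.0, 1.0, 6.0])

    # The columns of B are the basis; solving B a = v produces the
    # unique scalars guaranteed by Theorem VRRB.
    B = np.column_stack([v1, v2, v3])
    a = np.linalg.solve(B, v)    # expect [2, -2, 1]

    u = a[0] * v1 + a[1] * v2    # the piece in span{v1, v2}
    w = a[2] * v3                # the piece in span{v3}
    assert np.allclose(u + w, v)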

Example SDS is easy to generalize into a theorem.

Theorem 1.2.3. Direct Sum From a Basis.

Suppose that \(V\) is a vector space with a basis \(B=\set{\vectorlist{v}{n}}\text{.}\) Define \(U=\spn{\set{\vectorlist{v}{m}}}\) and \(W=\spn{\set{\vect{v}_{m+1},\,\vect{v}_{m+2},\,\dots,\,\vect{v}_{n}}}\text{.}\) Then \(V=U\ds W\text{.}\)

Choose any vector \(\vect{v}\in V\text{.}\) Then by Theorem VRRB there are unique scalars \(\scalarlist{a}{n}\) such that

\begin{align*} \vect{v}&=\lincombo{a}{v}{n}\\ &=\left(a_1\vect{v}_1+a_2\vect{v}_2+a_3\vect{v}_3+\dots+a_m\vect{v}_m\right)+{}\\ &\qquad\left(a_{m+1}\vect{v}_{m+1}+a_{m+2}\vect{v}_{m+2}+a_{m+3}\vect{v}_{m+3}+\dots+a_n\vect{v}_n\right)\\ &=\vect{u}+\vect{w} \end{align*}

where we have implicitly defined \(\vect{u}\) and \(\vect{w}\) in the last line. It should be clear that \(\vect{u}\in{U}\text{,}\) and similarly, \(\vect{w}\in{W}\) (and not simply by the choice of their names).

Suppose we had another decomposition of \(\vect{v}\text{,}\) say \(\vect{v}=\vect{u}^\ast+\vect{w}^\ast\text{.}\) Then we could write \(\vect{u}^\ast\) as a linear combination of \(\vect{v}_1\) through \(\vect{v}_m\text{,}\) say using scalars \(\scalarlist{b}{m}\text{.}\) And we could write \(\vect{w}^\ast\) as a linear combination of \(\vect{v}_{m+1}\) through \(\vect{v}_n\text{,}\) say using scalars \(\scalarlist{c}{n-m}\text{.}\) These two collections of scalars would then together give a linear combination of \(\vect{v}_1\) through \(\vect{v}_n\) that equals \(\vect{v}\text{.}\) By the uniqueness of \(\scalarlist{a}{n}\text{,}\) \(a_i=b_i\) for \(1\leq i\leq m\) and \(a_{m+i}=c_{i}\) for \(1\leq i\leq n-m\text{.}\) From the equality of these scalars we conclude that \(\vect{u}=\vect{u}^\ast\) and \(\vect{w}=\vect{w}^\ast\text{.}\) So with both conditions of Definition 1.2.1 fulfilled we see that \(V=U\ds W\text{.}\)

Given one subspace of a vector space, we can always find another subspace that will pair with the first to form a direct sum. The main idea of this theorem, and its proof, is to extend a linearly independent subset into a basis with repeated applications of Theorem ELIS.

Theorem 1.2.4. Direct Sum From One Subspace.

Suppose that \(U\) is a subspace of the vector space \(V\text{.}\) Then there exists a subspace \(W\) of \(V\) such that \(V=U\ds W\text{.}\)

If \(U=V\text{,}\) then choose \(W=\set{\zerovector}\text{.}\) Otherwise, choose a basis \(B=\set{\vectorlist{v}{m}}\) for \(U\text{.}\) Then since \(B\) is a linearly independent set, Theorem ELIS tells us there is a vector \(\vect{v}_{m+1}\) in \(V\text{,}\) but not in \(U\text{,}\) such that \(B\cup\set{\vect{v}_{m+1}}\) is linearly independent. Define the subspace \(U_1=\spn{B\cup\set{\vect{v}_{m+1}}}\text{.}\)

We can repeat this procedure, in the case where \(U_1\neq V\text{,}\) creating a new vector \(\vect{v}_{m+2}\) in \(V\text{,}\) but not in \(U_1\text{,}\) and a new subspace \(U_2=\spn{B\cup\set{\vect{v}_{m+1},\,\vect{v}_{m+2}}}\text{.}\) If we continue repeating this procedure, eventually \(U_k=V\) for some \(k\) (because \(V\) has finite dimension), and we can no longer apply Theorem ELIS. No matter, in this case \(B\cup\set{\vect{v}_{m+1},\,\vect{v}_{m+2},\,\dots,\,\vect{v}_{m+k}}\) is a linearly independent set that spans \(V\text{,}\) i.e. a basis for \(V\text{.}\)

Define \(W=\spn{\set{\vect{v}_{m+1},\,\vect{v}_{m+2},\,\dots,\,\vect{v}_{m+k}}}\text{.}\) We are now in position to apply Theorem 1.2.3 and see that \(V=U\ds W\text{.}\)
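The proof is effectively an algorithm. The following Python sketch (assuming NumPy; the function name extend_to_basis and the choice to draw candidate vectors from the standard basis are ours, for illustration only) mimics the repeated applications of Theorem ELIS: a candidate lies outside the span of the current set exactly when appending it increases the rank.

    import numpy as np

    def extend_to_basis(B, n):
        # Extend the linearly independent columns of B to a basis of R^n
        # by repeatedly appending standard basis vectors, mimicking
        # repeated applications of Theorem ELIS.
        current = B
        extra = []
        for j in range(n):
            if np.linalg.matrix_rank(current) == n:
                break  # the current columns already span, i.e. U_k = V
            e = np.zeros((n, 1))
            e[j, 0] = 1.0
            trial = np.hstack([current, e])
            # Keep e_j only if it lies outside the current span, which
            # is exactly when appending it raises the rank (the ELIS step).
            if np.linalg.matrix_rank(trial) > np.linalg.matrix_rank(current):
                current = trial
                extra.append(e.ravel())
        return current, extra

    # Columns are v1, v2 from Example SDS, spanning a subspace U of R^3.
    U_basis = np.array([[3.0, -1.0], [2.0, 2.0], [5.0, 1.0]])
    full_basis, extra = extend_to_basis(U_basis, 3)
    # The vectors in `extra` span a complement W with V = U (+) W.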

There are several different ways to define a direct sum. Our next two theorems give equivalences (Proof Technique E) for direct sums, and therefore could have been employed as definitions. The first should further cement the notion that a direct sum has some connection with linear independence.

Theorem 1.2.5. Direct Sums and Zero Vectors.

Suppose \(U\) and \(W\) are subspaces of the vector space \(V\text{.}\) Then \(V=U\ds W\) if and only if

  1. For every \(\vect{v}\in V\text{,}\) there exist vectors \(\vect{u}\in U\text{,}\) \(\vect{w}\in W\) such that \(\vect{v}=\vect{u}+\vect{w}\text{.}\)
  2. Whenever \(\zerovector=\vect{u}+\vect{w}\) with \(\vect{u}\in U\text{,}\) \(\vect{w}\in W\) then \(\vect{u}=\vect{w}=\zerovector\text{.}\)

The first condition is identical in the definition and the theorem, so we only need to establish the equivalence of the second conditions.

⇒ Assume that \(V=U\ds W\text{,}\) according to Definition 1.2.1. By Property Z, \(\zerovector\in V\) and \(\zerovector=\zerovector+\zerovector\text{.}\) If we also assume that \(\zerovector=\vect{u}+\vect{w}\) with \(\vect{u}\in U\) and \(\vect{w}\in W\text{,}\) then the uniqueness of the decomposition gives \(\vect{u}=\zerovector\) and \(\vect{w}=\zerovector\text{.}\)

⇐ Suppose that \(\vect{v}\in V\text{,}\) \(\vect{v}=\vect{u}_1+\vect{w}_1\) and \(\vect{v}=\vect{u}_2+\vect{w}_2\) where \(\vect{u}_1,\,\vect{u}_2\in U\text{,}\) \(\vect{w}_1,\,\vect{w}_2\in W\text{.}\) Then

\begin{align*} \zerovector&=\vect{v}-\vect{v}\\ &=\left(\vect{u}_1+\vect{w}_1\right)-\left(\vect{u}_2+\vect{w}_2\right)\\ &=\left(\vect{u}_1-\vect{u}_2\right)+\left(\vect{w}_1-\vect{w}_2\right) \end{align*}

By Property AC, \(\vect{u}_1-\vect{u}_2\in U\) and \(\vect{w}_1-\vect{w}_2\in W\text{.}\) We can now apply our hypothesis, the second statement of the theorem, to conclude that

\begin{align*} \vect{u}_1-\vect{u}_2&=\zerovector & \vect{w}_1-\vect{w}_2&=\zerovector\\ \vect{u}_1&=\vect{u}_2 & \vect{w}_1&=\vect{w}_2 \end{align*}

which establishes the uniqueness needed for the second condition of the definition.

Our second equivalence lends further credence to calling a direct sum a decomposition. The two subspaces of a direct sum have no (nontrivial) elements in common.

Theorem 1.2.6. Direct Sums and Zero Intersection.

Suppose \(U\) and \(W\) are subspaces of the vector space \(V\text{.}\) Then \(V=U\ds W\) if and only if

  1. For every \(\vect{v}\in V\text{,}\) there exist vectors \(\vect{u}\in U\text{,}\) \(\vect{w}\in W\) such that \(\vect{v}=\vect{u}+\vect{w}\text{.}\)
  2. \(U\cap W=\set{\zerovector}\text{.}\)

The first condition is identical in the definition and the theorem, so we only need to establish the equivalence of the second conditions.

⇒ Assume that \(V=U\ds W\text{,}\) according to Definition 1.2.1. By Property Z and Definition SI, \(\set{\zerovector}\subseteq U\cap W\text{.}\) To establish the opposite inclusion, suppose that \(\vect{x}\in U\cap W\text{.}\) Then, since \(\vect{x}\) is an element of both \(U\) and \(W\text{,}\) we can write two decompositions of \(\vect{x}\) as a vector from \(U\) plus a vector from \(W\text{,}\)

\begin{align*} \vect{x}&=\vect{x}+\zerovector & \vect{x}&=\zerovector+\vect{x} \end{align*}

By the uniqueness of the decomposition, we see (twice) that \(\vect{x}=\zerovector\) and \(U\cap W\subseteq\set{\zerovector}\text{.}\) Applying Definition SE, we have \(U\cap W=\set{\zerovector}\text{.}\)

⇐ Assume that \(U\cap W=\set{\zerovector}\text{.}\) Assume further that \(\vect{v}\in V\) is such that \(\vect{v}=\vect{u}_1+\vect{w}_1\) and \(\vect{v}=\vect{u}_2+\vect{w}_2\) where \(\vect{u}_1,\,\vect{u}_2\in U\text{,}\) \(\vect{w}_1,\,\vect{w}_2\in W\text{.}\) Define \(\vect{x}=\vect{u}_1-\vect{u}_2\text{.}\) Then by Property AC, \(\vect{x}\in U\text{.}\) Also

\begin{align*} \vect{x}&=\vect{u}_1-\vect{u}_2\\ &=\left(\vect{v}-\vect{w}_1\right)-\left(\vect{v}-\vect{w}_2\right)\\ &=\left(\vect{v}-\vect{v}\right)-\left(\vect{w}_1-\vect{w}_2\right)\\ &=\vect{w}_2-\vect{w}_1 \end{align*}

So \(\vect{x}\in W\) by Property AC. Thus, \(\vect{x}\in U\cap W =\set{\zerovector}\) (Definition SI). So \(\vect{x}=\zerovector\) and

\begin{align*} \vect{u}_1-\vect{u}_2&=\zerovector & \vect{w}_2-\vect{w}_1&= \zerovector\\ \vect{u}_1&=\vect{u}_2 & \vect{w}_2&=\vect{w}_1 \end{align*}

yielding the desired uniqueness of the second condition of the definition.
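For concrete subspaces of \(\complex{n}\) this criterion is easy to test. Here is a minimal Python sketch (assuming NumPy; the matrices hold the bases from Example SDS as columns): since a standard dimension count gives \(\dimension{U+W}=\dimension{U}+\dimension{W}-\dimension{U\cap W}\text{,}\) the intersection is trivial exactly when stacking the two bases side by side loses no rank.

    import numpy as np

    # Columns are bases of U = span{v1, v2} and W = span{v3}.
    U_mat = np.array([[3.0, -1.0], [2.0, 2.0], [5.0, 1.0]])
    W_mat = np.array([[2.0], [1.0], [-2.0]])

    # The intersection is trivial exactly when no rank is lost:
    # rank([U | W]) = dim U + dim W.
    stacked = np.hstack([U_mat, W_mat])
    trivial = (np.linalg.matrix_rank(stacked)
               == np.linalg.matrix_rank(U_mat) + np.linalg.matrix_rank(W_mat))
    print(trivial)  # True; the combined rank is also 3 = dim C^3,
                    # so the two bases together span and C^3 = U (+) W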

If the statement of Theorem 1.2.5 did not remind you of linear independence, the next theorem should establish the connection.

Theorem 1.2.7. Direct Sums and Linear Independence.

Suppose that \(V=U\ds W\text{,}\) \(R\) is a linearly independent subset of \(U\text{,}\) and \(S\) is a linearly independent subset of \(W\text{.}\) Then \(R\cup S\) is a linearly independent subset of \(V\text{.}\)

Let \(R=\set{\vectorlist{u}{k}}\) and \(S=\set{\vectorlist{w}{\ell}}\text{.}\) Begin with a relation of linear dependence (Definition RLD) on the set \(R\cup S\) using scalars \(\scalarlist{a}{k}\) and \(\scalarlist{b}{\ell}\text{.}\) Then,

\begin{align*} \zerovector&=\lincombo{a}{u}{k} + \lincombo{b}{w}{\ell}\\ &=\left(\lincombo{a}{u}{k}\right) + \left(\lincombo{b}{w}{\ell}\right)\\ &=\vect{u}+\vect{w} \end{align*}

where we have made an implicit definition of the vectors \(\vect{u}\in U\text{,}\) \(\vect{w}\in W\text{.}\)

Applying Theorem 1.2.5 we conclude that

\begin{align*} \vect{u}&=\lincombo{a}{u}{k}=\zerovector\\ \vect{w}&=\lincombo{b}{w}{\ell}=\zerovector \end{align*}

Now the linear independence of \(R\) and \(S\) (individually) yields

\begin{align*} a_1&=a_2=a_3=\cdots=a_k=0 & b_1&=b_2=b_3=\cdots=b_\ell=0 \end{align*}

Since only the trivial linear combination yields the zero vector, Definition LI says the set \(R\cup S\) is linearly independent in \(V\text{.}\)

Our last theorem in this collection will go some way toward explaining the word “sum” in the moniker “direct sum”.

Theorem 1.2.8. Direct Sums and Dimension.

Suppose that \(V=U\ds W\text{.}\) Then \(\dimension{V}=\dimension{U}+\dimension{W}\text{.}\)

We will establish this equality of positive integers with two inequalities. We will need a basis of \(U\) (call it \(B\)) and a basis of \(W\) (call it \(C\)).

First, note that \(B\) and \(C\) have sizes equal to the dimensions of the respective subspaces. The union of these two linearly independent sets, \(B\cup C\text{,}\) will be linearly independent in \(V\) by Theorem 1.2.7. Further, the two bases have no vectors in common by Theorem 1.2.6, since \(B\cap C\subseteq\set{\zerovector}\) and the zero vector is never an element of a linearly independent set (Exercise LI.T10). So the size of the union is exactly the sum of the dimensions of \(U\) and \(W\text{.}\) By Theorem G the size of \(B\cup C\) cannot exceed the dimension of \(V\) without being linearly dependent. These observations give us \(\dimension{U}+\dimension{W}\leq\dimension{V}\text{.}\)

Grab any vector \(\vect{v}\in V\text{.}\) Then by Theorem 1.2.6 we can write \(\vect{v}=\vect{u}+\vect{w}\) with \(\vect{u}\in U\) and \(\vect{w}\in W\text{.}\) Individually, we can write \(\vect{u}\) as a linear combination of the basis elements in \(B\text{,}\) and similarly, we can write \(\vect{w}\) as a linear combination of the basis elements in \(C\text{,}\) since the bases are spanning sets for their respective subspaces. These two sets of scalars will provide a linear combination of all of the vectors in \(B\cup C\) which will equal \(\vect{v}\text{.}\) The upshot of this is that \(B\cup C\) is a spanning set for \(V\text{.}\) By Theorem G, the size of \(B\cup C\) cannot be smaller than the dimension of \(V\) without failing to span \(V\text{.}\) These observations give us \(\dimension{U}+\dimension{W}\geq\dimension{V}\text{.}\)

There is a certain appealing symmetry in the previous proof, where both linear independence and spanning properties of the bases are used, both of the first two conclusions of Theorem G are employed, and we have quoted both of the two conditions of Theorem 1.2.6.
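In Example SDS the dimension count is immediate:

\begin{equation*} \dimension{\spn{\set{\vect{v}_1,\,\vect{v}_2}}}+\dimension{\spn{\set{\vect{v}_3}}}=2+1=3=\dimension{\complex{3}}\text{.} \end{equation*}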

One final theorem tells us that we can successively decompose direct sums into sums of smaller and smaller subspaces.

Theorem 1.2.9. Repeated Direct Sums.

Suppose that \(V=U\ds W\) and \(W=X\ds Y\text{.}\) Then \(V=U\ds X\ds Y\text{.}\)

Suppose that \(\vect{v}\in V\text{.}\) Then due to \(V=U\ds W\text{,}\) there exist vectors \(\vect{u}\in U\) and \(\vect{w}\in W\) such that \(\vect{v}=\vect{u}+\vect{w}\text{.}\) Due to \(W=X\ds Y\text{,}\) there exist vectors \(\vect{x}\in X\) and \(\vect{y}\in Y\) such that \(\vect{w}=\vect{x}+\vect{y}\text{.}\) All together,

\begin{equation*} \vect{v}=\vect{u}+\vect{w}=\vect{u}+\vect{x}+\vect{y} \end{equation*}

which would be the first condition of a definition of a 3-way direct sum.

Now consider the uniqueness. Suppose that

\begin{align*} \vect{v}&=\vect{u}_1+\vect{x}_1+\vect{y}_1 & \vect{v}&=\vect{u}_2+\vect{x}_2+\vect{y}_2 \end{align*}

Because \(\vect{x}_1+\vect{y}_1\in W\text{,}\) \(\vect{x}_2+\vect{y}_2\in W\text{,}\) and \(V=U\ds W\text{,}\) we conclude that

\begin{align*} \vect{u}_1&=\vect{u}_2 & \vect{x}_1+\vect{y}_1&=\vect{x}_2+\vect{y}_2 \end{align*}

From the second equality, an application of \(W=X\ds Y\) yields the conclusions \(\vect{x}_1=\vect{x}_2\) and \(\vect{y}_1=\vect{y}_2\text{.}\) This establishes the uniqueness of the decomposition of \(\vect{v}\) into a sum of vectors from \(U\text{,}\) \(X\) and \(Y\text{.}\)
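In the language of the \(m\)-fold extension sketched after Definition 1.2.1, these two conditions say precisely that

\begin{equation*} V=U\ds X\ds Y\text{.} \end{equation*}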

Remember that when we write \(V=U\ds W\) there always needs to be a “superspace,” in this case \(V\text{.}\) The statement \(U\ds W\) is meaningless. Writing \(V=U\ds W\) is simply a shorthand for a somewhat complicated relationship between \(V\text{,}\) \(U\) and \(W\text{,}\) as described in the two conditions of Definition 1.2.1, or Theorem 1.2.5, or Theorem 1.2.6. Theorem 1.2.3 and Theorem 1.2.4 give us sure-fire ways to build direct sums, while Theorem 1.2.7, Theorem 1.2.8 and Theorem 1.2.9 tell us interesting properties of direct sums.

This subsection has been long on theorems and short on examples. If we were to use the term “lemma” we might have chosen to label some of these results as such, since they will be important tools in other proofs, but may not have much interest on their own (see Proof Technique LC). We will be referencing these results heavily in later sections, and will remind you then to come back for a second look.