Section 2.1 LU (Triangular) Decomposition
The LU decomposition begins with any matrix and describes it as a product of a lower-triangular matrix (\(L\)) and an upper-triangular matrix (\(U\)). Hence the customary shorthand name, “LU”. The term triangular decomposition might be more evocative, if not more verbose.
You will notice that the LU decomposition feels very much like reduced row-echelon form, and in some ways could be considered an improvement. Triangular matrices are defined in Subsection Subsection OD.TM and two basic facts are that the product of two triangular matrices “of the same type” (i.e. both upper or both lower) is again of that type (Theorem PTMT) and the inverse of a nonsingular triangular matrix will be triangular of the same type (Theorem ITMT).
Subsection 2.1.1 LU Decomposition, Nonsingular Case
Theorem 2.1.1. LU (Triangular) Decomposition.
Suppose \(A\) is a square matrix of size \(n\text{.}\) Let \(A_k\) be the \(k\times k\) matrix formed from \(A\) by taking the first \(k\) rows and the first \(k\) columns. Suppose that \(A_k\) is nonsingular for all \(1\leq k\leq n\text{.}\) Then there is a lower triangular matrix \(L\) with all of its diagonal entries equal to 1 and an upper triangular matrix \(U\) such that \(A=LU\text{.}\) Furthermore, this decomposition is unique.
Proof.
We will row reduce \(A\) to a row-equivalent upper triangular matrix through a series of row operations, forming intermediate matrices \(A^\prime_j\text{,}\) \(1\leq j\leq n\text{,}\) that denote the state of the conversion after working on column \(j\text{.}\) First, the lone entry of \(A_1\) is \(\matrixentry{A}{11}\) and this scalar must be nonzero if \(A_1\) is nonsingular (Theorem SMZD). We can use row operations Definition RO of the form \(\rowopadd{\alpha}{1}{k}\text{,}\) \(2\leq k\leq n\text{,}\) where \(\alpha=-\matrixentry{A}{1k}/\matrixentry{A}{11}\) to place zeros in the first column below the diagonal. The first two rows and columns of \(A^\prime_1\) are a \(2\times 2\) upper triangular matrix whose determinant is equal to the determinant of \(A_2\text{,}\) since the matrices are row-equivalent through a sequence of row operations strictly of the third type (Theorem DRCMA). As such the diagonal entries of this \(2\times 2\) submatrix of \(A^\prime_1\) are nonzero. We can employ this nonzero diagonal element with row operations of the form \(\rowopadd{\alpha}{2}{k}\text{,}\) \(3\leq k\leq n\) to place zeros below the diagonal in the second column. We can continue this process, column by column. The key observations are that our hypothesis on the nonsingularity of the \(A_k\) will guarantee a nonzero diagonal entry for each column when we need it, that the row operations employed are always of the third type using a multiple of a row to transform another row with a greater row index, and that the final result will be a nonsingular upper triangular matrix. This is the desired matrix \(U\text{.}\)
Each row operation described in the previous paragraph can be accomplished with matrix multiplication by the appropriate elementary matrix (Theorem EMDRO). Since every row operation employed is adding a multiple of a row to a subsequent row these elementary matrices are of the form \(\elemadd{\alpha}{j}{k}\) with \(j<k\text{.}\) By Definition ELEM, these matrices are lower triangular with every diagonal entry equal to 1. We know that the product of two such matrices will again be lower triangular (Theorem PTMT), but also, as you can also easily check using a proof with a style similar to one above, that the product maintains all 1's on the diagonal. Let \(E_1,\,E_2,\,E_3,\,\dots,\,E_m\) denote the elementary matrices for this sequence of row operations. Then
where \(L^\prime\) is the product of the elementary matrices, and we know \(L^\prime\) is lower triangular with all 1's on the diagonal. Our desired matrix \(L\) is then \(L=\inverse{\left(L^\prime\right)}\text{.}\) By Theorem ITMT, \(L\) is lower triangular with all 1's on the diagonal and \(A=LU\text{,}\) as desired.
The process just described is deterministic. That is, the proof is constructive, with no freedom for each of us to walk through it differently. But could there be other matrices with the same properties as \(L\) and \(U\) that give such a decomposition of \(A\text{?}\) In other words, is the decomposition unique (Proof Technique U)? Suppose that we have two triangular decompositions, \(A=L_1U_1\) and \(A=L_2U_2\text{.}\) Since \(A\) is nonsingular, two applications of Theorem NPNT imply that \(L_1,\,L_2,\,U_1,\,U_2\) are all nonsingular. We have
Theorem ITMT tells us that \(\inverse{L_2}\) is lower triangular and has 1's as the diagonal entries. By Theorem PTMT, the product \(\inverse{L_2}L_1\) is again lower triangular, and it is simple to check (as before) that the diagonal entries of the product are again all 1's. By the entirely similar process we can conclude that the product \(U_2\inverse{U_1}\) is upper triangular. Because these two products are equal, their common value is a matrix that is both lower triangular and upper triangular, with all 1's on the diagonal. The only matrix meeting these three requirements is the identity matrix (Definition IM). So, we have,
which establishes the uniqueness of the decomposition.
Studying the proofs of some previous theorems will perhaps give you an idea for an approach to computing a triangular decomposition. In the proof of Theorem CINM we augmented a nonsingular matrix with an identity matrix of the same size, and row-reduced until the original matrix became the identity matrix (as we knew in advance would happen, since we knew Theorem NMRRI). Theorem PEEF tells us about properties of extended echelon form, and in particular, that \(B=JA\text{,}\) where \(A\) is the matrix that begins on the left, and \(B\) is the reduced row-echelon form of \(A\text{.}\) The matrix \(J\) is the result on the right side of the augmented matrix, which is the result of applying the same row operations to the identity matrix. We should recognize now that \(J\) is just the product of the elementary matrices (Subsection DM.EM) that perform these row operations. Theorem ITMT used the extended echelon form to discern properties of the inverse of a triangular matrix. Theorem TD proves the existence of a triangular decomposition by applying specific row operations, and tracking the relevant elementary row operations. It is not a great leap to combine these observations into a computational procedure.
To find the triangular decomposition of \(A\text{,}\) augment \(A\) with the identity matrix of the same size and call this new \(2n\times n\) matrix, \(M\text{.}\) Perform row operations on \(M\) that convert the first \(n\) columns to an upper triangular matrix. Do this using only row operations that add a scalar multiple of one row to another row with higher index (i.e. lower down). In this way, the last \(n\) columns of \(M\) will be converted into a lower triangular matrix with 1's on the diagonal (since \(M\) has 1's in these locations initially). We could think of this process as doing about half of the work required to compute the inverse of \(A\text{.}\) Take the first \(n\) columns of the row-equivalent version of \(M\) and call this matrix \(U\text{.}\)
Take the final \(n\) columns of the row-equivalent version of \(M\) and call this matrix \(L^\prime\text{.}\) Then by a proof employing elementary matrices, or a proof similar in spirit to the one used to prove Theorem PEEF, we arrive at a result similar to the second assertion of Theorem PEEF. Namely, \(U=L^\prime A\text{.}\) Multiplication on the left, by the inverse of \(L^\prime\text{,}\) will give us a decomposition of \(A\) (which we know to be unique). Ready? Lets try it.
Example 2.1.2. Triangular decomposition, size 4.
In this example, we will illustrate the process for computing a triangular decomposition, as described in the previous paragraphs. Consider the nonsingular square matrix \(A\) of size 4,
We form \(M\) by augmenting \(A\) with the size 4 identity matrix \(I_4\text{.}\) We will perform the allowed operations, column by column, only reporting intermediate results as we finish converting each column. It is easy to determine exactly which row operations we perform, since the final four columns contain a record of each such operation. We will not verify our hypotheses about the nonsingularity of the \(A_k\text{,}\) since if we do not have these conditions, we will reach a stage where a diagonal entry is zero and we cannot create the row operations we need to zero out the bottom portion of the associated column. In other words, we can boldly proceed and the necessity of our hypotheses will become apparent.
So at this point, we have \(U\) and \(L^\prime\text{,}\)
Then by whatever procedure we like (such as Theorem CINM), we find
It is instructive to verify that indeed \(LU=A\text{.}\)
Subsection 2.1.2 Solving Systems with Triangular Decompositions
In this section we give an explanation of why you might be interested in a triangular decomposition for a matrix. Many of the computational problems in linear algebra revolve around solving large systems of equations, or nearly equivalently, finding inverses of large matrices. Suppose we have a system of equations with coefficient matrix \(A\) and vector of constants \(\vect{b}\text{,}\) and suppose further that \(A\) has the triangular decomposition \(A=LU\text{.}\)
Let \(\vect{y}\) be the solution to the linear system \(\linearsystem{L}{\vect{b}}\text{,}\) so that by Theorem SLEMM, we have \(L\vect{y}=\vect{b}\text{.}\) Notice that since \(L\) is nonsingular, this solution is unique, and the form of \(L\) makes it trivial to solve the system. The first component of \(\vect{y}\) is determined easily, and we can continue on through determining the components of \(\vect{y}\text{,}\) without even ever dividing. Now, with \(\vect{y}\) in hand, consider the linear system, \(\linearsystem{U}{\vect{y}}\text{.}\) Let \(\vect{x}\) be the unique solution to this system, so by Theorem SLEMM we have \(U\vect{x}=\vect{y}\text{.}\) Notice that a system of equations with \(U\) as a coefficient matrix is also straightforward to solve, though we will compute the bottom entries of \(\vect{x}\) first, and we will need to divide. The upshot of all this is that \(\vect{x}\) is a solution to \(\linearsystem{A}{\vect{b}}\text{,}\) as we now show,
An application of Theorem SLEMM demonstrates that \(\vect{x}\) is a solution to \(\linearsystem{A}{\vect{b}}\text{.}\)
Example 2.1.3. Triangular decomposition solves a system of equations.
Here we illustrate the previous discussion, recycling the decomposition found previously in Example TD4. Consider the linear system \(\linearsystem{A}{\vect{b}}\) with
First we solve the system \(\linearsystem{L}{\vect{b}}\) (see Example TD4 for \(L\)),
Then
so
Then we solve the system \(\linearsystem{U}{\vect{y}}\) (see Example TD4 for \(U\)),
Then
And so
is the solution to \(\linearsystem{U}{\vect{y}}\) and consequently is the unique solution to \(\linearsystem{A}{\vect{b}}\text{,}\) as you can easily verify.
Subsection 2.1.3 Computing Triangular Decompositions
It would be a simple matter to adjust the algorithm for converting a matrix to reduced row-echelon form and obtain an algorithm to compute the triangular decomposition of the matrix, along the lines of Example TD4 and the discussion preceding this example. However, it is possible to obtain relatively simple formulas for the entries of the decomposition, and if computed in the proper order, an implementation will be straightforward. We will state the result as a theorem and then give an example of its use.
Theorem 2.1.4. Triangular Decomposition, Entry by Entry.
Suppose that \(A\) is a square matrix of size \(n\) with a triangular decomposition \(A=LU\text{,}\) where \(L\) is lower triangular with diagonal entries all equal to 1, and \(U\) is upper triangular. Then
Proof.
At first glance, these formulas may look exceedingly complex. Upon closer examination, it looks even worse. We have expressions for entries of \(U\) that depend on other entries of \(U\) and also on entries of \(L\text{.}\) But then the formula for entries of \(L\) depend on entries from \(L\) and entries from \(U\text{.}\) Do these formula have circular dependencies? Or perhaps equivalently, how do we get started? The key is to be organized about the computations and employ these two (similar) formulas in a specific order. First compute the first row of \(L\text{,}\) followed by the first column of \(U\text{.}\) Then the second row of \(L\text{,}\) followed by the second column of \(U\text{.}\) And so on. In this way, all of the values required for each new entry will have already been computed previously.
Of course, the formula for entries of \(L\) require division by diagonal entries of \(U\text{.}\) These entries might be zero, but in this case \(A\) is nonsingular and does not have a triangular decomposition. So we need not check the hypothesis carefully and can launch into the arithmetic dictated by the formulas, confident that we will be reminded when a decomposition is not possible. Note that these formula give us all of the values that we need for the decomposition, since we require that \(L\) has 1's on the diagonal. If we replace the 1's on the diagonal of \(L\) by zeros, and add the matrix \(U\text{,}\) we get an \(n\times n\) matrix containing all the information we need to resurrect the triangular decomposition. This is mostly a notational convenience, but it is a frequent way of presenting the information. We'll employ it in the next example.
Example 2.1.5. Triangular decomposition, entry by entry, size 6.
We illustrate the application of the formulas in Theorem TDEE for the \(6\times 6\) matrix \(A\text{.}\)
Using the notational convenience of packaging the two triangular matrices into one matrix, and using the ordering of the computations mentioned above, we display the results after computing a single row and column of each of the two triangular matrices.
Splitting out the pieces of this matrix, we have the decomposition,
Subsection 2.1.4 Triangular Decomposition with Pivoting
The hypotheses of Theorem TD can be weakened slightly to include matrices where not every \(A_k\) is nonsingular. The introduces a rearrangement of the rows and columns of \(A\) to force as many as possible of the smaller submatrices to be nonsingular. Then permutation matrices also enter into the decomposition. We will not present the details here, but instead suggest consulting a more advanced text on matrix analysis.