Section 1.5 Reflectors
When we decompose matrices into products of other matrices, we often wish to create matrices with many zero entries. A Householder matrix is a unitary matrix which transforms a vector into a vector of the same size that is a nonzero multiple of the first column of an identity matrix, thus creating a vector with just a single nonzero entry. A typical application is to “zero out” entries of a matrix below the diagonal, column-by-column, in order to achieve a triangular matrix.
Definition 1.5.1.
Given a nonzero vector \(\vect{v}\in\complex{n}\text{,}\) the Householder matrix for \(\vect{v}\) is
\begin{equation*}
P=I_n-\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\vect{v}\adjoint{\vect{v}}\text{.}
\end{equation*}
The vector \(\vect{v}\) is called the Householder vector.
A Householder matrix is both Hermitian and unitary.
Lemma 1.5.2.
The Householder matrix for \(\vect{v}\in\complex{n}\) is Hermitian.
Proof.
Lemma 1.5.3.
The Householder matrix for \(\vect{v}\in\complex{n}\) is unitary.
Proof.
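Both lemmas are easy to confirm numerically. Below is a minimal sketch, assuming NumPy (a tooling choice of ours, not the text's); the helper name `householder` is ours as well. It builds the Householder matrix for a complex vector and checks that it is Hermitian and unitary up to floating-point roundoff.

```python
import numpy as np

def householder(v):
    """Householder matrix P = I - (2 / <v,v>) v v^* for a nonzero vector v."""
    v = np.asarray(v, dtype=complex).reshape(-1, 1)
    return np.eye(len(v)) - (2.0 / (v.conj().T @ v)) * (v @ v.conj().T)

v = np.array([1 + 2j, -1j, 3])
P = householder(v)

# Hermitian: P equals its conjugate transpose (adjoint).
print(np.allclose(P, P.conj().T))              # True
# Unitary: the adjoint is the inverse.
print(np.allclose(P.conj().T @ P, np.eye(3)))  # True
```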
Our aim with a Householder matrix is to convert a vector \(\vect{x}\) into a scalar multiple of the first column of the identity matrix, \(\vect{e}_1\text{.}\) Which Householder vector should we choose for constructing such a matrix, and which multiple will we get? It is an instructive exercise to reverse-engineer the choice by setting \(P\vect{x}=\alpha\vect{e}_1\) (Checkpoint 1.5.9). Instead, we give the answer and prove that it does the desired job.
Theorem 1.5.4.
Given a vector \(\vect{x}\in\real{n}\text{,}\) define \(\vect{v}=\vect{x}\pm\norm{\vect{x}}\vect{e}_1\) and let \(P\) be the Householder matrix for the Householder vector \(\vect{v}\text{.}\) Then \(P\vect{x}=\mp\norm{\vect{x}}\vect{e}_1\text{.}\)
Proof.
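The theorem is also easy to confirm numerically on a random real vector. A short sketch, again assuming NumPy: with the minus-sign choice \(\vect{v}=\vect{x}-\norm{\vect{x}}\vect{e}_1\text{,}\) the product \(P\vect{x}\) should equal \(+\norm{\vect{x}}\vect{e}_1\text{.}\)

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.standard_normal(5)

# Householder vector v = x - ||x|| e_1 (the minus-sign choice).
e1 = np.zeros(5)
e1[0] = 1.0
v = x - np.linalg.norm(x) * e1

P = np.eye(5) - (2.0 / (v @ v)) * np.outer(v, v)

# P x has a single nonzero entry: + ||x|| in position 1.
print(np.allclose(P @ x, np.linalg.norm(x) * e1))  # True
```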
Example 1.5.5.
Consider the vector \(\vect{x}\) below and construct the Householder vector \(\vect{v}=\vect{x}-\norm{\vect{x}}\vect{e}_1\text{:}\)
\begin{align*}
\vect{x}&=\colvector{4\\4\\7}\text{,} & \norm{\vect{x}}&=9\text{,} & \vect{v}&=\colvector{4\\4\\7}-9\colvector{1\\0\\0}=\colvector{-5\\4\\7}\text{.}
\end{align*}
Then the Householder matrix for \(\vect{v}\) is
\begin{equation*}
P=I_3-\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\vect{v}\adjoint{\vect{v}}
=I_3-\frac{2}{90}\vect{v}\adjoint{\vect{v}}
=\frac{1}{45}\begin{bmatrix}20&20&35\\20&29&-28\\35&-28&-4\end{bmatrix}\text{.}
\end{equation*}
We can check that the matrix behaves as we expect, since \(P\vect{x}=\norm{\vect{x}}\vect{e}_1\text{:}\)
\begin{equation*}
P\vect{x}=\frac{1}{45}\begin{bmatrix}20&20&35\\20&29&-28\\35&-28&-4\end{bmatrix}\colvector{4\\4\\7}=\frac{1}{45}\colvector{405\\0\\0}=\colvector{9\\0\\0}\text{.}
\end{equation*}
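This computation can be reproduced numerically. A minimal NumPy sketch (our tooling choice) using the example's vectors \(\vect{x}=(4,4,7)\) and \(\vect{v}=(-5,4,7)\text{:}\)

```python
import numpy as np

x = np.array([4.0, 4.0, 7.0])   # ||x|| = 9
v = np.array([-5.0, 4.0, 7.0])  # v = x - 9 e_1

# P = I - (2 / <v,v>) v v^t, with <v,v> = 90.
P = np.eye(3) - (2.0 / (v @ v)) * np.outer(v, v)

# 45 P has the integer entries computed in the example.
print(np.allclose(45 * P, [[20, 20, 35], [20, 29, -28], [35, -28, -4]]))  # True
# P reflects x onto 9 e_1.
print(np.allclose(P @ x, [9, 0, 0]))                                      # True
```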
A Householder matrix is often called a Householder reflection. Why? Any Householder matrix, when thought of as a mapping of points in a physical space, will fix the elements of a hyperplane and reflect any other points about that hyperplane. To see this, consider any vector \(\vect{w}\) and compare it with its image, \(P\vect{w}\text{:}\)
\begin{equation*}
\vect{w}-P\vect{w}=\frac{2}{\innerproduct{\vect{v}}{\vect{v}}}\vect{v}\adjoint{\vect{v}}\vect{w}=\frac{2\innerproduct{\vect{v}}{\vect{w}}}{\innerproduct{\vect{v}}{\vect{v}}}\vect{v}\text{.}
\end{equation*}
So this difference is always a scalar multiple of the Householder vector \(\vect{v}\text{,}\) and thus every point gets moved in the same direction, the direction described by \(\vect{v}\text{.}\) Which points are fixed by \(P\text{?}\) The difference above is the zero vector precisely when the scalar multiple is zero, which happens exactly when \(\innerproduct{\vect{v}}{\vect{w}}=0\text{.}\) So the set of points/vectors orthogonal to \(\vect{v}\) is unmoved. This set is a subspace of dimension one less than the full space; such subspaces are typically called hyperplanes.
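Both facts are easy to verify numerically. A short NumPy sketch (our tooling choice), using the Householder vector \(\vect{v}=(-5,4,7)\) from the example: a generic point is moved parallel to \(\vect{v}\text{,}\) while a point orthogonal to \(\vect{v}\) is fixed.

```python
import numpy as np

v = np.array([-5.0, 4.0, 7.0])
P = np.eye(3) - (2.0 / (v @ v)) * np.outer(v, v)

# A generic point moves parallel to v: w - Pw is a multiple of v.
w = np.array([1.0, 2.0, 3.0])
diff = w - P @ w
print(np.allclose(np.cross(diff, v), 0))  # True: cross product vanishes

# A point orthogonal to v lies on the hyperplane and is fixed.
u = np.array([4.0, 5.0, 0.0])             # <v, u> = -20 + 20 + 0 = 0
print(np.allclose(P @ u, u))              # True
```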
To be more specific, consider the situation of Example 1.5.5, viewed in \(\real{3}\text{.}\) The hyperplane is the subspace orthogonal to \(\vect{v}\text{,}\) or the two-dimensional plane with \(\vect{v}\) as its normal vector, and equation \(-5x+4y+7z=0\text{.}\) The points \((4,4,7)\) and \((9,0,0)\) lie on either side of the plane and are a reflection of each other in the plane, by which we mean the vector \((4,4,7)-(9,0,0)=(-5,4,7)\) is perpendicular (orthogonal, normal) to the plane.
Our choice of \(\vect{v}\) can be replaced by \(\vect{v}=\vect{x}+\norm{\vect{x}}\vect{e}_1\text{,}\) so in the previous example we would have \(\vect{v}=\colvector{13\\4\\7}\text{,}\) and then \(P\) would take \(\vect{x}\) to \(\colvector{-9\\0\\0}\text{.}\) This would be a reflection across the plane with equation \(13x+4y+7z=0\text{.}\) Notice that the normal vector of this plane is orthogonal to the normal vector of the previous plane, which is not an accident (Checkpoint 1.5.6).
As a practical matter, we would choose the Householder vector which moves \(\vect{x}\) the furthest, to get better numerical behavior. Since \(\vect{x}-P\vect{x}=\vect{v}\) for either choice, \(\vect{x}\) is moved a distance \(\norm{\vect{v}}\text{,}\) so in our example above the second choice is better, as the second \(\vect{v}\) has the larger norm. (The larger choice also avoids the subtractive cancellation that can occur in computing the first entry of \(\vect{v}\) when \(\vect{x}\) is nearly a multiple of \(\vect{e}_1\text{.}\))
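The comparison can be made concrete. A minimal NumPy sketch (tooling choice ours) builds both Householder matrices for the example vector \(\vect{x}=(4,4,7)\) and confirms that each choice \(\vect{v}=\vect{x}\pm\norm{\vect{x}}\vect{e}_1\) reflects \(\vect{x}\) onto \(\mp\norm{\vect{x}}\vect{e}_1\text{,}\) moving it a distance \(\norm{\vect{v}}\text{.}\)

```python
import numpy as np

x = np.array([4.0, 4.0, 7.0])
e1 = np.array([1.0, 0.0, 0.0])
n = np.linalg.norm(x)                      # 9

for sign in (-1.0, +1.0):
    v = x + sign * n * e1                  # the two Householder vector choices
    P = np.eye(3) - (2.0 / (v @ v)) * np.outer(v, v)
    # x is reflected to -sign * ||x|| e_1, moving a distance ||v||.
    print(np.allclose(P @ x, -sign * n * e1),
          np.allclose(np.linalg.norm(x - P @ x), np.linalg.norm(v)))
```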
Checkpoint 1.5.6.
In the real case, we have two choices for a Householder vector which will “zero out” most of a vector. Show that these two vectors, \(\vect{x}+\norm{\vect{x}}\vect{e}_1\) and \(\vect{x}-\norm{\vect{x}}\vect{e}_1\text{,}\) are orthogonal to each other.
Checkpoint 1.5.7.
Prove the following generalization of Theorem 1.5.4. Given a vector \(\vect{x}\in\complex{n}\text{,}\) define \(\rho=\vectorentry{\vect{x}}{1}/\modulus{\vectorentry{\vect{x}}{1}}\) and \(\vect{v}=\vect{x}\pm\rho\norm{\vect{x}}\vect{e}_1\) and let \(P\) be the Householder matrix for the Householder vector \(\vect{v}\text{.}\) Then \(P\vect{x}=\mp\rho\norm{\vect{x}}\vect{e}_1\text{.}\)
You can establish the same identity as in the first part of the proof of Theorem 1.5.4.
Checkpoint 1.5.8.
Suppose that \(P\) is a Householder matrix of size \(n\) and \(\vect{b}\in\complex{n}\) is any vector. Find an expression for the matrix-vector product \(P\vect{b}\) which will suggest a way to compute this vector with fewer than the \(\orderof{2n^2}\) operations required for a general matrix-vector product.
Checkpoint 1.5.9.
Begin with the condition that a Householder matrix will accomplish \(P\vect{x}=\alpha\vect{e}_1\) and “discover” the choice for the Householder vector described in Theorem 1.5.4. Hint: The condition implies that the Householder vector \(\vect{v}\) is in the span of \(\set{\vect{x}, \vect{e}_1}\text{.}\)