SCLA Vandermonde Matrices

Section 5.1 Vandermonde Matrices

Alexandre-Théophile Vandermonde was a French mathematician in the 1700s who was among the first to write about basic properties of the determinant (such as the effect of swapping two rows). However, the determinant that bears his name (Theorem 5.1.3) does not appear in any of his four published mathematical papers.

Definition 5.1.1. Vandermonde Matrix.

A square matrix \(A\) of size \(n\) is a Vandermonde matrix if there are scalars \(\scalarlist{x}{n}\) such that \(\matrixentry{A}{ij}=x_{i}^{j-1}\text{,}\) \(1\leq i\leq n\text{,}\) \(1\leq j\leq n\text{.}\)

Example 5.1.2. Vandermonde matrix of size 4.

The matrix

\begin{equation*} A= \begin{bmatrix} 1 & 2 & 4 & 8\\ 1 & -3 & 9 & -27\\ 1 & 1 & 1 & 1\\ 1 & 4 & 16 & 64 \end{bmatrix} \end{equation*}

is a Vandermonde matrix since it meets the definition with \(x_1=2\text{,}\) \(x_2=-3\text{,}\) \(x_3=1\text{,}\) \(x_4=4\text{.}\)
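The definition translates directly into a short construction. The following is a minimal sketch in Python, where the function name `vandermonde` is our own choice for illustration and does not come from any library; it rebuilds the matrix of this example from its four scalars.

```python
def vandermonde(xs):
    """Return the Vandermonde matrix built from the scalars in xs.

    Row i holds the powers 1, x_i, x_i^2, ..., x_i^(n-1).
    (Hypothetical helper, named here only for illustration.)
    """
    n = len(xs)
    # The definition uses 1-based indices and the exponent j-1;
    # with 0-based Python indices the exponent is simply j.
    return [[x ** j for j in range(n)] for x in xs]


A = vandermonde([2, -3, 1, 4])
for row in A:
    print(row)
```

Each printed row should match the corresponding row of the matrix \(A\) displayed above.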

Vandermonde matrices are not very interesting as numerical matrices, but instead appear more often in proofs and applications where the scalars \(x_i\) are carried as symbols. Two such applications are in the sections on secret-sharing ([provisional cross-reference: section-SAS]) and curve-fitting (Section 4.2). Principally, we would like to know when Vandermonde matrices are nonsingular, and the most convenient way to check this is by determining when the determinant is nonzero (Theorem SMZD). As a bonus, the determinant of a Vandermonde matrix has an especially pleasing formula.

Theorem 5.1.3. Determinant of a Vandermonde Matrix.

Suppose that \(A\) is a Vandermonde matrix of size \(n\) built with the scalars \(\scalarlist{x}{n}\text{.}\) Then

\begin{equation*} \detname{A}=\prod_{1\leq i\lt j\leq n}\left(x_j-x_i\right). \end{equation*}

Proof.

The proof is by induction (Proof Technique I) on \(n\text{,}\) the size of the matrix. An empty product for a \(1\times 1\) matrix might make a good base case, but we'll start at \(n=2\) instead. For a \(2\times 2\) Vandermonde matrix, we have

\begin{equation*} \detname{A}=\begin{vmatrix} 1&x_1\\1&x_2 \end{vmatrix} =x_2-x_1 =\prod_{1\leq i\lt j\leq 2}\left(x_j-x_i\right) \end{equation*}

For the induction step we will perform row operations on \(A\) to obtain the determinant of \(A\) as a multiple of the determinant of an \((n-1)\times(n-1)\) Vandermonde matrix. The notation in this theorem tends to obscure your intuition about the changes effected by various row and column manipulations. Construct a \(4\times 4\) Vandermonde matrix with four symbols as the scalars (\(x_1\text{,}\) \(x_2\text{,}\) \(x_3\text{,}\) \(x_4\text{,}\) or perhaps \(a\text{,}\) \(b\text{,}\) \(c\text{,}\) \(d\)) and play along with the example as you study the proof.

First we convert most of the first column to zeros. Subtract row \(n\) from each of the other \(n-1\) rows to form a matrix \(B\text{.}\) By Theorem DRCMA, \(B\) has the same determinant as \(A\text{.}\) The entries of \(B\text{,}\) in the first \(n-1\) rows, i.e. for \(1\leq i\leq n-1\text{,}\) \(1\leq j\leq n\text{,}\) are

\begin{equation*} \matrixentry{B}{ij}=x_{i}^{j-1}-x_{n}^{j-1}= \left(x_i-x_n\right)\sum_{k=0}^{j-2} x_{i}^{j-2-k}x_{n}^{k} \end{equation*}
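The factorization of a difference of powers used here can be spot-checked numerically. The sketch below is only a sanity check on a handful of integer values, not a proof, and the function names `difference_entry` and `factored_entry` are hypothetical names chosen for this illustration.

```python
def difference_entry(x_i, x_n, j):
    """Left-hand side: x_i^(j-1) - x_n^(j-1), with j a 1-based column index."""
    return x_i ** (j - 1) - x_n ** (j - 1)


def factored_entry(x_i, x_n, j):
    """Right-hand side: (x_i - x_n) * sum_{k=0}^{j-2} x_i^(j-2-k) * x_n^k."""
    # range(j - 1) runs k = 0, ..., j-2; for j = 1 the sum is empty (zero).
    total = sum(x_i ** (j - 2 - k) * x_n ** k for k in range(j - 1))
    return (x_i - x_n) * total


# Compare both sides for the scalars of the example (x_n = 4) and j = 1..4.
agree = all(
    difference_entry(x_i, 4, j) == factored_entry(x_i, 4, j)
    for x_i in (2, -3, 1)
    for j in range(1, 5)
)
print(agree)
```

Note that for \(j=1\) the sum is empty, so both sides are zero, which is exactly why the first column of \(B\) is zero in its first \(n-1\) rows.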

As the elements of row \(i\text{,}\) \(1\leq i\leq n-1\text{,}\) have the common factor \(\left(x_i-x_n\right)\text{,}\) we form the new matrix \(C\) that differs from \(B\) by the removal of this factor from each of the first \(n-1\) rows. This will change the determinant, as we will track carefully in a moment. We also have a first column with zeros in each location, except row \(n\text{,}\) so we can use it for a column expansion computation of the determinant. We now know,

\begin{align*} \detname{A}&=\detname{B}\\ &=(x_1-x_n)(x_2-x_n)\cdots(x_{n-1}-x_n)\detname{C}\\ &=(x_1-x_n)(x_2-x_n)\cdots(x_{n-1}-x_n)(1)(-1)^{n+1}\detname{\submatrix{C}{n}{1}}\\ &=(x_1-x_n)(x_2-x_n)\cdots(x_{n-1}-x_n)(-1)^{n-1}\detname{\submatrix{C}{n}{1}}\\ &=(x_n-x_1)(x_n-x_2)\cdots(x_n-x_{n-1})\detname{\submatrix{C}{n}{1}} \end{align*}

For convenience, denote \(D=\submatrix{C}{n}{1}\text{,}\) the matrix obtained from \(C\) by deleting row \(n\) and column \(1\text{.}\) Entries of this matrix are similar to those of \(B\text{,}\) but the factors used to build \(C\) are gone, and since the first column is gone, there is a slight re-indexing relative to the columns. For \(1\leq i\leq n-1\text{,}\) \(1\leq j\leq n-1\text{,}\)

\begin{equation*} \matrixentry{D}{ij}=\sum_{k=0}^{j-1} x_{i}^{j-1-k}x_{n}^{k} \end{equation*}
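To see the re-indexing concretely, we can build \(D\) in two ways and compare: once by literally carrying out the row operations (subtract the last row, divide out the common factor, drop the last row and first column), and once from the closed formula for its entries. This is a sketch under the assumption that the scalars are distinct, so the division is legitimate; the helper names `vandermonde`, `d_by_row_operations` and `d_by_formula` are our own, invented for this illustration.

```python
from fractions import Fraction


def vandermonde(xs):
    """Vandermonde matrix with rows 1, x, x^2, ... (illustrative helper)."""
    return [[x ** j for j in range(len(xs))] for x in xs]


def d_by_row_operations(xs):
    """Build D by following the proof: form B, then C, then delete row n and column 1."""
    n = len(xs)
    A = vandermonde(xs)
    last = A[-1]
    D = []
    for i in range(n - 1):
        factor = xs[i] - xs[-1]  # common factor (x_i - x_n) of row i of B
        c_row = [Fraction(A[i][j] - last[j], factor) for j in range(n)]
        D.append(c_row[1:])      # drop the first column
    return D


def d_by_formula(xs):
    """Entry (i, j) is sum_{k=0}^{j-1} x_i^(j-1-k) * x_n^k, for 1 <= j <= n-1."""
    n, x_n = len(xs), xs[-1]
    return [
        [sum(x ** (j - 1 - k) * x_n ** k for k in range(j)) for j in range(1, n)]
        for x in xs[:-1]
    ]


xs = [2, -3, 1, 4]
print(d_by_row_operations(xs) == d_by_formula(xs))
```

Exact rational arithmetic (`Fraction`) is used so that the comparison is not disturbed by rounding.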

We will perform many column operations on the matrix \(D\text{,}\) always of the type where we multiply a column by a scalar and add the result to another column. As such, Theorem DRCMA ensures that the determinant will remain constant. We will work column by column, left to right, to convert \(D\) into a Vandermonde matrix with scalars \(\scalarlist{x}{n-1}\text{.}\) More precisely, we will build a sequence of matrices \(D=D_1\text{,}\) \(D_2\text{,}\) \dots, \(D_{n-1}\text{,}\) where each is obtainable from the previous by a sequence of determinant-preserving column operations and the first \(\ell\) columns of \(D_\ell\) are the first \(\ell\) columns of a Vandermonde matrix with scalars \(\scalarlist{x}{n-1}\text{.}\) We could establish this claim by induction (Proof Technique I) on \(\ell\) if we were to expand the claim to specify the exact values of the final \(n-1-\ell\) columns as well. Since the claim is that matrices with certain properties exist, we will instead establish the claim by constructing the desired matrices one-by-one procedurally. The extension to an inductive proof should be clear, but not especially illuminating.

Set \(D_1=D\) to begin, and note that the entries of the first column of \(D_1\) are, for \(1\leq i\leq n-1\text{,}\)

\begin{equation*} \matrixentry{D_1}{i1}=\sum_{k=0}^{1-1} x_{i}^{1-1-k}x_{n}^{k}=1=x_{i}^{1-1} \end{equation*}

So the first column of \(D_1\) has the properties we desire. We will use this column of all 1's to remove the highest power of \(x_n\) from each of the remaining columns and so build \(D_2\text{.}\) Precisely, perform the \(n-2\) column operations where column 1 is multiplied by \(x_n^{j-1}\) and subtracted from column \(j\text{,}\) for \(2\leq j\leq n-1\text{.}\) Call the result \(D_2\text{,}\) and examine its entries in columns \(2\) through \(n-1\text{.}\) For \(1\leq i\leq n-1\text{,}\) \(2\leq j\leq n-1\text{,}\)

\begin{align*} \matrixentry{D_2}{ij}&=-x_n^{j-1}\matrixentry{D_1}{i1}+\matrixentry{D_1}{ij}\\ &=-x_n^{j-1}(1)+\sum_{k=0}^{j-1} x_{i}^{j-1-k}x_{n}^{k}\\ &=-x_n^{j-1}+x_{i}^{j-1-(j-1)}x_{n}^{j-1}+\sum_{k=0}^{j-2} x_{i}^{j-1-k}x_{n}^{k}\\ &=\sum_{k=0}^{j-2} x_{i}^{j-1-k}x_{n}^{k} \end{align*}

In particular, we examine column 2 of \(D_2\text{.}\) For \(1\leq i\leq n-1\text{,}\)

\begin{equation*} \matrixentry{D_2}{i2}=\sum_{k=0}^{2-2} x_{i}^{2-1-k}x_{n}^{k}=x_{i}^{1}=x_{i}^{2-1} \end{equation*}

Now, form \(D_3\text{.}\) Perform the \(n-3\) column operations where column 2 of \(D_2\) is multiplied by \(x_n^{j-2}\) and subtracted from column \(j\text{,}\) for \(3\leq j\leq n-1\text{.}\) The result is \(D_3\text{,}\) whose entries we now compute. For \(1\leq i\leq n-1\text{,}\) \(3\leq j\leq n-1\text{,}\)

\begin{align*} \matrixentry{D_3}{ij}&=-x_n^{j-2}\matrixentry{D_2}{i2}+\matrixentry{D_2}{ij}\\ &=-x_n^{j-2}x_{i}^{1}+\sum_{k=0}^{j-2} x_{i}^{j-1-k}x_{n}^{k}\\ &=-x_n^{j-2}x_{i}^{1}+ x_{i}^{j-1-(j-2)}x_{n}^{j-2}+\sum_{k=0}^{j-3} x_{i}^{j-1-k}x_{n}^{k}\\ &=\sum_{k=0}^{j-3} x_{i}^{j-1-k}x_{n}^{k} \end{align*}

Specifically, we examine column 3 of \(D_3\text{.}\) For \(1\leq i\leq n-1\text{,}\)

\begin{align*} \matrixentry{D_3}{i3}&=\sum_{k=0}^{3-3} x_{i}^{3-1-k}x_{n}^{k}\\ &=x_{i}^{2}=x_{i}^{3-1} \end{align*}

We could continue this procedure \(n-4\) more times, eventually totaling \(\frac{1}{2}\left(n^2-3n+2\right)\) column operations, and arriving at \(D_{n-1}\text{,}\) the Vandermonde matrix of size \(n-1\) built from the scalars \(\scalarlist{x}{n-1}\text{.}\) Informally, we chop off the last term of every sum, until a single term is left in a column, and it is of the right form for the Vandermonde matrix. This desired column is then used in the next iteration to chop off some more final terms for columns to the right. Now we can apply our induction hypothesis to the determinant of \(D_{n-1}\) and arrive at an expression for \(\detname{A}\text{,}\)
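The column-by-column procedure can be carried out mechanically. The sketch below follows the steps described in the proof for a concrete list of scalars, counting the column operations along the way; the function names `d_matrix` and `reduce_to_vandermonde` are hypothetical, chosen only for this illustration.

```python
def vandermonde(xs):
    """Vandermonde matrix with rows 1, x, x^2, ... (illustrative helper)."""
    return [[x ** j for j in range(len(xs))] for x in xs]


def d_matrix(xs):
    """The matrix D: entry (i, j) is sum_{k=0}^{j-1} x_i^(j-1-k) * x_n^k."""
    n, x_n = len(xs), xs[-1]
    return [
        [sum(x ** (j - 1 - k) * x_n ** k for k in range(j)) for j in range(1, n)]
        for x in xs[:-1]
    ]


def reduce_to_vandermonde(D, x_n):
    """Apply the column operations of the proof; return (D_{n-1}, operation count)."""
    m = len(D)                        # m = n - 1
    D = [row[:] for row in D]         # work on a copy
    ops = 0
    for l in range(m - 1):            # 0-based pivot column (column l+1 in the text)
        for j in range(l + 1, m):     # every column to the right of the pivot
            scale = x_n ** (j - l)    # power of x_n that removes the last term
            for row in D:
                row[j] -= scale * row[l]
            ops += 1
    return D, ops


xs = [2, -3, 1, 4]
result, ops = reduce_to_vandermonde(d_matrix(xs), xs[-1])
print(result == vandermonde(xs[:-1]), ops)
```

For \(n=4\) the count is \(\frac{1}{2}\left(4^2-3\cdot 4+2\right)=3\) operations, and the result should be the Vandermonde matrix built from the first three scalars.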

\begin{align*} \detname{A}&=\detname{B}\\ &=\prod_{k=1}^{n-1}\left(x_n-x_k\right)\detname{D}\\ &=\prod_{k=1}^{n-1}\left(x_n-x_k\right)\detname{D_{n-1}}\\ &=\prod_{k=1}^{n-1}\left(x_n-x_k\right)\prod_{1\leq i\lt j\leq n-1}\left(x_j-x_i\right)\\ &=\prod_{1\leq i \lt j\leq n}\left(x_j-x_i\right) \end{align*}

which is the desired result.
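As a check on the formula, we can compare the product against a determinant computed independently. The following sketch uses exact Gaussian elimination over the rationals as the independent method (a choice made here for illustration, not the method of the proof); the names `det` and `vandermonde_det` are our own.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def vandermonde(xs):
    """Vandermonde matrix with rows 1, x, x^2, ... (illustrative helper)."""
    return [[x ** j for j in range(len(xs))] for x in xs]


def det(M):
    """Determinant by exact Gaussian elimination with row swaps."""
    M = [[Fraction(v) for v in row] for row in M]
    n = len(M)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result          # a row swap changes the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return result


def vandermonde_det(xs):
    """Product of (x_j - x_i) over all index pairs i < j."""
    return prod(xs[j] - xs[i] for i, j in combinations(range(len(xs)), 2))


xs = [2, -3, 1, 4]
print(det(vandermonde(xs)), vandermonde_det(xs))
```

For the scalars \(2\text{,}\) \(-3\text{,}\) \(1\text{,}\) \(4\) the product is \((-5)(-1)(2)(4)(7)(3)=840\text{,}\) and both computations should agree on this value.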

Before we had Theorem 5.1.3 we could see that if two of the scalar values were equal, then the Vandermonde matrix would have two equal rows and hence be singular (Theorem DERC, Theorem SMZD). But with this expression for the determinant, we can establish the converse.

Theorem 5.1.4. Nonsingular Vandermonde Matrix.

A Vandermonde matrix of size \(n\) with scalars \(\scalarlist{x}{n}\) is nonsingular if and only if the scalars are all different.

Proof.

Let \(A\) denote the Vandermonde matrix with scalars \(\scalarlist{x}{n}\text{.}\) By Theorem SMZD, \(A\) is nonsingular if and only if the determinant of \(A\) is nonzero. The determinant is given by Theorem 5.1.3, and this product is nonzero if and only if each factor of the product is nonzero. This condition translates to \(x_i-x_j\neq 0\) whenever \(i\neq j\text{.}\) In other words, the matrix is nonsingular if and only if the scalars are all different.
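In practice this means nonsingularity of a Vandermonde matrix can be decided by inspecting the scalars alone, without computing any determinant. A minimal sketch, with the hypothetical names `has_distinct_scalars` and `vandermonde_det` introduced only for this illustration, compares the two criteria on a few lists of scalars.

```python
from itertools import combinations
from math import prod


def vandermonde_det(xs):
    """Product of (x_j - x_i) over all index pairs i < j."""
    return prod(xs[j] - xs[i] for i, j in combinations(range(len(xs)), 2))


def has_distinct_scalars(xs):
    """True exactly when no scalar is repeated."""
    return len(set(xs)) == len(xs)


for xs in ([2, -3, 1, 4], [2, -3, 2, 4], [0, 1], [5, 5]):
    # The two columns printed after the list should always agree.
    print(xs, has_distinct_scalars(xs), vandermonde_det(xs) != 0)
```

The design choice is deliberate: the set-based test uses only equality comparisons, while the product form makes visible which pair of equal scalars forces the determinant to zero.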

A Second Course in Linear Algebra
