
Band matrices

Band matrices arise from a variety of sources: for example, from discretizing a linear ordinary differential equation, from the classical equations of motion of a string of masses coupled by Hooke's law springs, from the study of the electrical characteristics of a lumped transmission line, from the quantum mechanical study of pi electrons in linear organic molecules, and from quite a goodly assortment of other places. For numerical analyses, straightforward procedures will reduce a matrix to band form, following which one of a variety of iterative procedures can be set in motion.

Diagonalizing a band matrix is particularly interesting for theoretical purposes because the procedure can be transformed into solving a recursion relation using supplementary matrices no larger than the width of the band, no matter how large the starting matrix. To see how it can all be done, imagine a tridiagonal matrix $A$ (band width three, centered) of the form

\begin{displaymath}A = \left[ \begin{array}{lllllll}
a_{11} & a_{12} & .      & .      & .      & . & . \\
a_{21} & a_{22} & a_{23} & .      & .      & . & . \\
.      & a_{32} & a_{33} & a_{34} & .      & . & . \\
.      & .      & a_{43} & a_{44} & a_{45} & . & . \\
.      & .      & .      & .      & .      & . & .
\end{array} \right], \end{displaymath}

with its eigenvalue equation $A X = \lambda X$. For any one component the equation would read

\begin{eqnarray*}
a_{i,i-1} x_{i-1} + a_{i,i} x_i + a_{i,i+1} x_{i+1} & = & \lambda x_i,
\end{eqnarray*}



which could be rearranged to the form

\begin{eqnarray*}
a_{i,i+1} x_{i+1} & = & (\lambda - a_{i,i}) x_i - a_{i,i-1} x_{i-1} ,
\end{eqnarray*}



and recognized as the first half of a $2 \times 2$ matrix equation, whose second half consists of the tautology $x_i = x_i$:

\begin{eqnarray*}
\left[ \begin{array}{l}
x_{i+1} \\ x_i \end{array} \right] & = &
\left[ \begin{array}{ll}
\frac{\lambda - a_{i,i}}{a_{i,i+1}} & - \frac{a_{i,i-1}}{a_{i,i+1}} \\
1 & 0
\end{array} \right]
\left[ \begin{array}{l}
x_i \\ x_{i-1} \end{array} \right].
\end{eqnarray*}



For ease of reference, suppose that this equation says $Y_{i+1} = M Y_i$. Any high-indexed component can then be derived from one with a lower index by successive substitution, which translates into calculating a product of $2 \times 2$ matrices. There is always the question of how to start and stop, which in turn depends on some alternative structures for the matrix $A$. The extreme skew-diagonal corners could be non-zero without violating the bandwidth hypothesis if the system described by the matrix were cyclical, in which case it would be supposed that some power of $M$ were the unit matrix.
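
Written out, and writing $M_i$ for the matrix obtained from row $i$ of $A$ (a notation taken up again below), the successive substitution is just a matrix product,

\begin{eqnarray*}
Y_{j} & = & M_{j-1} M_{j-2} \cdots M_{i} \, Y_{i}, \qquad j > i .
\end{eqnarray*}

In the cyclical case with identical rows, running once around a ring of, say, $n$ particles must return every pair of components to itself, which is the sense in which $M^n$ would be expected to be the unit matrix.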

Otherwise fictitious components $x_0$ and $x_{n+1}$ could be introduced, obligated to vanish, and the overall symmetry of $A$ preserved by inserting arbitrary corner elements, for example ones which would preserve the integrity of the diagonal in which they sit.
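
As an illustration of starting and stopping, take $A$ to be $n \times n$ (the size $n$ is introduced here only for the sake of the example); the fictitious components make the terminal vectors

\begin{eqnarray*}
Y_1 & = & \left[ \begin{array}{c} x_1 \\ x_0 \end{array} \right]
    \; = \; \left[ \begin{array}{c} x_1 \\ 0 \end{array} \right], \\
Y_{n+1} & = & \left[ \begin{array}{c} x_{n+1} \\ x_n \end{array} \right]
    \; = \; \left[ \begin{array}{c} 0 \\ x_n \end{array} \right],
\end{eqnarray*}

so the chain of substitutions starts from the single unknown $x_1$ and must end by satisfying a single condition at the far end.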

Several conclusions can be drawn at once. The matrix $M$ ought to be defined and invertible. Its definition supposes a nonzero $a_{i,i+1}$, so the right-hand edge of the band must be intact. For it to be invertible, its determinant $a_{i,i-1}/a_{i,i+1}$ ought not vanish, placing a similar requirement on the left-hand edge. Once those two conditions are realized, all possible consecutive pairs of components can be deduced from any single one of them, for example, a terminal pair.
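
For completeness, the determinant quoted here can be read off from the explicit form of $M$ displayed earlier:

\begin{eqnarray*}
\det M & = & \left\vert \begin{array}{cc}
\frac{\lambda - a_{i,i}}{a_{i,i+1}} & - \frac{a_{i,i-1}}{a_{i,i+1}} \\
1 & 0
\end{array} \right\vert \; = \; \frac{a_{i,i-1}}{a_{i,i+1}} .
\end{eqnarray*}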

The next observation is that whenever two consecutive components are zero, all components vanish, contrary to the assumption of a non-zero eigenvector. In applications to vibrating strings, places where the displacement vanishes at all times are called nodes, so the implication is that a node contains no more than a single particle.
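
The argument is short: two consecutive zero components annihilate one of the two-component vectors, and invertibility propagates the zero in both directions,

\begin{eqnarray*}
x_{i-1} \; = \; x_i \; = \; 0 & \Longrightarrow &
Y_i \; = \; 0 \;\; \Longrightarrow \;\; Y_j \; = \; 0 \quad \mbox{for every } j,
\end{eqnarray*}

since each $Y_j$ is reached from $Y_i$ by multiplying by the matrices $M_k$ or their inverses.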

A different matrix $M_i$ would arise from each row of $A$, not excluding that some of them might be numerically equal if $A$ had some similar rows. In fact, that is one of the most interesting cases, since it would imply a physical problem derived from connecting a series of similar units. The eigenvalue of $A$ is a parameter in these matrices, so that their elements become polynomials in the eigenvalue once a series of $M$'s has been multiplied together. To give $M$ or its products a name, they are often called transfer matrices, because they relate components in one place to those in another. Nor should it be surprising to find the characteristic polynomial of $A$ amongst the matrix elements of an all-encompassing transfer matrix running from one end of the string to the other. Depending on the context from which $A$ arose, its eigenvalues (the roots of the characteristic equation) would be frequencies (squares of frequencies, actually), decay rates, energy levels, or whatever.
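
To make the last remark concrete (a sketch using the boundary conventions introduced above, with $P(\lambda)$ a symbol coined here for the overall product): let $P(\lambda) = M_n M_{n-1} \cdots M_1$ be the transfer matrix running the full length of the chain. The requirements $x_0 = x_{n+1} = 0$ then read

\begin{eqnarray*}
\left[ \begin{array}{c} x_{n+1} \\ x_n \end{array} \right] & = &
P(\lambda) \left[ \begin{array}{c} x_1 \\ 0 \end{array} \right],
\qquad x_{n+1} \; = \; P_{11}(\lambda) \, x_1 \; = \; 0 ,
\end{eqnarray*}

and since $x_1$ cannot vanish, the eigenvalue condition is $P_{11}(\lambda) = 0$, a polynomial of degree $n$ which agrees with the characteristic polynomial of $A$ up to a nonzero constant factor.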

Ease in calculating transfer matrices would reduce the labor of calculating components of eigenvectors, especially the part relating the extreme components, which reveals the characteristic polynomial. Calculation with matrices often reduces to diagonalizing them to get their eigenvector basis, creating an entirely new eigenvalue problem for the matrix $M$ instead of the matrix $A$, namely finding $Y$ for which $M Y = \mu Y$. Once that is done, we end up writing the components of $X$ as linear combinations of the eigenvectors of $M$. Because of the likelihood that $A$ has something to do with vibrations, eigenvectors of $M$ are called waves, its eigenvalues wave numbers, and its characteristic polynomial a dispersion relation. That is because it governs how fast a wave of a given frequency travels, and waves of different frequencies usually do not all travel at the same speed.
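
In the simplest situation all the rows are alike, so a single $M$ serves throughout; expanding the starting vector in the eigenvectors of $M$ and applying $M$ repeatedly then exhibits the components directly as a superposition of two waves. As a sketch, assuming the two wave numbers $\mu_+$ and $\mu_-$ (a notation introduced below) are distinct,

\begin{eqnarray*}
x_i & = & \alpha \, \mu_+^i + \beta \, \mu_-^i ,
\end{eqnarray*}

with the constants $\alpha$ and $\beta$ fixed by the boundary or starting conditions.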

Although it is theoretically possible to bring any symmetric matrix to tridiagonal form (for general matrices the analogue is Hessenberg form, which only has to be banded on one side of the diagonal), there are advantages in not doing so if there is a polydiagonal form with appreciable regularity. That would happen from the beginning if the matrix resulted from discretizing a differential equation of order higher than second, or perhaps from studying the vibration of a chain of coupled masses with long range coupling.
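
For orientation, and only as a sketch: for a centered band of half-width $k$ (that is, $2k+1$ nonzero diagonals, with $k$ introduced just for this remark), the row equation

\begin{eqnarray*}
a_{i,i+k} \, x_{i+k} & = & \lambda x_i - \sum_{j=i-k}^{i+k-1} a_{i,j} \, x_j
\end{eqnarray*}

relates $x_{i+k}$ to the $2k$ components preceding it, so the vectors $Y_i$ collect $2k$ consecutive components and the transfer matrices are $2k \times 2k$.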

The principal differences between tridiagonal and polydiagonal matrices lie in the maximum width of a node and in the complexity of the dispersion relation, which becomes a polynomial of ever increasing degree. And of course the number of waves, the basis vectors for the transfer matrix, grows with the width of the band. The components of the eigenvectors of a tridiagonal $A$ form a Sturm sequence, but wider bands begin to allow a certain number of coincident roots, related to the thicker nodes, which complicate discussions of the separation of frequencies and bespeak a possibility for degeneracy which is absent from tridiagonal matrices.

Since tridiagonal matrices already show most of the characteristics of polydiagonal matrices, but with $2 \times 2$ matrices $M_i$, studying their properties is a good place to begin. To avoid working with fractions, the previous matrix elements can be assigned some letters, to obtain

\begin{eqnarray*}
\left[ \begin{array}{c}
x_{i+1} \\ x_i
\end{array} \right] & = &
\left[ \begin{array}{cc}
a \lambda - b & - c \\
1 & 0
\end{array} \right]
\left[ \begin{array}{c}
x_i \\ x_{i-1}
\end{array} \right]
\end{eqnarray*}



Observing that

\begin{eqnarray*}
\left[ \begin{array}{cc}
1 & 0 \\
1 & -1
\end{array} \right]
\left[ \begin{array}{c}
x_{i+1} \\ x_i
\end{array} \right] & = &
\left[ \begin{array}{c}
x_{i+1} \\ x_{i+1} - x_i
\end{array} \right]
\end{eqnarray*}



it is easy enough to transform the equation into

\begin{eqnarray*}
\left[ \begin{array}{c}
x_{i+1} \\ x_{i+1} - x_i
\end{array} \right] & = &
\left[ \begin{array}{cc}
1 & 0 \\
1 & -1
\end{array} \right]
\left[ \begin{array}{cc}
a \lambda - b & - c \\
1 & 0
\end{array} \right]
\left[ \begin{array}{cc}
1 & 0 \\
1 & -1
\end{array} \right]
\left[ \begin{array}{c}
x_i \\ x_i - x_{i-1}
\end{array} \right].
\end{eqnarray*}



The point of this exercise is to use a vector consisting of displacements and an approximation to their derivatives, and to show that the sequence of such vectors is generated the same way as before, albeit with a different matrix, whose characteristic equation has to be the same because of the similarity transformation.
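
Multiplying the three matrices together (and recalling that the transforming matrix is its own inverse) gives the new matrix explicitly; its trace $a\lambda - b$ and determinant $c$ are unchanged, so the characteristic equation is indeed the same:

\begin{eqnarray*}
\left[ \begin{array}{cc}
1 & 0 \\
1 & -1
\end{array} \right]
\left[ \begin{array}{cc}
a \lambda - b & - c \\
1 & 0
\end{array} \right]
\left[ \begin{array}{cc}
1 & 0 \\
1 & -1
\end{array} \right]
& = &
\left[ \begin{array}{cc}
a \lambda - b - c & c \\
a \lambda - b - c - 1 & c
\end{array} \right] .
\end{eqnarray*}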

The original form might as well be retained, along with the characteristic equation for the wave number,

\begin{eqnarray*}
\left\vert \begin{array}{cc}
a \lambda - b - \mu & - c \\
1 & - \mu
\end{array} \right\vert & = & 0,
\end{eqnarray*}



or

\begin{eqnarray*}
\mu^2 - \mu ( a \lambda - b ) + c & = & 0.
\end{eqnarray*}
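
Solving the quadratic gives the two wave numbers explicitly; writing them $\mu_+$ and $\mu_-$, as will be done below, they obey the usual sum and product relations:

\begin{eqnarray*}
\mu_\pm & = & \frac{(a \lambda - b) \pm \sqrt{(a \lambda - b)^2 - 4 c}}{2} ,
\qquad
\mu_+ + \mu_- \; = \; a \lambda - b ,
\qquad
\mu_+ \mu_- \; = \; c .
\end{eqnarray*}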



There is an advantage in dividing the characteristic equation by the nonzero quantity $\mu \sqrt{c}$ ($M$ must be invertible, so no root is zero, and $c$ cannot vanish) and writing

\begin{eqnarray*}
\frac{\mu}{\sqrt{c}} + \frac{\sqrt{c}}{\mu} & = & \frac{a \lambda - b}{\sqrt{c}} .
\end{eqnarray*}



The advantage lies in seeing the dispersion relation immediately, but also in presenting the left-hand side as $2\cosh(t)$, permissible if $\mu/\sqrt{c} = \exp(t)$. Should this series of substitutions seem somewhat contrived, bear in mind that we really want to raise $M$ to powers, and that multiplying logarithms is a good way to do that. Not to be overlooked is that $c$ would be 1 if the two bands of $A$ were equal, and that balancing powers of $\mu$ against their reciprocals to get hyperbolic cosines eventually depends on the form of the bands in a polydiagonal $A$.
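
As a quick check of the remark about powers (a sketch using the substitution just made, with $n$ an arbitrary exponent): since the two roots can be written $\mu_\pm = \sqrt{c}\, e^{\pm t}$, the sum of their $n$-th powers, which is the trace of $M^n$, is

\begin{eqnarray*}
\mu_+^n + \mu_-^n & = & c^{n/2} \left( e^{n t} + e^{-n t} \right)
\; = \; 2\, c^{n/2} \cosh( n t ) ,
\end{eqnarray*}

so raising $M$ to a power merely multiplies $t$ by $n$.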

Calling the two roots of the characteristic equation $\mu_+$ and $\mu_-$, the (unnormalized) matrices of eigenvector rows and eigenvector columns are

\begin{eqnarray*}
U^{-1} & = & \left[ \begin{array}{cc}
\mu_+ & - c \\ \mu_- & - c
\end{array} \right] ,
\qquad
U \; = \; \left[ \begin{array}{cc}
\mu_+ & \mu_- \\ 1 & 1
\end{array} \right] .
\end{eqnarray*}



With eigenrows and eigencolumns, the spectral decomposition is immediate:

\begin{eqnarray*}
M & = & \frac{\mu_+}{<+\vert+>} \vert+><+\vert + \frac{\mu_-}{<-\vert->} \vert-><-\vert .
\end{eqnarray*}
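
Because the two projectors in this decomposition are idempotent and annihilate each other, powers of $M$ follow at once, which is where the transfer-matrix point of view pays off (assuming, as above, distinct wave numbers; $n$ is an arbitrary exponent):

\begin{eqnarray*}
M^n & = & \frac{\mu_+^n}{<+\vert+>} \vert+><+\vert + \frac{\mu_-^n}{<-\vert->} \vert-><-\vert .
\end{eqnarray*}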



