Reduction of a power series modulo the characteristic polynomial

Next: Sylvester's formula Up: Putzer's Method Previous: The Cayley-Hamilton theorem

Reduction of a power series modulo the characteristic polynomial

Of course, the matrix version of an analytic function could just be defined by its power series,

f(M)

$\displaystyle \sum_{i=0}^{\infty} a_i M^i,$

(10)

as long as the eigenvalues lay within its circle of convergence. The direct definition is unattractive unless the power series converges rapidly or perhaps can be summed symbolically. Usually the matrix would be diagonalized and functions of the eigenvalues calculated, a procedure presenting problems of its own.

Since a matrix satisfies its own characteristic equation,

Mⁿ

$\displaystyle \sum_{i=0}^{n-1} c_i M^i,$

(11)

according to the Cayley-Hamilton theorem, it might seem that a good way to reduce any matrix function to a polynomial would be to reduce the power series modulo the characteristic polynomial. Although some effort might be required to get the new coefficients, the task of calculating high powers of a matrix could thereby be avoided.

The reduction can actually be performed by using the companion matrix of M,

$\displaystyle \left[ \begin{array}{cccccc} {\bf0}& {\bf 1}& {\bf0}& \ldots & {\... ...}{c} I \\ M \\ M^2 \\ \ldots \\ M^{n-2} \\ M^{n-1} \\ \end{array} \right]$

$\displaystyle \left[ \begin{array}{c} M \\ M^2 \\ M^3 \\ \ldots \\ M^{n-1} \\ M^{n} \\ \end{array} \right]$

(12)

Call C the matrix in this equation, which is a Kronecker product of the companion matrix with the $n\times n$ identity matrix, and V the vector formed by the first few powers of M, which is also a Kronecker product. Since M is an eigenvalue of C with eigenvector V, it follows

f(M)V

f(C)V.

(13)

Consequently, it is only necessary to read off the first row of f(C)V to get f(M) as a linear combination of the low powers of M. However, this result still requires summing a whole power series, so it may offer no advantages as a numerical procedure. The essence of Putzer's method, which is dedicated to calculating matrix exponentials, lies in obtaining e^Ct as the solution of a differential equation, then deducing e^Mt. While a meritorious route to solving to this particular application, the unwary might not notice that the Jordan decompositions for M and C need not coincide, and yet the method still yields a true result.

The companion matrix C is not a canonical form, because its Jordan blocks have dimension equal to the multiplicity of each eigenvalue, whereas those of M may decompose into smaller blocks in any way which is consistent with the overall multiplicity. They keep the same symmetric functions and hence the same companion matrix, even though equivalence through a change of basis would want to write the companion matrix as a direct sum of smaller companion matrices. On the other hand, C maximizes any other arrangement by containing all the other Jordan subspaces, which is why the numerical methods work; nilpotents in the full Sylvester's formula will all have expired by the time their presence could have caused any interference.

In subsequent articles some commentaries on the method were published, but the general context of canonical forms seems to have been overlooked in the pursuit of minor improvements and extensions. Since the coefficients of the companion matrix are symmetric functions of the eigenvalues of M, it might seem that the eigenvalues could be put to better use by just evaluating f rather than starting up a whole new procedure. However, all that can be avoided by observing that the symmetric functions can be obtained directly by using the traces of low powers.

Since those matrices are going to be used anyway, taking their traces represents little additional labor. There is a series of identities relating power sums to symmetric functions due to Newton,

$\begin{displaymath}\left. \begin{array}{rcl} c_{n-1} & = & - {\rm Trace}\ (M),... ...)\ {\rm Trace}\ (M^2) \\ & \cdots & , \end{array} \right\} \end{displaymath}$

(14)

which could be used. The whole procedure has been consolidated into a single recursion by Faddeev (see [31] vol. 1, ch. IV, $\S$ 5; [22]), by defining the recursive chain

$\begin{displaymath}\left. \begin{array}{rclrclrcl} N_1 & = & M & c_{n-1} & = &... ...rm Trace}\ (N_{n}) & A_0& = &N_n-c_0I \end{array} \right\} \end{displaymath}$

(15)

In other words, the symmetric functions of the roots, which are the coefficients in the characteristic polynomial, can be read off right away as traces of the auxiliary matrices N_i.

Next: Sylvester's formula Up: Putzer's Method Previous: The Cayley-Hamilton theorem

root
2000-03-17