The fact that the largest eigenvalue of a positive matrix is isolated means that powers of the matrix should converge to a multiple of the idempotent matrix formed from the corresponding eigenvectors. By normalizing the eigenvalue and one of the eigenvectors, as has already been done for stochastic matrices, the limiting form of high powers is simplified even further. The rate of convergence depends on the ratio between the largest and second largest eigenvalues, or directly on the second when the matrix is stochastic and its largest eigenvalue is already 1. Conversely, one way of estimating the second eigenvalue is to study the behaviour of matrix powers, preferably after having transformed the matrix to stochastic form.
This can be done by commencing with a positive vector $X_0$ and
defining recursively
\[ X_{n+1} = M X_n . \]
Then it will be found that the largest eigenvalue $\lambda$ satisfies
\[ \min_i \frac{(X_{n+1})_i}{(X_n)_i} \;\leq\; \lambda \;\leq\; \max_i \frac{(X_{n+1})_i}{(X_n)_i} , \]
which confines the eigenvalue to a succession of ever smaller intervals. The outermost, widest interval follows from the minimax characterization of the largest eigenvalue. So does each of the others, by specializing the choice of the column vector in the same characterization; but the important part of this result is that the intervals are nested.
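To make the recursion concrete, here is a minimal numerical sketch (in Python with NumPy; the particular matrix and starting vector are arbitrary illustrative choices, not taken from the text) which prints the successive bounding intervals and shows them nesting around the largest eigenvalue.

\begin{verbatim}
import numpy as np

# Any positive matrix will do; illustrative data only.
M = np.array([[4.0, 1.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 2.0, 5.0]])

X = np.ones(3)                 # a positive starting vector
for n in range(8):
    MX = M @ X
    ratios = MX / X            # componentwise (X_{n+1})_i / (X_n)_i
    print(f"step {n}: {ratios.min():.6f} <= lambda <= {ratios.max():.6f}")
    X = MX

# Compare with the Perron root computed directly.
print("exact :", max(np.linalg.eigvals(M).real))
\end{verbatim}

Each printed interval contains the next, and both endpoints converge to the Perron root.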
Nesting is a consequence of multiplying successive vectors by $M$, but
it is sufficient to see how multiplication by $M^2$ relates to
multiplication by $M$. Moreover it will be more understandable to write
the definition of $a(X)$, the lower endpoint, in terms of
\[ M X \;\geq\; a(X)\, X , \]
the vector inequality being understood as holding for each component,
and $a(X)$ being understood as the greatest constant for
which the inequality holds. Then we would have
\[ a(X) \;\leq\; a(MX) \;\leq\; \lambda ; \]
the first inequality holds since $a(X)$ itself serves as a constant for a general vector
such as $MX$. Given that $MX \geq a(X)\,X$, we conclude that
$M(MX) \geq a(X)\,MX$, because a positive matrix preserves componentwise
inequalities; since $a(X)$ is a factor which applies to this final
inequality, the greatest such factor $a(MX)$ can only be larger. The same
argument applied to the least constant $b(X)$ for which $MX \leq b(X)\,X$
shows that the upper endpoints can only decrease, so the intervals are
indeed nested.
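The monotonicity of $a$ is easy to check numerically. In the following sketch (Python with NumPy; the positive matrix and vector are randomly generated, purely illustrative) each multiplication by $M$ is seen never to decrease $a$:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
M = rng.uniform(0.1, 1.0, size=(4, 4))   # a strictly positive matrix
X = rng.uniform(0.5, 2.0, size=4)        # a positive vector

def a(M, X):
    # greatest constant a with M X >= a X componentwise
    return np.min(M @ X / X)

for n in range(6):
    print(f"a(X_{n}) = {a(M, X):.6f}")
    X = M @ X   # each multiplication can only increase a
\end{verbatim}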
There is a corresponding convergence of vectors which is interesting,
especially since the arguments apply equally well to any product of
stochastic matrices and not just to powers of a single matrix. For
simplicity suppose that the family of matrices is column stochastic,
which means that $U$, the row of 1's, is a uniform left eigenvector for
the family. Then the dominant part of such a matrix would be the vector
product of some column $X$ with the row $U$. The action of the row $U$ in
multiplying any column would be to ``average'' the
components of the column, just as its action on a matrix would be to
form a new row of column averages. Strictly, an average would also
include division by the number of terms summed.
One of the outstanding properties of averaging is its tendency to reduce variance. In the present context this means that the product of a large number of positive stochastic matrices should tend to a matrix with uniform rows, independently of whether the same matrix or different matrices are being multiplied. In the particular case of powers of a single matrix, including one additional factor of $M$ would not change the limit, which would therefore have to be a matrix of eigencolumns.
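A small experiment illustrates the claim; the sketch below (illustrative Python, with randomly generated strictly positive column-stochastic matrices) multiplies thirty different matrices together and exhibits the nearly uniform rows of the product:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)

def random_column_stochastic(n):
    A = rng.uniform(0.1, 1.0, size=(n, n))   # strictly positive entries
    return A / A.sum(axis=0)                  # normalize each column

P = np.eye(3)
for _ in range(30):
    P = random_column_stochastic(3) @ P       # a product of many matrices

# Each row should now be (nearly) constant: P is close to X U,
# where U is the row of 1's and X is some positive column.
print(P)
print("row spreads:", P.max(axis=1) - P.min(axis=1))
\end{verbatim}

The row spreads come out many orders of magnitude smaller than the entries themselves, as the averaging argument predicts.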
In verifying the details of the averaging process it is found that certain estimates are required for the size of the matrix elements. In particular, permitting very small or zero elements would be nice for certain applications, but would either complicate or invalidate the proofs. Here we only present one of the preliminary theorems, which describes the reduction in range of the elements of a vector with real components when multiplied by a stochastic matrix.
Suppose that $M$ is the matrix, that $X$ is a real vector whose
algebraically largest and smallest components are $A$ and $a$
respectively, and that the corresponding extreme elements of $MX$ are
$B$ and $b$, respectively. We also require the smallest element of $M$,
which we might call $\varepsilon$. For the purpose of argumentation we
introduce another vector $Y$ constructed from $X$ by replacing all of
its elements save one of the smallest by $A$. Starting from the evident
inequality
\[ X \;\leq\; Y , \]
we obtain
\[ MX \;\leq\; MY , \]
and a requirement to estimate $MY$. Each component will have the form
\[ \mu\, a + (1 - \mu)\, A , \]
where $\mu$ is the matrix element multiplying the one small component
and the remaining elements of the row, summing to $1-\mu$, multiply $A$.
This can be rewritten
\[ A - \mu\,(A - a) . \]
The value of $\mu$ will vary from component to component, but it is
always at least $\varepsilon$, so
\[ B \;\leq\; A - \varepsilon\,(A - a) . \]
The easiest way to get an estimate for $b$ is to repeat the same argument for $-X$, which reverses all comparisons. Combining the result,
\[ b \;\geq\; a + \varepsilon\,(A - a) , \]
with the bound for $B$, we find
\[ B - b \;\leq\; (1 - 2\varepsilon)\,(A - a) . \]
By its nature, $\varepsilon$ satisfies $0 < \varepsilon \leq \frac{1}{2}$ (a row of $n$ elements summing to 1 cannot have its smallest element exceed $1/n$), so $0 \leq 1 - 2\varepsilon < 1$ and the range of variation of
the components of $MX$ is necessarily less than the corresponding range
for $X$.
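The bound can be confirmed numerically. In this sketch (illustrative Python; the matrix is normalized here so that each row sums to 1, making every component of $MX$ a weighted average of the components of $X$, an assumption of the sketch chosen to match the averaging argument), the actual range of $MX$ is compared against $(1-2\varepsilon)(A-a)$:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)

W = rng.uniform(0.2, 1.0, size=(4, 4))
M = W / W.sum(axis=1, keepdims=True)   # rows sum to 1
eps = M.min()                          # smallest matrix element

X = rng.uniform(-1.0, 1.0, size=4)     # a real vector
A, a = X.max(), X.min()
MX = M @ X
B, b = MX.max(), MX.min()

print("range of X :", A - a)
print("range of MX:", B - b)
print("bound      :", (1 - 2 * eps) * (A - a))
assert B - b <= (1 - 2 * eps) * (A - a) + 1e-12
\end{verbatim}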
If we could rely on a fixed lower limit to $\varepsilon$ for an entire
family of matrices, it is clear that $X$ would gradually be reduced to
a multiple of $U$, that is, to a vector with uniform components, if it
were multiplied by ever larger numbers of matrices of the family.
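Under such a uniform lower bound, which we might call $\varepsilon_0$, the range contracts at least geometrically, by a factor $(1-2\varepsilon_0)$ at each multiplication. The concluding sketch (illustrative Python; the construction keeping all matrix elements above $\varepsilon_0$ is one arbitrary way of meeting the hypothesis) tracks the decay alongside that bound:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
eps0 = 0.05   # assumed uniform lower bound on the matrix elements

def random_row_stochastic(n):
    W = rng.uniform(eps0 * n, 1.0, size=(n, n))
    # After row normalization every element is at least eps0,
    # since each row sum is at most n.
    return W / W.sum(axis=1, keepdims=True)

X = rng.uniform(-1.0, 1.0, size=5)
spread = X.max() - X.min()
for n in range(1, 11):
    X = random_row_stochastic(5) @ X
    print(f"after {n:2d} matrices: range = {X.max() - X.min():.3e} "
          f"(bound {(1 - 2 * eps0) ** n * spread:.3e})")
\end{verbatim}

The observed range typically shrinks much faster than the worst-case bound, but the bound is what guarantees the convergence to a uniform vector.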