The fact that the largest eigenvalue of a positive matrix is isolated means that powers of the matrix should converge to a multiple of the idempotent matrix formed from the corresponding eigenvectors. By normalizing the eigenvalue and one of the eigenvectors, as has already been done for stochastic matrices, the limiting form of high powers is simplified still further. The rate of convergence depends on the ratio between the second largest and the largest eigenvalue, or directly on the second if the matrix is stochastic, since the largest is then 1. Conversely, one way of estimating the second eigenvalue is to study the behaviour of matrix powers, preferably after having transformed the matrix to stochastic form.
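To make the rate explicit, suppose for simplicity that $M$ is diagonalizable, with spectral idempotents $E_i$ attached to the eigenvalues $\lambda_1 > |\lambda_2| \geq \cdots$ (this notation, and the assumption of diagonalizability, are conveniences introduced here, not part of the original text):
$$M^n = \lambda_1^n E_1 + \lambda_2^n E_2 + \cdots, \qquad \lambda_1^{-n} M^n = E_1 + O\!\bigl(|\lambda_2/\lambda_1|^n\bigr),$$
so for a stochastic matrix, where $\lambda_1 = 1$, the deviation of $M^n$ from the limiting idempotent decays like $|\lambda_2|^n$.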
This can be done by commencing with a positive vector $X_0$ and defining recursively
$$X_{n+1} = M X_n, \qquad n = 0, 1, 2, \ldots$$
Then it will be found that the largest eigenvalue $\lambda$ satisfies
$$\min_i \frac{(X_{n+1})_i}{(X_n)_i} \;\leq\; \lambda \;\leq\; \max_i \frac{(X_{n+1})_i}{(X_n)_i},$$
which confines the eigenvalue to a succession of ever smaller intervals. The outermost, widest interval follows from the minimax characterization of the largest eigenvalue. So does each of the others, by specializing the choice of the column vector in the same characterization; but the important part of this result is that the intervals are nested.
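A minimal numerical sketch of these nested bounds (the matrix and starting vector are invented for this illustration, and NumPy is used for the arithmetic):
\begin{verbatim}
import numpy as np

# An illustrative positive column-stochastic matrix (largest
# eigenvalue 1) and a positive starting vector.
M = np.array([[0.8, 0.3],
              [0.2, 0.7]])
X = np.ones(2)

# Each multiplication by M yields a new interval
#   [min_i (MX)_i / X_i ,  max_i (MX)_i / X_i]
# containing the largest eigenvalue; successive intervals nest.
for n in range(6):
    MX = M @ X
    ratios = MX / X
    print(f"step {n}: {ratios.min():.6f} <= lambda <= {ratios.max():.6f}")
    X = MX
\end{verbatim}
The printed intervals shrink toward $\lambda = 1$, the second eigenvalue of this matrix being $0.5$.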
Nesting is a consequence of multiplying successive vectors by $M$, but it is sufficient to see how multiplication by $M^2$ relates to multiplication by $M$. Moreover it will be more understandable to write the definition of the lower endpoint $\lambda_X = \min_i (MX)_i/(X)_i$ in terms of the inequality
$$\lambda_X X \leq MX,$$
the vector inequality being understood as holding for each component, and $\lambda_X$ being understood as the greatest constant for which the inequality holds. Then we would have
$$\lambda_X\,(MX) = M(\lambda_X X) \leq M(MX) = M^2 X;$$
the inequality holds since multiplication by the positive matrix $M$ preserves each componentwise comparison in the defining relation. Given that $\lambda_X\,(MX) \leq M(MX)$, we conclude that $\lambda_X \leq \lambda_{MX}$, since $\lambda_{MX}$ is the greatest factor which applies to this final inequality, $\lambda_X$ being merely one constant which serves for a general vector such as $MX$. An entirely parallel argument with $\Lambda_X$, the least constant for which $MX \leq \Lambda_X X$, shows that the upper endpoints decrease, so the intervals are indeed nested.
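As a concrete illustration (the numbers are invented for this purpose, the same matrix as in the sketch above), take
$$M = \begin{pmatrix} 0.8 & 0.3\\ 0.2 & 0.7 \end{pmatrix}, \quad X = \begin{pmatrix} 1\\ 1 \end{pmatrix}, \quad MX = \begin{pmatrix} 1.1\\ 0.9 \end{pmatrix}, \quad M^2X = \begin{pmatrix} 1.15\\ 0.85 \end{pmatrix};$$
then $\lambda_X = 0.9$ grows to $\lambda_{MX} = 0.85/0.9 \approx 0.944$, while $\Lambda_X = 1.1$ shrinks to $\Lambda_{MX} = 1.15/1.1 \approx 1.045$, the nested intervals closing in on the eigenvalue $1$.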
There is a corresponding convergence of vectors which is interesting, especially since the arguments apply equally well to any product of stochastic matrices and not just to powers of a single matrix. For simplicity suppose that the family of matrices is column stochastic, which means that $U$, the row of 1's, is a uniform left eigenvector for the family. Then the dominant part of such a matrix would be the vector product of some column $X$ with the row $U$. The action of $XU$ in multiplying any column would be to ``average'' the components of the column, just as the action of $U$ on a matrix would be to form a new row of column averages. Strictly, an average would also include division by the number of terms summed.
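In symbols, writing out the claim just made, for any column $Y$
$$(XU)\,Y \;=\; X\,(UY) \;=\; \Bigl(\sum_j y_j\Bigr) X,$$
so every column is carried onto a multiple of $X$, the multiple being the column sum, an average lacking only the division by the number of terms.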
One of the outstanding properties of averaging is its tendency to reduce variance. In the present context this means that the product of a large number of positive stochastic matrices would tend to a matrix with uniform rows, independently of whether the same matrix or different matrices were being multiplied. In the particular case of powers of a single matrix, including one additional factor of $M$ would not change the limit, which would therefore have to be a matrix of eigencolumns.
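A small sketch of this tendency (the helper \verb|random_column_stochastic| and all numerical choices are invented for this illustration):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def random_column_stochastic(n):
    # Positive entries, each column normalized to sum to 1.
    A = rng.uniform(0.1, 1.0, size=(n, n))
    return A / A.sum(axis=0)

# Multiply many *different* stochastic matrices together.
P = np.eye(3)
for _ in range(50):
    P = random_column_stochastic(3) @ P

print(P)
# Each row of the product is nearly constant: P is close to the
# outer product X U of some column X with the row U of 1's.
print("row ranges:", P.max(axis=1) - P.min(axis=1))
\end{verbatim}
The row ranges collapse geometrically even though no single matrix is ever repeated, which is the point of the remark above.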
In verifying the details of the averaging process it is found that certain estimates are required for the size of the matrix elements. In particular, permitting very small or zero elements would be convenient for certain applications, but would either complicate or invalidate the proofs. Here we present only one of the preliminary theorems, which describes the reduction in range of the elements of a vector with real components when it is multiplied by a stochastic matrix.
Suppose that $M$ is the matrix, taken to be stochastic in the sense that each component of $MX$ is a weighted sum of the components of $X$ with positive weights summing to 1; that $X$ is a real vector whose algebraically largest and smallest components are $A$ and $a$ respectively; and that the corresponding extreme elements of $MX$ are $B$ and $b$, respectively. We also require the smallest element of $M$, which we might call $\varepsilon$. For the purpose of argumentation we introduce another vector $Y$, constructed from $X$ by replacing all of its elements save one of the smallest by $A$. Starting from the evident inequality
$$X \leq Y,$$
we obtain
$$MX \leq MY, \qquad B \leq \max_i\,(MY)_i,$$
and a requirement to estimate $MY$. Each component of $MY$ will have the form $\mu a + (1 - \mu)A$, where $\mu$ is the weight attached to the single small component retained in $Y$; this can be rewritten as $A - \mu(A - a)$. The value of $\mu$ will vary from component to component, but it is never less than $\varepsilon$, so
$$B \;\leq\; A - \varepsilon\,(A - a).$$
The easiest way to get an estimate for $b$ is to repeat the same argument for $-X$, which reverses all the comparisons. Combining the result,
$$b \;\geq\; a + \varepsilon\,(A - a),$$
with the bound for $B$, we find
$$B - b \;\leq\; (1 - 2\varepsilon)\,(A - a).$$
By its nature $\varepsilon$ satisfies $0 < \varepsilon \leq 1/2$, being the smallest of $n \geq 2$ positive weights summing to 1, so $0 \leq 1 - 2\varepsilon < 1$, and the range of variation of the components of $MX$ is necessarily less than the corresponding range for $X$.
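A quick numerical check of this contraction bound (matrix and vector are arbitrary illustrative choices; the rows are normalized to sum to 1 to match the weighted-average reading of the theorem):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)

# A positive matrix whose rows sum to 1, so each component of M @ X
# is a weighted average of the components of X.
M = rng.uniform(0.1, 1.0, size=(4, 4))
M = M / M.sum(axis=1, keepdims=True)
eps = M.min()

X = rng.uniform(-5.0, 5.0, size=4)
MX = M @ X

range_X = X.max() - X.min()      # A - a
range_MX = MX.max() - MX.min()   # B - b
print(range_MX, "<=", (1 - 2 * eps) * range_X)
assert range_MX <= (1 - 2 * eps) * range_X + 1e-12
\end{verbatim}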
If we could rely on a fixed lower limit to $\varepsilon$ for an entire family of matrices, it is clear that $X$ would gradually be reduced to a multiple of the uniform vector $U$ if it were multiplied by increasingly larger numbers of matrices of the family.
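Quantitatively, iterating the theorem just proved, and writing $\varepsilon_0$ for the assumed family-wide lower bound (a symbol introduced here for convenience),
$$\operatorname{range}\bigl(M_k \cdots M_2 M_1 X\bigr) \;\leq\; (1 - 2\varepsilon_0)^k \operatorname{range}(X) \;\longrightarrow\; 0,$$
so the components of the product coalesce geometrically to a common value.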