Next: Largest eigenvalue Up: Positive matrices Previous: Eigenvalues on the

Minimax principle

Using Dirac's notation for row and column vectors, the equations

used in deriving the Gerschgorin disks would arise directly from the so-called Rayleigh quotient, were it not for the fact that an inequality can only be stated for real quantities. However the necessary absolute values entering the equations could be avoided, given an assurance that all the terms were real, as perhaps might be anticipated for the largest eigenvalue of a real matrix.

It is generally understood that eigenvalues and eigenvectors satisfy a variational principle; for symmetric or hermitean matrices this leads to bounds involving the Rayleigh coefficient or to the the Courant minimax principle. If we suppose that and are vectors deviating slightly from and respectively, and ignoring quantities of second order we know that

Requiring the vanishing of the first order terms independently of and establishes the Rayleigh quotient as an eigenvalue, providing that and are its corresponding eigenvectors. However it would be preferable to work directly from explicit inequalities and not have to extrapolate from extremals to bounds.

Consequently, without yet specifying P and Q, define

If P were the set of coordinate vectors and Q were the set of vectors with positive components, the Rayleigh quotient would correspond to the right hand side of the Gerschgorin equations. Interchanging the roles of P and Q in these same equations would produce results for column sums equivalent to those which we would otherwise obtain for row sums.

With these choices of P and Q, it is easy to follow the rationale behind the definition of r; vector by vector, we decide which component suffers the least magnification, even if it is very, very small. At least we know that the vector as a whole will receive at least that much magnification. Noone can blame us for finally choosing the vector with the greatest overall magnification, the value of which is r.

Nevertheless, the conservatism involved in singling out the component with the least magnification means that the entire vector will always be magnified slightly more then the worst case estimate unless in fact each and every component is magnified by the same amount---in which case we have an eigenvector on our hands. Likewise, supposing the magnification is not uniform for all components, we could construct another vector whose minimum magnification was slightly larger by simply reducing the size of the minimally amplified component relative to the others. However, this latter procedure would definitely fail for the vector of maximum minimal amplification, so it would just have to be an eigenvector.

Similar considerations apply to the definition of s, since we would be estimating the maximum magnification that M could possibly impart to a transformed vector, and examining the vector whose estimate was the least extravagant, only to discover that the estimate was quite valid throughout the whole vector, and so to conclude once again that we had an eigenvector and that s was its eigenvalue.

To be certain of our conclusions we need to be assured that zero matrix elements do not occur in inconvenient places, and also that the limits involved in assuming the existence of maxima or minima over sets of vectors exist; the reason that the Rayleigh quotients contain the denominators which they do is to allow us to confine our attention to vectors of fixed norm, and thus to compact sets. The question of the possible configurations of zeroes does indeed reward more careful consideration, because of the possibility of finding disconnected portions or cyclic structures within M. Nevertheless, the central role is played by those matrices whose elements are strictly positive.

Next: Largest eigenvalue Up: Positive matrices Previous: Eigenvalues on the

Harold V. McIntosh
E-mail:mcintosh@servidor.unam.mx