One of the goals of any theory of the structure of an algebraic system is to decompose the system into standard combinations of standard components; in the case of vector spaces the result is to express any vector as a linear combination of a certain number of basis vectors. However, a vector space represents the ultimate in uniformity; the typical decomposition in other systems may be neither as complete nor as uniform, but there are usually some vestiges of both a basis and of coefficients.
When the system has both a multiplication and an addition, these terms bearing their usual connotations with respect to things like the commutative and the distributive laws, the principal object of interest is the behaviour of products. If there is a subset, such as typically closed with respect to addition, and an entire set we may classify products according to which of these two sets contains them, particularly with respect to the source of the factors. The most notorious example of this relationship is the fact that the product of any factor with zero results in zero; the idea is to generalize the properties of zero.
The following table expresses the possible relationships:
Any subset which is simultaneously a left ideal and a right ideal is simply called an ideal; the set consisting entirely of zero is the prototypical ideal, but so also is the entire set .
In the case of regular expressions, Conway has explored these concepts directly, without making any reference either to ideals or to the general theory of semigroups from which the results might possibly be taken. In doing so he exploits a natural order which exists for regular expressions, or any system having the regular expression operators; namely
Zero is clearly a minimum element for this order; the goal is to factorize any regular expression E as a product of other regular expressions, but this may not be possible; thus one must work up to the concept in stages. The first is to define a trial factorization through the equation
and then define successively
Amongst others, the canonical form of a regular expression provides trial factorizations whose terms are letters and derivatives. By taking further derivatives, trial factorizations may be developed with arbitrary words as leading terms. The point of starting with trial factorizations is the expectation that they can be promoted to factorizations by systematically increasing each of their terms.
In doing so we depend upon the fact that sums provide the route to upper bounds. Thus if and , follows readily from the definition of inequality; as long as we do not insist that the result necessarily be a regular expression, the same conclusion also holds for arbitrary sums. Consequently the maximal element less than E is simply the sum of all the elements less than
A product is linear in each of its terms individually, so replacing G by summing the set
gives a new trial factorization dominating the first. Moreover the replacement does not spoil the maximality of any of the other terms; if a maximal term were increased, the distributive law shows that the expression which previously offended the inequality would still be present. Consequently trials can be refined one term at a time until they are finally factorizations.
Without further restriction, a factorization of E need not be unique; the trial factorizations even less so. One such restriction is that all terms but one be maximal; there is only one way the remaining term can be refined, and that must produce the term itself if it were already maximal. The concept of factor, defined as a maximal term, depends upon the environment in which it is encountered, but the environment can be reduced to a product of two terms for either a left or a right factor, or three terms for an interior factor, by using the associative law to consolidate the remaining terms.
Finally it is possible to conclude that, whatever the number of left factors, they are paired with unique right factors; thus they can be given a common index set reflecting this correspondence; let us call them and they all satisfy each of which is dominant. It is a temptation to combine them with each other. Thus we further define as the maximal solution to the trial factorization
Several special cases come to mind. Since , there must be an index for which and likewise an index r for which and Finally, since we conclude
As for the L's and R's, note that by the definition of inequality, so that making a trial factorization with maximal terms---a factorization. Consequently and similarly . Thus the 's can be arranged into a square matrix, wherein the left factors form a certain row, the right factors a certain column, and E lies at their intersection. The 's in general behave like elementary matrices, satisfying the following rules:
One consequence of these rules is that by extending inequality elementwise to matrices, which implies that
Interesting as it may be, this discussion has so far been purely theoretical, depending on maximizing through the formation of arbitrary sums. Turning to the canonical representation and symbolic derivatives it is possible to have more concrete results. Note that a right factor of E is a maximal R satisfying in terms of words belonging to L this means
The pairing between L and R is quite evident from the structure of this formula. By building up the parallel theory of right derivatives, a similar relationship can be created to express L in terms of If it is found inconvenient to work with intersections, de Morgan's rules for regular expressions can be invoked to obtain
using the notation that is the complement of X, which may itself require some effort to obtain.