Unfortunately the scheme which solves systems of symbolic linear equations tends to multiply the length of the solution by a constant factor, equal to the number of alternatives, for each node that has to be resolved by the star formula. Since the symbolic coefficients do not tend to simplify in general, symbolic solutions become impractical for all but the very smallest diagrams. We might also note that the table above describes a considerable variety of automata, and yet they are uniformly described by formulae of the same general appearance. Many of them can be further simplified, in spite of the fact that most cannot. For example the expression describing Rule 0 transforms into ; the descriptor of Rule 4, for which 11 is an excluded word, simplifies to .
Some additional remarks ought to be made concerning symbolic equations. The equations which we have shown are entrance equations, in the sense that says that you enter node C via link 1 from node A, via link 0 from node D or you do nothing. But it is also true that , wherein you leave node C via link 1 and continue from D, or else leave via link 0 and continue from F. The complete set of conditions would constitute a description of the de Bruijn diagram via exit equations, and can be equally solved by regular expressions.
Using the same example as before, the exit equations for Rule 22 would take the form
For many purposes, the entrance equations and the exit equations of a diagram give equivalent descriptions. However, the path running between the full set and the empty set which corresponds to the excluded words for a rule ought to be be defined by entrance equations rather than exit equations. After all, interest lies in how an impasse is arrived at, not in what to do once it has occurred.
We have not explained why has been incorporated as an alternative in each equation. Were it omitted, the equations would be shorter and thereby easier to solve. However the final equation for Rule 22 would then read whose solution according to Arden's lemma would the the empty set---that is, no solution. After all, the equation requires that C be equal to something strictly longer than itself. Nevertheless the ``infinite recursion'' is just the part of the solution we extracted to represent a cyclic or infinite state for this automaton.
There is evidently a philosophical issue involved, concerning the differences between our conceptions of finite and infinite. However, it is not a particularly difficult philosophy. Had we been more selective in the introduction of the 's, we would have paid more attention to the fact that they represent a starting point or a stopping point, according to the handedness of the equations. By including a in each of the entrance equations, we have implied that any node in the diagram could be a starting point, and thus we should be prepared to see that the cyclic part of the equation for each node is preceded by the transients leading into it from the other nodes.
If we are looking for finite paths using entrance equations, only the initial nodes should be provided with 's and the equations should be solved for the terminal nodes; not only will the equations be simpler, but the results will be more accurate. For exit equations, the situation is reversed; the terminal nodes get their 's in preparation for solving for the initial states.
If we are genuinely interested in infinite solutions, we can entertain alternatives to Arden's lemma, but we should beware that the result may depend upon the node which we have chosen for the final equation; it will only express a solution involving that node. It often happens that a diagram contains a loop which makes a transition to a second loop, but that there is no return path to the first loop. In de Bruijn diagrams for cellular automata, such a configuration corresponds to one of Wainwright's fuses. If the equations are solved for a node in the first loop, the existence of the second loop will not be evident.
The algebra of regular expressions is not as direct as one would like, and often questions involving regular expressions are most easily resolved by constructing an equivalent diagram, transforming the diagram, and converting the result back into a regular expression. For example, regular expressions form a Boolean algebra, but the definition of a regular expression only involves unions---the regular expression sum. Intersections, which represent words conforming to several regular expression descriptions, and complements, which are words which in no way conform to the description, are readily defined in terms of diagrams, but not by operations directly on the regular expression itself. For example, excluded words are obtained through the subset construction, even though they are then readily described by regular expressions.
A related problem is the one of determining the equality of two regular expressions, and is the reason that there is no canonical form for regular expressions. Equality can be determined by constructing equivalent diagrams, reducing them to their simplest form and comparing the results; however there is no unique regular expression corresponding to any given diagram.
Another result for which there is no simple calculus is to obtain the regular expression which results from the evolution through one or more generations of a given regular expression. The difficulty clearly lies in accounting for all the overlapping which has to be taken into account where sums are involved. It is easier to form the composite rule corresponding to the desired evolution and solve the equations converting the de Bruijn diagram of the composite rule into a regular expression.
Although each generation of evolution of an automaton can be described by a regular expression, the tendency is for the expression to become more and more complicated with each increasing generation. Likewise the description of the words excluded from each generation becomes increasingly complex as the exclusions from the previous generation have to be taken into account. One may naturally wonder what kinds of limit there may be to all this increasing complexity, and indeed whether the limit may be in some cases simpler than all the confusion which preceded it.
Generally speaking, the limit of a sequence of regular expressions need not be a regular expression. One has only to recall that balanced parentheses are not described by a regular expression, although parentheses nested to any finite depth can be listed explicitly and thus described by a regular expression. Here, then, is a limit which is conceptually far simpler than its approximations, but for which there are approximations more than adequate for practical purposes.
There is something qualitatively different between keeping a huge list of acceptable parenthesis sequences to check parentheses, and a huge list of binary numbers to check the regular expression ; the latter is simply unnecessary because a simpler procedure suffices. In our present situation we might find that we have to let a cellular automaton run for a very long time to find out something that another approach might discover much more quickly. Nevertheless, it is still interesting to know what the possibilities are, or when such a situation might be occurring.
Certain transformations can be carried out on cellular automata themselves which can be used to clarify their limiting behaviour. These transformations mostly go in the direction of showing how cellular automata can be used to simulate Turing machines, from whence many known results concerning computability and decidability can be passed over to cellular automata. These results in turn affect the possible types of limiting behaviour which may be encountered.