There is another way in which an imbalance may occur. Let us suppose that each link is labelled by the state into which its neighborhood evolves. A path made of such links has evolved from some state of the automaton, namely the one formed by reading off the sequence of overlapping neighborhoods rather than the sequence of evolved cells. We may invert this process. Suppose that we are given an arbitrary sequence of states. If we can locate it somewhere in the diagram, then we have found at least one extended neighborhood which is its ancestor. Sometimes we can locate the sequence along several paths; but it might also occur that such a path is nowhere to be found.

Although the number of links emanating from each node is the same, it
is up to the particular rule what the mixture of labels will be. For
example, Rule 0 would show a label of zero on * all* links, with a
one nowhere to be found. Which is correct; it is a rule under which no
ones can evolve. It is a dangerous situation if there is an imbalance
at a certain node, so that some state does not label a link there and
some other state labels more than one link. If one is ever obliged to
use that node in forming a chain of evolved cells, the combination in
which the missing cell is the next one required will be impossible. It
may or may not happen that the desired sequence can be can be located
somewhere else in the diagram, so we need an accurate determination of
just which sequences can be folded up and fitted into the diagram.

Moore's subset construction[84] fulfills this requirement very nicely. The basic idea is that if we were systematic, we would start with the first cell of our evolved sequence and look through the diagram taking note of all the links matching that cell. They would emanate from a subset of nodes, so we could make a new diagram in which that subset was a node. The nodes to which the acceptable links lead form a new subset, which we write down and connect to the first subset by a single link with the common label. Turning to the second cell of the evolved sequence, we would examine only those nodes in the second subset node to see which new subset we might form and link with the second cell as label. It would be pointless to examine more nodes in the de Bruijn diagram, because they wouldn't even make the first cell in the evolution come out right.

The final result would be a network of subsets linked and labelled with the same labels used in the de Bruijn diagram. There is no reason not to complete the diagram showing other subsets and other links, but there are exponentially many subsets and they may not all be required if the ancestor of only one chain is sought. The difference is that the nodes and links would now cover all the possibilities in a systematic fashion rather than after the alternative laborious search. In effect the search has already been made, but in an orderly way. Note that in the complete subset diagram, every node has exactly one emerging link for each of the k states of an individual cell; thus one can never reach an impasse seeking a route through the subset diagram.

What one may find instead is that one has arrived at the empty set, which is simply a polite way of saying that the search has been fruitless.

The subset diagram serves a dual purpose. By avoiding the empty set, it is possible to obtain all the possible configurations of the second generation; since the diagram is finite and sequences of cells can be arbitrarily long, the diagram necessarily contains loops, just as does the de Bruijn diagram. By seeking out the empty set, the class of sequences is found which do not have ancestors, and which therefore belong to the Garden of Eden. By examining the final approaches to the empty set, but mainly by rejecting loops, sequences may be found which spoil any other sequence in which they appear, and thus are the basic excluded words for their rule.

E-mail:mcintosh@servidor.unam.mx