The rules of probability are widely known and frequently employed in a wide variety of contexts; nevertheless they presuppose such things as independent events and the absence of correlations. Under those conditions the probabilities of alternatives are summed and the probabilities of coincidences are multiplied. Sometimes it seems reasonable that these justifying hypotheses are valid; at other times they are simply assumed to hold in the hope that the calculations will still be reliable.
When the evolution of cellular automata is treated probabilistically, it does not seem reasonable that the evolution of neighboring cells is independent, inasmuch as they share a common portion of their neighborhood and evolve according to deterministic rules. Nevertheless the rules are complicated, and the overlapping segment of the neighborhoods enters differently into the evolution of each cell, so the coupling between the two evolutions might not be all that strong.
Going ahead with naive probabilistic calculations for cellular automata produces results which are neither very good nor very bad. Tendencies are evident and the general conclusions seem to be reliable, but the results are clearly defective as soon as any accuracy is required.
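To make the phrase "naive probabilistic calculation" concrete, here is a minimal Python sketch; the rule number, row length, and function names are choices of the sketch, not anything fixed by the text. It compares the observed density of an elementary cellular automaton with the mean-field estimate obtained by pretending the three neighborhood cells are independent; the two estimates typically stay in the same general range without agreeing closely, which is the behavior described above.

```python
import random

def rule_table(rule_number):
    """Elementary CA rule: map each 3-cell neighborhood (a, b, c) to a new cell value."""
    return {(a, b, c): (rule_number >> (4 * a + 2 * b + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}

def mean_field(rule, p):
    """Naive estimate: treat the three neighborhood cells as independent,
    each alive with probability p, and add up the neighborhoods mapping to 1."""
    return sum((p if a else 1 - p) * (p if b else 1 - p) * (p if c else 1 - p)
               for (a, b, c), v in rule.items() if v)

def step(rule, cells):
    """One synchronous update of a cyclic row of cells."""
    n = len(cells)
    return [rule[cells[i - 1], cells[i], cells[(i + 1) % n]] for i in range(n)]

if __name__ == "__main__":
    rule = rule_table(22)                                  # Rule 22, chosen purely for illustration
    cells = [random.randint(0, 1) for _ in range(10000)]
    p = sum(cells) / len(cells)
    for t in range(6):
        cells, p = step(rule, cells), mean_field(rule, p)  # simultaneous update of row and estimate
        print(t, sum(cells) / len(cells), round(p, 4))     # observed density vs. mean-field density
```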
One way to get better results is to work out the rules for more complicated combinations of probabilities, as is routinely done to get the probabilities for the values of all kinds of mathematical functions in terms of the probability distributions of their arguments and the fundamental arithmetic operations. This approach does not work well for cellular automata because of the complicated way that the rule of evolution would have to be expressed as an arithmetic function of a real variable.
A more tractable approach seems to be to work with probabilities for sets of cells rather than for individual cells; in particular, for the linear sequences of cells of which linear cellular automata are already composed. Correlations between cells then show up in the probabilities assigned to pairs of cells, which need not be the products of the probabilities of the individual cells. Further correlations can be deduced by studying the probabilities of triples of cells, quadruples, and so on. The extreme logical limit of this approach would be to assign probabilities to each and every possible sequence, no matter how long.
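The following Python sketch shows one way block probabilities might be estimated from a single configuration and compared with products of single-cell probabilities; the function name, the cyclic counting convention, and the sample row are all hypothetical choices of the sketch.

```python
from collections import Counter

def block_probabilities(cells, n):
    """Empirical probabilities of the n-blocks found in a cyclic row of cells."""
    total = len(cells)
    counts = Counter(tuple(cells[(i + j) % total] for j in range(n))
                     for i in range(total))
    return {block: c / total for block, c in counts.items()}

cells = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
p1 = block_probabilities(cells, 1)
p2 = block_probabilities(cells, 2)
for (a, b), p in sorted(p2.items()):
    # any gap between p(ab) and p(a)p(b) measures the correlation between adjacent cells
    print((a, b), round(p, 3), "vs", round(p1[(a,)] * p1[(b,)], 3))
```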
Sequences of cells will be called blocks, to emphasize their linear arrangement; the term n-block will be used when the exact length matters. Some blocks form parts of others; indeed the best way to describe (n+1)-blocks is to extend n-blocks systematically in all possible ways. Since blocks are linear, the extension can be made either to the left or to the right; insertions could be made in the middle, but these will be regarded as compound extensions.
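Representing a block as a tuple of cell states, extension is simply prepending or appending a state; the small sketch below (names hypothetical) lists the left and right extensions of a block over a binary alphabet.

```python
def extensions(block, states=(0, 1)):
    """Left and right extensions of a block, represented as a tuple of cell states."""
    left = [(s,) + block for s in states]
    right = [block + (s,) for s in states]
    return left, right

print(extensions((1,)))     # ([(0, 1), (1, 1)], [(1, 0), (1, 1)])
print(extensions((1, 0)))   # the 3-blocks obtained by extending the 2-block 10
```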
The assignment of probabilities to blocks ought to respect the process of extension. Since there are more extensions than bases, each of them should have a smaller probability than its base; moreover the probability of the base could be expected to be the sum of the probabilities of all its extensions. In this way a constant amount of probability will be subdivided more and more finely as the blocks grow longer.
When formalized, these restraints are called the Kolmogorov consistency conditions. For linear sequences there are two such conditions, according to the handedness of the extension. Thus, if abc is a 3-block whose cells are drawn from Σ, the collection of cell states, the requirements are

\[
p(bc) \;=\; \sum_{a \in \Sigma} p(abc), \qquad\qquad p(ab) \;=\; \sum_{c \in \Sigma} p(abc),
\]

the first corresponding to extension on the left, the second to extension on the right.
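Continuing in the same sketch style, the conditions can be checked mechanically on empirical block probabilities. The helper from the earlier sketch is repeated so the fragment stands alone; measuring the probabilities on a cyclic row, an assumption of the sketch rather than part of the conditions themselves, makes the two equalities hold exactly.

```python
from collections import Counter

def block_probabilities(cells, n):
    """Empirical n-block probabilities on a cyclic row (same helper as the earlier sketch)."""
    total = len(cells)
    counts = Counter(tuple(cells[(i + j) % total] for j in range(n)) for i in range(total))
    return {block: c / total for block, c in counts.items()}

def check_consistency(p2, p3, states=(0, 1), tol=1e-9):
    """The two Kolmogorov consistency conditions relating 2-blocks to 3-blocks:
    summing a 3-block probability over its leftmost (or rightmost) cell must
    reproduce the probability of the remaining 2-block."""
    for b in states:
        for c in states:
            # left extension: p(bc) = sum over a of p(abc)
            assert abs(sum(p3.get((a, b, c), 0.0) for a in states) - p2.get((b, c), 0.0)) < tol
            # right extension: p(bc) = sum over a of p(bca)
            assert abs(sum(p3.get((b, c, a), 0.0) for a in states) - p2.get((b, c), 0.0)) < tol

cells = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
check_consistency(block_probabilities(cells, 2), block_probabilities(cells, 3))
print("consistency conditions hold")
```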
By repeated application of the consistency conditions the probabilities of all m-blocks for m < n are determined once all the n-block probabilities are known. But because there are two consistency conditions, the choice of n-block probabilities is not completely arbitrary. For example, in a binary alphabet, 1 extends to the left to become 01 or 11, and to the right to become 10 or 11. This implies that p(01) + p(11) = p(1) = p(10) + p(11), or in short that p(01) = p(10), which is a restraint on the assignments of probabilities to 2-blocks.
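On a cyclic row this restraint has a simple counterpart: every occurrence of 01 (a rise) is matched somewhere along the cycle by an occurrence of 10 (a fall), so the two blocks occur equally often. A few lines suffice to check it on a sample row, chosen here purely for illustration.

```python
cells = [0, 1, 1, 0, 1, 0, 0, 1]
N = len(cells)
rises = sum(cells[i] == 0 and cells[(i + 1) % N] == 1 for i in range(N))   # occurrences of 01
falls = sum(cells[i] == 1 and cells[(i + 1) % N] == 0 for i in range(N))   # occurrences of 10
print(rises / N, falls / N)   # equal: the empirical form of p(01) = p(10)
```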