The language

The essence of REC is the use of four symbols of control (the two parentheses, colon, and semicolon) together with operators and predicates. Operators are just subroutines; predicates have a truth value in addition.

As usual, a pair of balanced parentheses is used for delimiting groups of symbols, but the group is also to be regarded as a predicate whose truth value depends on how its execution was terminated. Execution proceeds from left to right in the normal manner of reading English text, although the sequence can be interrupted by predicates, colons, and semicolons; all REC programs must be parenthesized.

A colon signifies repetition from the opening left parenthesis of any given level, being the embodiment of iteration in REC. On the other hand a semicolon implies proceeding at once to the closing right parenthesis, while assigning the value ``true'' to the process just completed. In point of fact one does not include the right parenthesis as part of the departure, but continues from the symbol just beyond it; meeting a right parenthesis as a normal part of a sequence also terminates the subexpression, but assigns it the value ``false.''

This leaves the role of predicates in a REC expression to be explained. Regular expressions themselves are indeterminate, since they describe the whole class of strings obtained through arbitrary choices of alternatives and iterates; a computer program requires a concrete choice at each juncture. Predicates provide the decision; when true the sequence of symbols continues without interruption, but falsity implies skipping to a new program segment.

The new segment is the one which immediately follows the nearest colon, semicolon or right parenthesis at the same parenthesis level. When the skip ends at the right parenthesis, the expression is assigned the value ``true''; this is the mechanism through which the boolean complement of a truth value may be achieved. In other words, if p is a predicate, (p) is its logical negative. Likewise, when p and q are predicates, (p;q;) corresponds to p or q and (pq;) to p and q.
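The rules stated so far are compact enough to be checked mechanically. What follows is a minimal sketch in Python, not the author's implementation: it assumes single-character symbols, ignores whitespace, and takes a hypothetical table mapping each letter to a zero-argument Python function. A function returning False is treated as a false predicate; any other result (including None from a plain operator) lets the sequence continue. The names run, _group and _scan are inventions of this sketch.

```python
def _scan(src, i, stops):
    """Return the index of the first character in `stops` found at the
    current parenthesis level, scanning forward from position i."""
    depth = 0
    while depth > 0 or src[i] not in stops:
        if src[i] == "(":
            depth += 1
        elif src[i] == ")":
            depth -= 1
        i += 1
    return i


def _group(src, start, table):
    """Execute the group whose '(' is at src[start].
    Return (truth value, index just beyond the matching ')')."""
    i = start + 1
    while True:
        c = src[i]
        if c == ":":                      # repeat from the left parenthesis
            i = start + 1
            continue
        if c == ";":                      # leave the group with value true
            return True, _scan(src, i + 1, ")") + 1
        if c == ")":                      # reached normally: value false
            return False, i + 1
        if c == "(":                      # a nested group acts as a predicate
            ok, i = _group(src, i, table)
        else:                             # operator or predicate from the table
            ok = table[c]() is not False
            i += 1
        if not ok:
            # False predicate: skip to the nearest colon, semicolon or
            # right parenthesis at this level.
            i = _scan(src, i, ":;)")
            if src[i] == ")":             # skipped to the end: value true
                return True, i + 1
            i += 1                        # resume just beyond ':' or ';'


def run(program, table):
    """Run a parenthesized REC expression and return its truth value."""
    src = "".join(program.split())        # assumption: whitespace is ignored
    depth = 0
    for c in src:
        depth += (c == "(") - (c == ")")
        if depth < 0:
            break
    if depth != 0 or not src.startswith("("):
        raise ValueError("REC programs must be balanced and parenthesized")
    value, end = _group(src, 0, table)
    if end != len(src):
        raise ValueError("unexpected symbols after the closing parenthesis")
    return value


# Negation: (p) is false when p is true.
print(run("(p)", {"p": lambda: True}))    # False

# Iteration: while p holds, apply the operator d and repeat.
counter = {"n": 3}
table = {
    "p": lambda: counter["n"] > 0,
    "d": lambda: counter.update(n=counter["n"] - 1),
}
print(run("(pd:;)", table), counter["n"])  # True 0
```

In the last expression the colon sends control back to the left parenthesis as long as p is true; once p fails, the skip lands just beyond the colon, where the semicolon ends the expression with the value true.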

Arbitrary boolean combinations are possible; for example exclusive or corresponds to (p(q);(p)q;), but one must beware that p could get executed twice without being reproducible (for example, by reading the keyboard twice). Such dilemmas are not frequent, but they do occur and lead to schemes for preserving information, defining variables, and what not; none of these are dealt with at the level of skeletal REC programming.
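The hazard of a predicate executed twice can be illustrated outside REC itself. The following Python sketch is only one conceivable scheme for preserving information, not something provided by skeletal REC: make_key_predicate is a hypothetical stand-in for reading the keyboard, and remember is a hypothetical wrapper which keeps the first answer so that later executions reproduce it.

```python
def make_key_predicate(keystrokes):
    """Hypothetical predicate: each call consumes one simulated keystroke
    and reports whether it was 'y'."""
    pending = iter(keystrokes)

    def key_is_y():
        return next(pending) == "y"

    return key_is_y


def remember(predicate):
    """Wrap a predicate so that it is really executed only once;
    every later call returns the preserved first result."""
    saved = []

    def wrapped():
        if not saved:
            saved.append(predicate())
        return saved[0]

    return wrapped


raw = make_key_predicate("yn")
print(raw(), raw())        # True False: the second reading differs

kept = remember(make_key_predicate("yn"))
print(kept(), kept())      # True True: the first reading is reused
```

The wrapped predicate consumes a single keystroke however many times the expression consults it, which is the property a boolean combination needs when the same predicate appears more than once.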




Harold V. McIntosh
E-mail: mcintosh@servidor.unam.mx