Common Lisp the Language, 2nd Edition

Next: Defining New Series Up: Optimization Previous: Basic Restrictions

## A.3.2. Constraint Cycles

Even if a series expression satisfies all of the restrictions above, it still may not be possible to transform the expression into a loop. The sole remaining problem is that if a series is used in two places, the two uses may place incompatible constraints on the times at which series elements should be produced.

The series expression below shows a situation where this problem arises. The expression creates a series x of the elements in a list. It then creates a normalized series by dividing each element of x by the sum of the elements in x. Finally, the expression returns the maximum of the normalized elements.

```(let ((x (scan '(1 2 5 2))))           ;Warning message issued
(collect-max (#M/ x (series (collect-sum x))))) => 1/2
```

The two uses of x in the expression place contradictory constraints on the way pipelined evaluation must proceed; collect-sum requires that all of the elements of x be produced before the sum can be returned, and series requires that its input be available before it can start to produce its output. However, #M/ requires that the first element of x be available at the same time as the first element of the output of series. For pipelining to work, the first element of the output of series (and therefore the output of collect-sum) must be available before the second element of x is produced. Unfortunately, this is impossible.

The essence of the inconsistency above is the cycle of constraints used in the argument. This in turn stems from a cycle in the data flow graph underlying the expression. In figure A-1 function calls are represented by boxes and data flow is represented by arrows. Simple arrows indicate the flow of series values and cross-hatched arrows indicate the flow of non-series values.

---------------------------------------------------------------- Figure A-1: A Constraint Cycle in a Series Expression +------+ +-----+ +-----+ -->| scan |----------------------------------->| #M/ |-->| max |--> +------+ \ +-----+ +--------+ / +-----+ +-----+ -->| sum |-++++->| series |-- +-----+ +--------+ ----------------------------------------------------------------

Given a data flow graph corresponding to a series expression, a constraint cycle is a closed oriented loop of data flow arcs such that each arc is traversed exactly once and no non-series arc is traversed backward. (Series data flow arcs can be traversed in either direction.) A constraint cycle is said to pass through an input or output port when exactly one of the arcs in the cycle touches the port. In figure A-1 the data flow arcs touching scan, sum, series, and #M/ form a constraint cycle. Note that if the output of scan were not a series, this loop would not be a constraint cycle, because there would be no valid way to traverse it. Also note that while the constraint cycle passes through all the other ports it touches, it does not pass through the output of scan.

Whenever a constraint cycle passes through a non-series output, an argument analogous to the one above can be constructed and therefore pipelining will be impossible. When this situation arises, a warning message is issued identifying the problematical port and the cycle passing through it. For instance, the warning triggered by the example above states that the constraint cycle associated with scan, collect-sum, series, and #M/ passes through the non-series output of collect-sum.

Given this kind of detailed information, it is easy to alleviate the problem. To start with, every cycle must contain at least one function that has two series data flows leaving it. At worst, the cycle can be broken by duplicating this function (and any functions computing series used by it). For instance, the example above can be rewritten as shown below.

```(let ((x (scan '(1 2 5 2)))
(sum (collect-sum (scan '(1 2 5 2)))))
(collect-max (#M/ x (series sum))))
=> 1/2
```

It would be easy enough to automatically apply code copying to break problematical constraint cycles. However, this is not done for two reasons. First, there is considerable virtue in maintaining the property that each function in a series expression turns into one piece of computation in the loop produced. Users can be confident that series expressions that look simple and efficient actually are simple and efficient. Second, with a little creativity, constraint problems can often be resolved in ways that are much more efficient than copying code. In the example above, the conflict can be eliminated efficiently by interchanging the operation of computing the maximum with the operation of normalizing an element.

```(let ((x (scan '(1 2 5 2))))
(/ (collect-max x) (collect-sum x))) => 1/2
```

The restriction that optimizable series expressions cannot contain constraint cycles that pass through non-series outputs places limitations on the qualitative character of optimizable series expressions. In particular, they all must have the general form of creating some number of series using scanners, computing various intermediate series using transducers, and then computing one or more summary results using collectors. The output of a collector cannot be used in the intermediate computation unless it is the output of a separate subexpression.

It is worthy of note that the last expression above fixes the constraint conflict by moving the non-series output out of the cycle, rather than by breaking the cycle. This illustrates the fact that constraint cycles that do not pass through non-series outputs do not necessarily cause problems. They cause problems only if they pass through off-line ports.

A series input port or series output port of a series function is on-line if and only if it is processed in lockstep with all the other on-line ports as follows: the initial element of each on-line input is read, then the initial element of each on-line output is written, then the second element of each on-line input is read, then the second element of each on-line output is written, and so on. Ports that are not on-line are off-line. If all of the series ports of a function are on-line, the function is said to be on-line; otherwise, it is off-line. (The above extends the standard definition of the term on-line so that it applies to individual ports as well as whole functions.)

If all of the ports a cycle passes through are on-line, the lockstep processing of these ports guarantees that there cannot be any conflicts between the constraints associated with the cycle. However, passing through an off-line port leads to the same kinds of problems as passing through a non-series output.

Most of the series functions are on-line. In particular, scanners and collectors are all on-line as are many transducers. However, the transducers in section A.2.4 are off-line. In particular, the series inputs of catenate, choose-if, chunk, expand, mask, mingle, positions, and subseries along with the series outputs of choose, split, and split-if are off-line.

In summary, the fourth and final restriction is that for optimization to be possible, a series expression cannot contain a constraint cycle that passes through a non-series output or an off-line port. Whenever this restriction is violated a warning message is issued. Violations can be fixed either by breaking the cycle or restructuring the computation so that the offending port is removed from the cycle.

Next: Defining New Series Up: Optimization Previous: Basic Restrictions

AI.Repository@cs.cmu.edu