Physical Symbol Systems

Overview. In Physical Symbol Systems (Newell, 1980a), Allen Newell has attempted to define precisely what he considers to be the central concerns of Artificial Intelligence and Cognitive Science. He advances the notion that general intelligence is a property of physical symbol systems, a somewhat precisely stated version of familiar AI symbolic processing systems. This hypothesis was proposed in Newell & Simon (1972, 1975) and endorsed in Newell & Simon (1987) and Vera & Simon (1993, 1994). Newell argues (and we agree) that the Physical Symbol System Hypothesis "sets the terms" on which Artificial Intelligence scientists search for a theory of mind (Newell, 1980a, p. 136). As such, it is a compelling subject for an interactivist critique -- demonstrating how such an influential notion within Artificial Intelligence is committed to encodingism reveals the foundational flaws within the Artificial Intelligence programme. In addition, we will discuss Newell's Problem Space hypothesis and the SOAR "cognitive architecture" of Newell and colleagues, a project consciously carried out following the Physical Symbol System Hypothesis, to illustrate how the foundational flaws of Artificial Intelligence weaken specific projects.

As a model of the workings of computers, we have no major objections to the Physical Symbol System Hypothesis. But, in claiming that general intelligence is a property of such systems, the hypothesis makes claims about cognition more broadly, including representation. It is here that we find fatal flaws -- the flaws of encodingist assumptions about the nature of representation.

A central notion in the Physical Symbol System Hypothesis for issues concerning representation is that of access. Access is a strictly functional relationship between a machine and some entity. Internal to the machine, such an accessible entity could be a symbol, an expression, an operator, or a role in an operator. Access basically means that the machine can operate on whatever it has access to -- for example, retrieve a symbol, change an expression, and so on. Assign is an operator that assigns a symbol to some such internal entity, thereby creating access to that entity. Access to such an assigned symbol yields access to the entity to which that symbol is assigned. Assignment, then, creates a kind of pointer relationship that constitutes functional access.
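
The pointer relationship created by assign can be pictured with a minimal sketch. The class and method names below (`Symbol`, `Memory`, `assign`, `access`) are illustrative assumptions, not Newell's notation or any actual implementation; the sketch shows only that access through an assigned symbol is a purely internal, functional matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """A bare token; it has no content beyond its identity (hypothetical type)."""
    name: str


class Memory:
    """Toy memory in which assignment creates pointer-style access."""

    def __init__(self):
        self._pointers = {}

    def assign(self, symbol, entity):
        # Assigning a symbol to an entity creates access to that entity.
        self._pointers[symbol] = entity

    def access(self, symbol):
        # Access to an assigned symbol yields access to its entity; if that
        # entity is itself an assigned symbol, the chain is followed.
        entity = self._pointers[symbol]
        visited = {symbol}
        while (isinstance(entity, Symbol)
               and entity in self._pointers
               and entity not in visited):
            visited.add(entity)
            entity = self._pointers[entity]
        return entity


memory = Memory()
x, y = Symbol("X"), Symbol("Y")
memory.assign(y, ["an", "internal", "expression"])
memory.assign(x, y)          # X points to Y, which points to the expression
expression = memory.access(x)
```

Everything the sketch does is table lookup inside the machine. Nothing in it relates a symbol to anything outside the table.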

The next major notion for our purposes is that of designation.

Designation: An entity X designates an entity Y relative to a process P, if, when P takes X as input, its behavior depends on Y. (Newell, 1980a, p. 156)

In other words, "the process behaves as if inputs, remote from those [that] it in fact has, effect it. ... having X (the symbol) is tantamount to having Y (the thing designated) for the purposes of process P" (1980a, p. 156).

Such a remote connection is created "by the mechanism of access, which is part of the primitive structure [of the machine] ... It provides remote connections of specific character, as spelled out in describing assign" (1980a, p. 156). To this point, we have a description of various sorts of functional relationships and possibilities internal to a machine.

We next find, however: "This general symbolic capability that extends out into the external world depends on the capability for acquiring expressions in the memory that record features of the external world. This in turn depends on the input and behave operators" (1980a, p. 157). "Input ... requires its output symbols to reflect an invariant relation to the state of the external environment (via states of the receptor mechanism)" (1980a, p. 167).

And, finally: "Representation is simply another term to refer to a structure that designates:

X represents Y if X designates aspects of Y, i.e., if there exist symbol processes that can take X as input and behave as if they had access to some aspects of Y" (1980a, p. 176).

A representation, then, is a representation by virtue of the fact that it designates what it represents, and it designates something insofar as it provides access to it. Again, as a model of the internal workings of a machine, this is largely unobjectionable. When it is extended to epistemic relationships between the machine and its environment, however, it fails.

Critique. Considered from an interactivist perspective, one of the most perspicuous characteristics of the physical symbol system is its severe incompleteness. For comparison, recall that interactive representation consists of three aspects:


* Epistemic Contact. Interactions with an environment terminate in one of two or more possible internal final states, thus implicitly differentiating the environment with respect to those possible final states. This is the epistemic contact aspect of representation -- the manner in which interactive representations make contact with particular environments.


* Functional Aspect. Internal states or indicators, generally constructed with respect to dependencies on such final states, influence further system processing. This is the functional aspect of representation and is the only role representations can play within a system.


* Representational Content. Through influencing goal-directed interaction, which either succeeds or fails in achieving its goals, representational content emerges in the organization and functioning of a system as falsifiable implicit interactive predications about the environment. Representational content has truth value that is fallibly determinable by the system itself, not just by an observer.

The Physical Symbol System Hypothesis, in contrast, focuses primarily on an "intelligent" system having processes that operate on and transform internal symbol structures -- expressions. This is an abstraction of the model of a computer program operating on a data structure. The Physical Symbol System Hypothesis is not a full statement of even the functional aspect of representation (though it gestures in that direction), because the focus is on the transformations of internal records rather than on the influence of internal states on further processing, and because that notion of transformations does not in any essential way depend on action or interaction. The further processing is, in general, merely more manipulations of internal "records."

Thus, even though the focus of the Physical Symbol System Hypothesis is primarily on functional characteristics, it is nevertheless incomplete even with respect to the functional aspect of representation. The physical symbol system definition emphasizes processes that generate new internal "representations" out of already present "representations." That is, the Physical Symbol System Hypothesis defines process in a manner that presupposes issues of representation -- processes operate on "symbols" -- instead of providing an account of the emergence of representation out of process (Bickhard & Richie, 1983). A model of the emergence of functional processes must be independent of issues of representation, because function is logically prior to representation, with the emergence of representation then modeled within that framework of functional processes (Bickhard, 1993a).

The Physical Symbol System Hypothesis has it backwards: it assumes that representation can be defined prior to process, and then that processes can be characterized in terms of their effects on representations. It does not recognize that the functional influence of internal states on further processing is the limit of what internal states can do or be, and that a model of representation must be consistent with that fact. And it does not recognize the importance for representation -- for genuine symbols -- of interactive processes at all. Consequently, there is nothing in this model that provides either epistemic contact or representational content. The core hypothesis, in fact, does not even address the issue of representational content.

The Physical Symbol System Hypothesis does, however, make a gesture toward epistemic contact in the notion that the operator input generates symbols that "reflect an invariant relation to the state of the external environment" (1980a, p. 167). Such an invariant relationship is taken to provide representation, designation, and access to that state of the external environment.

There is an ambiguity here between two different notions of "access." Internal to a machine, a symbol can provide access to some other entity by providing a pointer to it. Alternatively, one entity could provide a kind of access to another by virtue of the first entity constituting a copy or an isomorph of the second. In this case, the machine could function in ways sensitive to features of the "designated" entity simply because the first entity provided the same features as the second. An important property of designation -- transitivity -- fails to distinguish between these two possibilities: "if X designates Y and Y designates Z, then X designates Z" (1980a, p. 157). Both pointer access and isomorphy are transitive.

External to a machine, however, the two possibilities are on quite different footings. Pointer access cannot exist to the environment in the sense in which it does internal to the machine: internal access is simply assigned, and is a primitive in the architecture of the machine, e.g., in the hardware for memory retrieval. There is no such primitive for the environment. It might be claimed that pointers for the environment could be constructed that would permit retrieval via various actions and interactions with that environment, such as providing a spatial location of the designated entity. This is certainly correct, but the machine's interpretation of such pointers involves representational issues, and thus would be circular as a foundation for a model of representation. If the pointers are taken to be simply commands to the operator behave that accomplish the required actions for retrieval -- without representational interpretation -- then we have at best a control system that can arrive at various locations in accordance with internal controls. Issues of representation, including representation of whatever it is that is at the "designated" location, are not addressed by such a model.

Newell emphasizes the invariance of relationship between the internal "symbols" from the operator input and states of the environment. He does not present a pointer relationship for input. Such an invariance of relation to the environment is a general form of isomorphy or tracking or correspondence with that environment. This is also the kind of relationship emphasized, for example, by Vera & Simon (1993). These relationships too are quite possible and important. They provide possible control relationships between the environment and internal processes of the machine, such as a photocell opening a door, or a thermostat adjusting a furnace, or a pin-prick evoking a withdrawal in a flatworm, or a keystroke on a keyboard triggering various activities in a computer, and so on.

Such factual correspondences are crucial to effective and appropriate sensitivity of the machine or system to its environment. They provide the possibility of such sensitivity because they provide the possibility for control influences, control signals, reaching from the environment into the system -- and, therefore, the possibility for the system to respond to those signals, to be controlled by those signals, thus manifesting the required sensitivity.

Such control signals, however, do not provide any representational relationships. They are factual relationships of correspondence or tracking that provide the possibility for control relationships of process evocation or other process influence. They might provide a minimal form of epistemic contact (they are a minimal -- passive -- version of an "interactive" differentiator), but they provide nothing toward representational content.

In particular, to assume that these internal states correspond to objects or entities in the world, and thereby represent those objects, is to fall prey to encodingism. Such correspondences, should they be definable, may be clear to us, as observers/users of the system, but how is the system itself supposed to know them? A theory of mind needs to explain how a system can know about the world, not simply presuppose that the system has this knowledge. The lack of a solution to this problem is precisely the empty symbol problem -- the system can shuffle symbols endlessly, but these symbols remain contentless, ungrounded. As a hypothesis about the internal workings of a computer, the Physical Symbol System Hypothesis captures some important functional properties. As a hypothesis about cognition, however, the Physical Symbol System Hypothesis is fatally deficient.

And it is flawed precisely because of its commitment to encodingism. Given our argument thus far, this should be no surprise; making the point with respect to a well-known project in Artificial Intelligence, however, illustrates concretely the pervasiveness of encodingism in AI.

Newell bounces between the horns of the computer-versus-cognition dilemma. He is clearly most interested in viewing symbols solely as internal states, and it is there that he is on the safest ground.

The primitive symbolic capabilities are defined on the symbolic processing system itself, not on any external processing or behaving system. The prototype symbolic relation is that of access from a symbol to an expression [i.e. another internal object], not that of naming an external object. (Newell, 1980a, p. 169)

However, he occasionally states that, even though it is of secondary importance, symbols can correspond to objects in the world.

Then, for any entity (whether in the external world or in the memory), ... processes can exist in the symbol system ... that behave in a way dependent on the entity. (Newell, 1980a, pp. 156-157)

... the appropriate designatory relations can be obtained to external objects ... (Newell, 1980a, p. 169)

Our central critique of the Physical Symbol System Hypothesis, then, is that it focuses on the processing of internal indicators or "symbols" while giving no answer whatsoever to how these "symbols" can have representational content. As a framework for understanding cognition, this absence is fatal. Four additional points only darken the cloud of confusion.

First, Newell's notion of designation (which he later extends to representation) is so general as to be vacuous. Nevertheless, this is the ground -- and, therefore, the limit -- of Newell's attempt to address epistemic contact and content.

Designation: An entity X designates an entity Y relative to a process P, if, when P takes X as input, its behavior depends on Y. (1980a, p. 156)

This definition permits descriptions such as "a transmitter molecule docking on a cell receptor designates, relative to internal processes of the cell, the activities of the preceding neuron that released the transmitter" and "the octane of the gasoline put into the car's tank designates, relative to the internal processes of the engine, the octane of the gasoline in the underground tank from which it was filled" and "the key strokes on my keyboard designate, relative to the internal processes of the computer, the intentions and meanings that I am typing." These all involve various kinds of correlational and functional or control relationships, but none of them are representational relationships. This "model" is an impoverished "correspondence plus subsequent appropriate function" notion of encodingism. It is impoverished in the sense that the core of the entire definition is in the word "depends," but, as shown elsewhere (Bickhard, 1993a) and below, it is fundamentally inadequate even if that functionality is elaborated, even if "depends" is explicated (e.g., Smith, 1987).

Second, Newell has a deficient notion of a system being embedded in, and interacting with, its environment. Input and behave are just two not very important functions of a physical symbol system; there is no sense of the representational importance of interaction with an environment. And he has no notion whatsoever of the constitutive role of goal-directed interaction in representation.

Third, note that, although designation, and therefore "representation," are transitive, genuine representation is not transitive. If X represents Y -- e.g., X is a name for Y -- and Y represents Z -- e.g., Y is a map of Z -- it does not follow that X represents Z. You could not find your way around merely by having the name for the appropriate map. This divergence with respect to transitivity is a clear difference between informational -- correspondence, tracking, isomorph, and so on -- relationships, and the possibility for control relationships that they provide, which are transitive, and true representational relationships, which are not transitive.

Finally, Newell mentions briefly that processing a symbolic representation can result in an "unbounded" number of new representations (1980a, p. 177). This is true, in the sense that applying a finite set of operators to a finite set of basic elements can result in an infinite set of non-basic elements. However, this process cannot result in fundamentally new representations. The infinite set of integers can be derived by applying one operator (successor) to one basic element (zero). Nevertheless, there is no way to derive the real numbers (nor anything else) from this set of basic elements and basic operations. If what is needed is not in the set, it does not matter that the set might be infinite. For example, the space of even integers is infinite, but that doesn't help much if you need an odd integer -- or rational or real or complex or quaternion or matrix or tensor or fibre bundle connection -- or representation of a car or a steak -- or democracy or virtue -- and so on.
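
The closure point can be illustrated with a small sketch. The function `closure` and its parameters are assumptions introduced here for illustration; the step bound merely stands in for an unbounded generative process.

```python
def closure(basics, operators, max_steps):
    """Repeatedly apply every operator to the newest elements, up to max_steps rounds."""
    reached = set(basics)
    frontier = set(basics)
    for _ in range(max_steps):
        frontier = {op(x) for op in operators for x in frontier} - reached
        if not frontier:
            break
        reached |= frontier
    return reached


# One basic element (zero) and one operator (successor).
naturals = closure({0}, [lambda n: n + 1], max_steps=1000)

# One basic element (zero) and one operator (add two): the even numbers.
evens = closure({0}, [lambda n: n + 2], max_steps=1000)
```

However large `max_steps` is made, `naturals` never contains 0.5 and `evens` never contains 7: the generated set grows without limit, but only within what the basic elements and operators already determine.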

The Physical Symbol System Hypothesis, then, at best captures some of the internal and external functional relationships that might exist in a computer, but it does not genuinely address any of the issues of representations. It can be construed representationally only by stretching the internal pointer relationships in and among data structures to an analogous notion of pointing to things in the world. But what is being "pointed to" in a computer is hardwired to be functionally accessible (and even then is accessed, not represented), and this has nothing to do with representation of the external world. On the alternative sense of access, correspondences simply do not constitute representations, no matter how useful they may be for various sorts of control relationships and the consequent functional sensitivities that they can provide. We are in strong agreement with the goal of naturalizing representation that is inherent in the very notion of a physical symbol system, but this hypothesis has not achieved that goal.

The Problem Space Hypothesis

Overview. In addition to the framework of the Physical Symbol System Hypothesis, SOAR is based on the secondary Problem Space hypothesis (Newell, 1980b). This is the hypothesis that all symbolic cognitive activity can be modeled as heuristic search in a symbolic problem space. In particular, Newell claims that reasoning, problem solving, and decision making can all be captured as searches in appropriately defined problem spaces.

A problem space is a set of encoded states interconnected by possible transformations. The space is usually an implicit space defined by the combinatoric possibilities of some set of basic encodings, and the transformations are similarly atomized and encoded. In this space, an initial state and a goal state are specified and the abstract task is to find a path from the initial state to the goal state via the allowed transformations. The general problem space model is supposed to capture variations across reasoning, problem solving, and decision making with corresponding variations in what the state encodings and the transformational encodings are taken to encode. There is a clear and fundamental dependency on the Physical Symbol System Hypothesis here, with its fatal presupposition of encodingism.
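
A problem space in this sense can be sketched as follows. The toy states (integers), the transformations (`double`, `add_one`), and the bound `LIMIT` are assumptions chosen only to keep the example small; breadth-first search is used here as one simple, non-heuristic way of finding a path.

```python
from collections import deque

LIMIT = 100  # hypothetical bound that keeps the toy space finite


def double(state):
    return state * 2 if state * 2 <= LIMIT else None


def add_one(state):
    return state + 1 if state + 1 <= LIMIT else None


def search(initial, is_goal, transformations):
    """Breadth-first search; returns a list of transformation names, or None."""
    queue = deque([(initial, [])])
    visited = {initial}
    while queue:
        state, path = queue.popleft()
        if is_goal(state):
            return path
        for name, transform in transformations:
            successor = transform(state)
            if successor is not None and successor not in visited:
                visited.add(successor)
                queue.append((successor, path + [name]))
    return None


transformations = [("double", double), ("add_one", add_one)]
path = search(1, lambda s: s == 10, transformations)
unreachable = search(1, lambda s: s == -1, transformations)
```

Here `path` is a sequence of transformations leading from the initial state 1 to the goal state 10, while `unreachable` is `None`: a goal that lies outside the space generated by the given states and transformations is never found, however long the search runs.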

Critique. We find here a more subtle and even more serious consequence of the encodingist presupposition, however. The problem space hypothesis can be construed, in a minimal information form, as a trial and error search in a space of possibilities defined by the combinatoric space of some generating set of explicit atomic encodings. The criterion for the search is some structure of encodings that satisfies the goal definition. In more than minimal information cases, the search need not be blind trial and error, but can use heuristic information to enhance the process; it might even become algorithmic. But this not only presupposes encodingism in the presumed implementation of the problem space, it inherently restricts all such variation and selection searches to the combinatoric possibilities given by the generating set of atomic encodings.

In particular, there is no possibility in this view of generating new emergent representations as trials toward possible solution of the problem, as possible satisfiers of the goal criteria. The only "representational" states allowed are the syntactic combinations of already available atomic "representations." Put another way, the atomic encodings with which the system begins must be already adequate in order for the "cognitive" activity to possibly succeed, since no new representations outside of that combinatoric space are possible.

Newell, here, is committed to Fodor's necessary innateness ("pregivenness") of all basic concepts, with all of its bizarre consequences: inherent innate restriction on human cognitive capacities, innate but "non-triggered" representations for quarks and tensors and the U.S. Senate in the Neanderthal (since there is no way for evolution to have inserted those concepts since then), and so on (Bickhard, 1991b, 1991c; Piattelli-Palmarini, 1980). As pointed out earlier, Fodor's position is a massive reductio of the assumptions which Newell is presupposing (Bickhard, 1991a, 1991b, 1991c).

From a practical perspective, this means that the user of any problem-space program must create all the necessary atomic encodings and must correctly anticipate which ones will be necessary in order for the system to work. Put still another way, the construction of emergent representations is one example of a cognitive process that cannot be modeled within the problem space hypothesis. Furthermore, historical problem solving -- in physics or mathematics or ethics -- does involve the creation of new representations -- representations not anticipated in the simple combinatorics of previous representations. Clearly, in this fundamental sense at least, the Problem Space Hypothesis is not adequate to model genuine intelligence.

In fact, most problem solving does not involve pregiven spaces of possible states and solutions: problem spaces. The construction of appropriate possible solutions -- which may involve the construction of emergent representations, and may or may not involve organizations of such "state" possibilities as problem spaces -- can often comprise the most difficult part of problem solving -- or reasoning, or decision making. Historical examples can even involve rational reformulations of what is to count as a solution -- rational reformulations of problem definitions (Nickles, 1980). Even in relatively trivial problems, such as missionaries and cannibals problems, the generation of new elements and attributes for the basic state language, the generation of appropriate "action" representations, and theorem finding -- not just theorem proving -- concerning properties of the problem and the "problem space" can all be critical in effective and tractable problem solving (Amarel, 1981). The problem space hypothesis is, in principle, incapable of capturing such cognitive phenomena.

SOAR

Overview. We turn now to the SOAR project (Laird, Newell, & Rosenbloom, 1986). The goal of the SOAR project is to define an architecture for a system that is capable of general intelligence. SOAR explicitly follows the Physical Symbol System Hypothesis, so it illustrates nicely the practical consequences of encodingism. As a "cognitive" system, SOAR is wholly a model of internal processing for a system, and needs a programmer/user to do all the representational work for it.

SOAR is fundamentally a search architecture. Its knowledge is organized around tasks, which it represents in terms of problem-spaces, states, goals, and operators. SOAR provides a problem-solving scheme -- the means to transform initial states of a problem into goal states. One of the major advances SOAR claims is that any (sub-)decision can be the object of its own problem-solving process. For example, if SOAR is attempting to play chess and does not know which move to make in a certain situation, it can pose the problem "choose which move to make" to itself; work on this in a new, subordinate problem-space; then use the result to decide what move to make in the original space. This property is referred to as universal sub-goaling.

Another claimed advance is the ability to combine sequences of transformations into single chunks. In SOAR, this is a richer process than just the composition of the component transformations. It allows, for example, for a form of generalization of the conditions under which the chunked transformation is to be applied. The process, of course, is referred to as chunking.

As should be clear, SOAR is a model of internal processing for symbol manipulation systems. Laird, Newell, & Rosenbloom are explicit about their user/programmer version of encodingism, stating that SOAR "encodes its knowledge of the task environment in symbolic structures." However, to be precise, it is not SOAR that does the actual encoding. Programmers do the actual representational work of encoding a problem in terms of states, goals, operators, and even evaluation metrics.

Critique. Thus, SOAR already can be seen to be just another example of an encodingist Artificial Intelligence system. However, since SOAR is well-known and influential, it is worth considering in a bit more detail how encodingism subverts the worthwhile goals of the project. We'll do this by considering how two interrelated aspects of SOAR that the authors take to be very important -- universal sub-goaling and chunking -- are weakened by SOAR's programmer-specified semantics.

Universal Sub-goaling. Laird, Rosenbloom, and Newell consider universal sub-goaling, the property of being able to do problem solving to make any decision, to be one of the most important contributions of SOAR. An example they discuss in detail is taken from the 8-puzzle problem. Suppose that at a given point, SOAR does not know whether it is better to move a tile in the puzzle right, left, up, or down. It creates a goal of choosing between these four operators and sets up a problem space to solve the goal. There are two methods that SOAR can use to do search in this space.


* If it has a metric for evaluating the goodness of states, it can apply each of the operators, use the metric to evaluate the resulting states, and decide to use the operator that resulted in the highest valued state. However, this is only possible if SOAR's programmer has provided it with an evaluation metric.


* If it does not have a metric, SOAR will continue to recurse until it solves the problem. That is, it will apply the operators and come up with four states among which it cannot distinguish. It will then set up the problem of deciding which of these states is best. It will continue on until it reaches the goal state.

That is, if SOAR's programmer has provided it with an evaluation metric, SOAR will use it, and, if not, SOAR will do a depth-first search. The flexibility of being able to use whatever evaluation metric a programmer provides is a convenient modularization of its search process, but it is not more than that. The ability to iterate its process of setting up (sub-)goals with associated problem spaces and evaluation metrics etc. -- so long as all the necessary encoding for all those problem spaces and metrics has already been anticipated and provided by the programmer (such encoding frameworks can sometimes be reused at higher levels of recursion) -- is, again, a convenient re-entrant modularization of the search process, but it is not more than that. And it is not even particularly convenient, given that all relevant information must be anticipated and pregiven by the programmer.
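
The two cases can be sketched as follows. This is not SOAR's implementation; the function names, the toy integer states, the operators, and the bound `CAP` are all assumptions made for illustration. The sketch shows only the control structure described above: use a supplied metric if there is one, otherwise recurse depth-first until a goal state is reached.

```python
CAP = 11  # hypothetical bound on the toy state space


def add_three(state):
    return state + 3 if state + 3 <= CAP else None


def double(state):
    return state * 2 if state * 2 <= CAP else None


def reaches_goal(state, operators, is_goal, depth):
    """Depth-first lookahead: can some operator sequence reach the goal?"""
    if is_goal(state):
        return True
    if depth <= 0:
        return False
    for _, op in operators:
        successor = op(state)
        if successor is not None and reaches_goal(successor, operators, is_goal, depth - 1):
            return True
    return False


def choose_operator(state, operators, is_goal, metric=None, depth_limit=10):
    """Choose among operators by a supplied metric, else by recursive search."""
    results = [(name, op(state)) for name, op in operators]
    results = [(name, s) for name, s in results if s is not None]
    if not results:
        return None
    if metric is not None:
        # Case 1: the programmer has supplied an evaluation metric.
        return max(results, key=lambda r: metric(r[1]))[0]
    # Case 2: no metric, so recurse on each resulting state.
    for name, successor in results:
        if reaches_goal(successor, operators, is_goal, depth_limit - 1):
            return name
    return None


operators = [("add_three", add_three), ("double", double)]
is_goal = lambda s: s == 11
with_metric = choose_operator(4, operators, is_goal, metric=lambda s: -abs(11 - s))
without_metric = choose_operator(4, operators, is_goal)
```

In both cases the states, the operators, the goal test, and the metric (if any) are written by hand before the search begins; the procedure itself contributes only the control flow.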

One example of this is SOAR's "learning" of a "macro-operator" solution to the eight-puzzle problem (Laird, Rosenbloom, Newell, 1986; Laird, Newell, Rosenbloom, 1987). These macro-operators constitute a serial decomposition of the general problem, where a serial decomposition is one in which the attainment of each successive subgoal leaves all previous subgoals intact. In this case, the successive goals have the form: 1) place the blank in the proper location, 2) place the blank and the first tile in the proper locations, 3) place the blank and the first two tiles in the proper locations, and so on. On the one hand, SOAR's ability to develop this macro-operator solution is deemed to be of "particular interest" because SOAR is a general problem solver and learner, rather than being designed specifically for the implementation of macro-operators (Laird, Rosenbloom, Newell, 1986, p. 32). On the other hand, in order for SOAR to accomplish this feat, it must be fed two complete problem spaces -- one defining the basic eight puzzle and one defining the macro-operator version of the eight puzzle (Laird, Rosenbloom, Newell, 1986; Laird, Newell, Rosenbloom, 1987). Further, it must be hand tutored even in order to learn all the macro-operators, once fed their definitions (Laird, Rosenbloom, Newell, 1986, p. 37) (though this is probably a matter of speed of computation). Still further, the macro-operator characterization of the eight-puzzle is itself due to Korf (1985) (or predecessors), so this is an example of a historically and humanly developed problem space characterization -- not one developed by SOAR or by any other program. In sum, SOAR can accomplish a serial decomposition of the eight-puzzle problem if it is fed a basic eight-puzzle problem space and if it is fed a macro-operator space capturing that serial decomposition that someone else has already figured out. This is an enormous collective labor for an "accomplishment" that is in fact rather boring. Truly, SOAR programmer(s) must do all of the hard work.

The claims made for Universal Subgoaling, however, are extreme indeed. It is claimed, for example, that "SOAR can reflect on its own problem solving behavior, and do this to arbitrary levels" (Laird, Newell, Rosenbloom, 1987, p. 7), that "Any decision can be an object of goal-oriented attention." (Laird, Newell, Rosenbloom, 1987, p. 58), that "a subgoal not only represents a subtask to be performed, but it also represents an introspective act that allows unlimited amounts of meta-level problem-space processing to be performed." (Rosenbloom, Laird, Newell, McCarl, 1991, p. 298), and that "We have also analyzed SOAR in terms of concepts such as meta-levels, introspection and reflection" (Steier et al, 1987, p. 307). It would appear that SOAR has solved the problem of conscious reflection. However, in Rosenbloom, Laird, and Newell (1988) it is acknowledged that what is involved in SOAR is a control notion of recursiveness, not an autoepistemic notion such as "quotation, designation, aboutness, or meta-knowledge" (p. 228). Such recursiveness, perhaps with modularizations of the recursive processes that are more convenient than hitherto, is in fact all that is involved in universal sub-goaling. SOAR's claims to such phenomena as "reflection," "attention," and "introspection," then, are flagrantly bad metaphorical excesses made "honest" by redefinitions of the terms (into a "control" tradition) in a secondary source paper (Steier et al, 1987; Rosenbloom, Laird, Newell, 1988).

Chunking. The second major innovation in SOAR is the process of Chunking (Laird, Rosenbloom, Newell, 1984, 1986; Laird, Newell, Rosenbloom, 1987; Steier et al, 1987; Rosenbloom, Laird, Newell, McCarl, 1991). Chunking is supposed to constitute a "general learning mechanism." Together with universal subgoaling, then, SOAR has supposedly solved two of the deepest mysteries of the mind -- consciousness and learning. As might be expected, however, there is less here than is first presented to the eye.

Chunking is to a first approximation nothing more than the composition of sequences of productions, and the caching of those resultant compositions. When this works well, appropriate initial conditions will invoke the cached composition as a unit, and save the search time that was involved in the construction of the original sequence of productions. This is useful, but it clearly does not create anything new -- it saves time for what would have ultimately happened anyway. No new representations are created, and no hitherto unconstructable organizations of encodings arise either. Composition of productions is fundamentally inadequate (Neches, Langley, Klahr, 1987).
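The point that composition plus caching saves time without creating anything new can be seen in a minimal sketch. The toy productions, the depth-limited search, and all names below are hypothetical and are not taken from SOAR; the sketch only shows that a cached "chunk" is a sequence of productions that the search would have found anyway.

```python
def solve_by_search(state, productions, goal, depth=6):
    """Depth-limited search for a sequence of production names that reaches goal."""
    if state == goal:
        return []
    if depth == 0:
        return None
    for name, fn in productions.items():
        rest = solve_by_search(fn(state), productions, goal, depth - 1)
        if rest is not None:
            return [name] + rest
    return None

chunks = {}  # cache: (state, goal) -> composed production sequence

def solve_with_chunking(state, productions, goal):
    """Return the cached composition if present; otherwise search and cache it."""
    key = (state, goal)
    if key not in chunks:
        chunks[key] = solve_by_search(state, productions, goal)
    return chunks[key]

# Assumed pregiven productions over integer states.
PRODUCTIONS = {"inc": lambda n: n + 1, "double": lambda n: n * 2}

if __name__ == "__main__":
    print(solve_with_chunking(1, PRODUCTIONS, 6))  # found by search, then cached
    print(solve_with_chunking(1, PRODUCTIONS, 6))  # retrieved as a unit
```

Every element of a returned sequence is the name of a production that was already supplied; the cache changes how quickly the result is obtained, not what can be obtained.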

Chunking's claim to fame, however, does not rest on production rule compositionality alone. In addition, chunking permits generalization in the conditions to which the compositions can apply. Such generalization occurs in two ways. First, generalization occurs via variabilization -- the replacement of identifiers with variables. This makes SOAR "respond identically to any objects with the same description" (Laird, Newell, Rosenbloom, 1987, p. 55). And second, generalization occurs via "implicit generalization" which functions by "ignoring everything about a situation except what has been determined at chunk-creation time to be relevant. ... If the conditions of a chunk do not test for a given aspect of a situation, then the chunk will ignore whatever that aspect might be in some new situation." (Laird, Newell, Rosenbloom, 1987, p. 55).

Both variabilization and implicit generalization are forms of ignoring details and thereby generalizing over the possible variations in those details. This can be a powerful technique, and it is interesting to see what SOAR does with it. But, only identifiers already created by the programmer in slots already created by the programmer can be "ignored" (variabilized), and only aspects of situations (slots) already created by the programmer can be disregarded, and, thus implicitly generalized over. In other words, chunking functions by eliminating -- ignoring -- encodings and encoding slots that are programmer pregiven. Again, nothing new can be created this way, and the generalizations that are possible are completely dependent on the encoding framework that the programmer has supplied.
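A minimal sketch may make this dependence concrete. The slot names, the identifiers, and the `<var>` marker are hypothetical choices of ours, not SOAR's actual working-memory format. Variabilization replaces an identifier with a variable, and implicit generalization simply omits a slot from the chunk's conditions; both operate only on slots that are already present.

```python
# Hypothetical situation: every slot and identifier here is programmer-supplied.
SITUATION = {"object": "tile-3", "position": "cell-5", "color": "red"}

def make_chunk(situation, relevant_slots, variabilize):
    """Build chunk conditions from a situation.

    Slots not listed in relevant_slots are dropped (implicit generalization);
    identifiers in slots listed in variabilize become variables.
    """
    conditions = {}
    for slot in relevant_slots:
        value = situation[slot]  # raises KeyError if the slot was never encoded
        conditions[slot] = "<var>" if slot in variabilize else value
    return conditions

def matches(conditions, new_situation):
    """A chunk applies when every tested slot is present and agrees (or is a variable)."""
    return all(
        slot in new_situation and (value == "<var>" or new_situation[slot] == value)
        for slot, value in conditions.items()
    )

if __name__ == "__main__":
    chunk = make_chunk(SITUATION, ["object", "position"], {"object"})
    print(chunk)
    print(matches(chunk, {"object": "tile-7", "position": "cell-5", "color": "blue"}))
```

Asking the chunk to generalize over an aspect that was never encoded, such as a hypothetical "weight" slot, fails at chunk-creation time: there is nothing there to ignore.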

This dependence of SOAR's "learning" on preprogrammed encoding frameworks holds in two basic senses: 1) There is a nearby outer limit on what can be accomplished with such elimination generalizations -- when everything pregiven has been eliminated, nothing more can be eliminated (generalized over). 2) The generalizations that are available to such elimination methods are completely determined by those preprogrammed encoding frameworks. In other words, an aspect can be generalized over only if that aspect has already been explicitly pre-encoded, otherwise there is nothing appropriate to ignore and thus generalize over. This latter constraint on SOAR's generalization abilities is dubbed "bias": "The object representation defines a language for the implicit generalization process, bounding the potential generality of the chunks that can be learned" (Laird, Rosenbloom, Newell, 1986, p. 31).

Just as the programmer must anticipate all potentially relevant objects, features, relationships, atomic actions, etc. to be encoded in the problem space in order to make SOAR function, so also must the programmer anticipate the proper aspects, features, etc. that it might be relevant to ignore or variabilize, and, thus, generalize over. As a form of genuine learning, chunking is extremely weak. From a representational perspective, the programmer does all the work. To construe this as a "general learning mechanism" is egregious.

Thus, not only is composition per se inadequate, composition plus "generalization" plus "discrimination" (the addition of encoded constraints) are collectively incompetent, for example, for unanticipated reorganizations of encodings, reorganizations of processes, and the construction of new goals (Neches, Langley, Klahr, 1987; Campbell & Bickhard, 1992b). The SOAR architecture, and, ipso facto, any implementation of that architecture, does not escape these failures.

Summary Analysis. SOAR is far from the "architecture for general intelligence" it was touted to be. It cannot generate new representations, so it cannot learn anything that requires representations not already combinatorically anticipated, nor decide anything, nor reason in any way that requires representations not already combinatorically anticipated (e.g., Rosenbloom, Newell, Laird, 1991). Among other consequences, it cannot recurse its problem spaces any further than has been explicitly made available by the programmer's encodings, despite the phrase "universal subgoaling." It cannot "reflect," despite the characterization of subgoal recursion as "reflecting." It cannot generalize in its chunking in any way not already combinatorically anticipated in the user provided encoding scheme for the problem space. SOAR is interesting for some of the new possibilities within classical frameworks that it exemplifies and explores, but it cannot manifest any of the capabilities that are suggested by the terms used -- "general intelligence," "reflection," "universal weak method learning," "generalization," and so on. In this respect, it is, at best, a massive example of "natural stupidity" (McDermott, 1981).

The multiple deficiencies of SOAR are not entirely unknown to SOAR's proponents. They are acknowledged in occasional brief passages that are inconsistent with such claims as "general intelligence," "reflection," "general learning," and so on. The deficiencies, however, are invariably treated as if they were mere technical problems, to be conclusively fixed and solved in future elaborations of the system: SOAR

can not yet learn new problem spaces or new representations, nor can it yet make use of the wide variety of potential knowledge sources, such as examples or analogous problems. Our approach to all of these insufficiencies will be to look to the problem solving. Goals will have to occur in which new problem spaces and representations are developed, and in which different types of knowledge can be used. The knowledge can then be captured by chunking.

(Laird, Rosenbloom, Newell, 1986, p. 43).

Not only is the language in which SOAR is presented flagrantly overblown (making claims for SOAR that SOAR has not even touched) but this "faith in principle" in the general approach ("all problems will succumb to more of the same") is the most basic disease of invalid research programmes. SOAR is inherently an instance of the problem space hypothesis, and, a fortiori, of the Physical Symbol System Hypothesis (Norman, 1991). Each of these, in turn, inherently presupposes encodings as the fundamental nature of representation, which entails the impossibility of the emergence of new representation out of non-representational phenomena. But, until genuinely emergent representation is possible (among other things), neither genuine intelligence, nor reasoning, nor problem solving, nor decision making, nor learning, nor reflection will be possible. Any gestures that SOAR might make in these directions will have to be already effectively anticipated in the programmer supplied encodings.

Problem spaces (necessarily pregiven) for the construction of problem spaces might conceivably have some practical value in some instances, but such a notion merely obfuscates the fundamental in-principle issues. Either an encoding framework can successfully anticipate all possibly needed representations or it cannot. The incoherence argument, and related arguments, show that it cannot. And, therefore, since SOAR fundamentally exemplifies the encodingist approach, it is impossible for it or anything within its framework to make good on its claimed aspirations.

Furthermore, and most importantly, the restrictions and impossibilities that encodingism imposes on SOAR and on the problem space hypothesis more generally are simply instances of the restrictions and impossibilities that encodingism imposes on all of Artificial Intelligence and Cognitive Science. The physical symbol system model is simply one statement of the encodingism that pervades and undergirds the field. And it is a fatally flawed foundation.

PROLIFERATION OF BASIC ENCODINGS

Any encodingism yields an ad hoc proliferation of basic encodings because of the impossibility of accounting for new kinds of representation within the combinatoric space of old basic encodings. Encodingism cannot account for the emergence of new representational content; it can only account for new combinations of old contents. The incoherence problem turns precisely on this inability of encodingism to account for new, foundational or basic, representational content. Because the emergence of new sorts of encoding elements is impossible, any new representational content requires an ad hoc designed new element to represent it. In relatively undeveloped programmatic proposals, this difficulty can be overlooked and obscured by simply giving a few examples that convey the appearance of being able to reduce representation to combinations of elements -- e.g., the famous case of the "bachelor = unmarried male," or the semantic features proposal for language (Katz & Fodor, 1971; Chomsky, 1965). Whenever such a programme is taken seriously, however, and a real attempt is made to develop it, the impossibility of capturing general representation in an encoding space makes itself felt in a proliferation of elements as more and more sorts of representational contents are found to be essential that cannot be rendered as combinations of those already available (e.g., Bolinger, 1967).

CYC -- Lenat's Encyclopedia Project

Doug Lenat and his colleagues at the Microelectronics and Computer Technology Corporation (MCC) are engaged in a project that directly encounters this problem of the proliferation of basic encodings (Lenat & Guha, 1988; Lenat, Guha, & Wallace, 1988; Guha & Lenat, 1988). They are attempting to construct a massive knowledge base containing millions of encoded facts, categories, relations, and so on, with the intent that the finished knowledge base will define our consensus reality -- will capture the basic knowledge required to comprehend, for example, a desk top encyclopedia. This effort is the enCYClopedia project.

It's All Just Scale Problems. Lenat and colleagues are well aware of the tendency for knowledge bases, no matter how adequate for their initial narrow domains of knowledge, to be fundamentally not just incomplete, but inappropriate and wrongly designed in attempts to broaden the knowledge domain or to combine it with some other domain: Categories and relations are missing; categories are overlapping and inconsistent; categories and relations that need to have already been taken into account, even in a narrow knowledge base, were not taken into account because the distinctions weren't needed so long as that narrow domain was the limit of consideration; the design principles of the knowledge base are inadequate to accommodate the new domain contents and relationships; and so on. Knowledge bases do not scale up well.

The suggestion in Lenat's project is that these problems -- representational proliferation, representational inconsistency and redundancy, design inappropriateness, and so on -- are just scale problems, and, therefore, will be overcome if the scale is simply large enough to start with. The suggestion is given analogical force with an onion analogy: concepts and metaphors are based on more fundamental concepts and metaphors, which are based on still more fundamental ones, like the layers of an onion, and, like an onion, there will be a central core of concepts and metaphors upon which all else are based. These central notions, therefore, will be adequate to any knowledge domain, and, once they are discovered, the scale problem will be overcome and the proliferation problem will disappear. All new concepts will be syntactically derivable from concepts already available, and, ultimately, from the basic "onion core" concepts. The "onion core," then, is supposed to provide the semantic primitives adequate to the construction of everything else (Brachman, 1979).

The Onion is not an Argument. There are at least three problems with the position. The first is that Lenat et al give no argument whatsoever that this will be the case or should in any way be expected to be the case. The onion analogy is the only support given to the hoped for convergence of needed concepts -- a convergence in the sense that, after a large enough base has been achieved (literally millions of facts, categories, etc., they say), the core of the onion will have been reached, and, therefore, concepts and relations etc. needed for new material to be incorporated into the base will already be available. The entire project is founded on an unsupported onion analogy.

There is, in fact, a puzzle as to why this would seem plausible to anyone. We venture the hypothesis that it is because of the intuition that "encodings are all there is" and a similar intuition from the innatists that people are born with a basic stock of representational raw material.

The Onion Core is Incoherent. The second problem is that the presumed core of the representational onion is "simply" the base of logically independent grounding encodings, and the circular incoherence of that notion ensures that such an encodingist core cannot exist. From a converse perspective, we note that the layered onion analogy is appropriate to the purely syntactic combinatorialism of encodingism, but that the invalidation of encodingism ipso facto invalidates any such combinatorically layered model of the organization of representation in general.

Combinatorics are Inadequate. The invalidity of the presumed combinatoric organization of possible representations, in turn, yields the third problem: the supposed combinatoric scale problem proves impossible to solve after all. It is not merely a scale problem. New concepts are rarely, if ever, simply combinations of already available encodings, and, therefore, cannot in principle be accommodated in a combinatoric encoding space -- no matter how large the generating set of basic encodings. New representation is a matter of emergence, not just syntactic combination, no matter what the scale might be. The space of possible representations is not organized like an onion.

Note that this is an in-principle impossibility. Therefore, it is not affected by any issues of the sophistication or complexity of the methods or principles of such syntactic combinatorialism. That is, the various fancy apparatuses of exceptions, prototypes, default logic, frame systems with overrideable defaults -- however powerful and practically useful they may be in appropriate circumstances -- do not even address the basic in-principle problem, and offer no hope whatsoever of solving it.

There are several different perspectives on the intrinsic inadequacy of combinatorial encoding spaces. We take this opportunity of the discussion of the CYC Project to discuss four of them, of successively increasing abstraction. The fourth of these perspectives involves technical arguments within logic and mathematics. Some readers may wish to skip or to skim this section (Productivity, Not Combinatorics).

Ad hoc Proliferation. First we must point out again the history of the ad hoc proliferation of encoding types in every significant attempt to construct an encodingism. Both internal to particular projects, such as feature semantics, as well as in terms of the historical development from one project to another, new kinds of encoding elements have had to be invented for every new sort of representational content. In fact, the practical power and realistic applicability of encodingist combinatorialism has proven to be extremely limited (Bickhard, 1980b, 1987; Bickhard & Richie, 1983; Bolinger, 1967; Fodor, 1975, 1983; Shanon, 1987, 1988, 1993; Winograd & Flores, 1986). This, of course, is precisely the history that Lenat et al note, and that they claim -- without foundation -- is merely a scale problem.

Historically False. A second perspective on the inadequacy of combinatorialism is a historical one. In particular, an encounter with the necessity of the proliferation of new sorts of representations can be found in any history of any kind of ideas. New ideas are not just combinations of old ideas, and any such history, therefore, comprises a massive counterexample to any combinatorialism of concepts -- and, therefore, to encodingism. Lenat's knowledge base, more particularly, could not capture the history of mathematics within its onion unless that history were already included in, designed into, the system. This is exactly the point that he brings against all previous knowledge base projects, and the point that is supposed to be overcome because this project is Big. Even more to the point, Lenat's onion will in no way anticipate the future history of mathematics, or any other field of ideas, in its generated combinatoric space.

Another perspective on this point is provided by the realization that encodings cannot capture generalization, nor differentiation, except in terms of the encoding atoms that are already available in the encoding space. Abstraction as a reduction of features, for example, can only proceed so long as the atomic features are already present and sufficient. Differentiation as an intersection of categories, for another, similarly can only proceed in terms of the (sub)categories already encoded. These are just special cases of the fact that encodings cannot generate new representations, only at best new combinations of representations already available.
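These two special cases can be illustrated with a minimal sketch, under the assumption (ours, for illustration only) that a concept is rendered as a set of required atomic features. Abstraction then removes features, and differentiation by intersecting two categories amounts to requiring the features of both. The feature names are hypothetical.

```python
# Assumed pregiven atomic features; a concept is a frozenset of required features.
ATOMS = frozenset({"human", "male", "adult", "married"})

def abstract(concept, drop):
    """Abstraction as feature reduction: remove some already-present features."""
    return frozenset(concept) - frozenset(drop)

def differentiate(a, b):
    """Intersecting two categories means requiring the features of both."""
    return frozenset(a) | frozenset(b)

def closure(seeds):
    """All concepts reachable from seeds by repeated abstraction and differentiation."""
    found = set(seeds)
    frontier = set(seeds)
    while frontier:
        new = set()
        for c in frontier:
            for f in c:
                new.add(abstract(c, {f}))
            for d in found:
                new.add(differentiate(c, d))
        frontier = new - found
        found |= frontier
    return found

if __name__ == "__main__":
    seeds = [frozenset({"human", "male"}), frozenset({"adult"})]
    for concept in sorted(closure(seeds), key=lambda c: (len(c), sorted(c))):
        print(sorted(concept))
```

However long the two operations are iterated, every concept produced is a subset of the features that occur in the seeds. In this rendering, no feature that was not supplied at the outset ever appears.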

The history of mathematics, to return to that example, is a history of deep, complex, and often deliberate abstractions from earlier mathematics and from other experience (MacLane, 1986). No combinatoric onion can capture that. To posit that any atomic rendering of Babylonian mathematics would be combinatorically adequate to contemporary mathematics is merely absurd. Why would anyone think that an atomic rendering of today's mathematics, or any other domain, would fare any better in our future? The point remains exactly the same if we shift to Babylonian and contemporary culture writ large: Babylonian culture would have to have contained all contemporary culture (including contemporary mathematics) in the combinatorial space of its encoding representations.

Furthermore, mathematical abstraction is often an abstraction of relations, not of objects or predicates. The relational structures that define groups or fields or vector spaces or lattices (Birkhoff, 1967; Herstein, 1964; MacLane, 1971, 1986; MacLane & Birkhoff, 1967; Rosenfeld, 1968) would be easy examples. Relational encodings cannot be constructed out of element and predicate encodings (Olson, 1987). Therefore, Babylonian mathematics could be combinatorically adequate to modern mathematics only if those critical relational encodings (set theory, category theory, group and field theory per se?) were already present in Babylonian (prehistoric, prehuman, premammal, prenotochord?) times. Modern relational concepts could then be "abstracted" by peeling away whatever festooning additional encodings were attached in earlier times, leaving only the critical relational encodings for the construction of modern conceptions.

Clearly, we are in the realm of a Fodorian radical innatism of everything (Fodor, 1981b).[1] But "the argument has to be wrong, ... a nativism pushed to that point becomes unsupportable, ... something important must have been left aside. What I think it shows is really not so much an a priori argument for nativism as that there must be some notion of learning that is so incredibly different from the one we have imagined that we don't even know what it would be like as things now stand" (Fodor in Piattelli-Palmarini, 1980, p. 269). What is in error about current conceptions of learning is that they are based on false conceptions of representation -- encoding conceptions. Encoding models of representation force a radical innatism, and Lenat is just as logically committed to such an innatism as any other encodingist (Bickhard, 1991c, 1993a). Lenat's onion-core would have to anticipate the entire universe of possible representations.

The Truth Predicate is not Combinatorial in L. The third perspective on the inadequacy of syntactic combinatorialism is a counterexample from Tarski's theorems regarding Truth predicates, as discussed earlier. In particular, any language that is "adequate to its own semantics" is a language in which that language's own Truth predicate can be constructed, and any language which can contain its own Truth predicate is logically inconsistent. An inconsistent language, in turn, cannot contain any coherent capturing of its own semantics, since any statements of semantic relationships can be validly (within the inconsistent language) replaced by their negations. Syntactic combinatorialism is limited to constructions within a given encoding language, and, by these theorems, syntactic combinatorialism is intrinsically incapable of consistently, coherently, defining Truth for that encoding system. The Truth predicate itself, then, for any encoding language, is a straightforward counterexample to any purported adequacy of syntactic combinatorialism to be able to capture the space of possible representations. It is simply impossible.
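The core of the reasoning can be sketched as follows. We assume a language L with classical logic and enough syntactic resources for the diagonal lemma to apply; the symbols T (a candidate Truth predicate for L), the liar-style sentence \lambda, and the corner quotes for names of sentences are notation introduced here for illustration.

```latex
\begin{align*}
&\text{T-schema:} && T(\ulcorner \varphi \urcorner) \leftrightarrow \varphi
   \quad \text{for every sentence } \varphi \text{ of } L, \\
&\text{Diagonal lemma:} && \lambda \leftrightarrow \neg T(\ulcorner \lambda \urcorner)
   \quad \text{for some sentence } \lambda \text{ of } L, \\
&\text{Hence:} && T(\ulcorner \lambda \urcorner) \leftrightarrow \neg T(\ulcorner \lambda \urcorner).
\end{align*}
```

The last line is a contradiction, so no consistent L of this kind can contain a predicate T satisfying the T-schema for all of its own sentences.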

Productivity, not Combinatorics. Our fourth perspective on the intrinsic inadequacy of any combinatoric encoding space is an abstract in-principle mathematical consideration. Any combinatoric encoding space will be (recursive, and, therefore) recursively enumerable. The set of possible principles of functional selection, on the other hand, and, therefore, of interactive functional representation, will be at least productive (Rogers, 1967; Cutland, 1980).

A productive set is a set S for which there exists a recursive function F such that for any recursively enumerable S1 contained in S with index x, F(x) will be an element in S but not in S1. That is, any attempt to capture a productive set by recursive enumeration yields a recursively computable counterexample to the purported recursive enumeration. The True well-formed formulas of elementary arithmetic are productive in this sense: any purported recursive enumeration of them will recursively yield a counter-example to that purported recursive enumeration.
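The definition can be stated more formally. Here W_x denotes the recursively enumerable set with index x, a standard notation that we assume rather than take from the text; S and F are as above.

```latex
S \subseteq \mathbb{N} \text{ is productive}
\iff
\exists F \text{ recursive such that } \forall x \;
\bigl( W_x \subseteq S \;\Rightarrow\; F(x) \in S \setminus W_x \bigr).
```

In particular, a productive set cannot itself be recursively enumerable: if S = W_x for some x, then F(x) would have to be in S and not in S.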

The basic realization involved here is that interactive representation is intrinsically functional, not atomistic. Any encoding or encoding combination can do no more than influence functional selections in the ongoing process of a system, but the space of possible such functional selections is the space of possible interactive representations, and that space is generated as possible functional organizations that might be selected, not as possible combinations of elements of some finite set of atomic possible selections. New kinds of selections, thus new kinds of representations, can occur given new kinds of functional organizations. There are no atomic representations in this view.

Conversely, a counterexample can be constructed for any given purported encoding enumeration by constructing a new functional organization, and, thus, a new possible representational selection, that differs internally or in some other intrinsic sense from the functional organizations that are selected for by all available atomic "encodings."

In fact, the existence or definability of any productive set constitutes a counterexample to any programme of atomic combinatorialism, since these sets are not non-circularly definable nor constructable as mere combinatorialisms from some, or any, atomic base set -- if they were so definable or constructable, they would not be productive since they would then be capturable by a recursive enumeration. The very existence of productive sets, then, demonstrates that the space of possible forms and patterns of representation-as-functional-selection cannot be captured atomistically, combinatorically, and, therefore, cannot be captured within any encodingism. Productive sets cannot be non-circularly defined explicitly, syntactically, on any atomic base set; they can, however, be defined implicitly.[2]

Any recursive enumeration (encoding model) within a productive set S (space of possible interactive functional representations) yields its own recursively generable element of S (new interactive representation) that is not included in the enumeration (not included in the encoding space). The enumeration, therefore, is not complete. That new element can be included in a new recursive enumeration (e.g., defined as a new atomic element of the generative encodings), which will generate its own exception to that new enumeration (encoding system) in turn. This can yield still another exception to the enumeration, which could be included in still another enumeration, and so on. In other words, it is impossible to capture a productive set by a recursive enumeration, and any attempt to do so embarks on this proliferative unbounded expansion of attempting to capture counterexamples to the last attempted enumeration -- the ad hoc proliferation of encoding types. The enumeration (encoding system) cannot be complete. This futile pursuit of a productive set with an enumeration is the correct model for the relationship between representation and encodingisms. It is far different from, and has opposite implications from, Lenat's onion metaphor.
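The shape of this futile pursuit can be illustrated by analogy. The sketch below is not the formal construction for productive sets; it uses ordinary diagonalization over a finite list of total functions on the natural numbers, and the names are hypothetical. Each purported list yields a function that is not on it; adding that function to the list simply yields another.

```python
def diagonal(enumeration):
    """Return a function that differs from the n-th listed function at input n."""
    def d(n):
        return enumeration[n](n) + 1 if n < len(enumeration) else 0
    return d

def extend(enumeration, rounds):
    """Repeatedly add the current counterexample to the list."""
    fs = list(enumeration)
    for _ in range(rounds):
        fs.append(diagonal(list(fs)))  # copy, so the counterexample is fixed at this stage
    return fs

if __name__ == "__main__":
    base = [lambda n: 0, lambda n: n, lambda n: n * n]
    fs = extend(base, 3)
    escaped = diagonal(fs)
    # The newest counterexample still disagrees with every listed function.
    print(all(escaped(i) != fs[i](i) for i in range(len(fs))))
```

No number of rounds closes the gap: the extended list always admits a further counterexample constructed in the same way.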

Isn't Infinite Enough? One prima facie rejoinder to this point would be to claim that, although encodingism suffers from a combinatoric limitation, nevertheless an encoding combinatoric space is infinite in extent, and that ought to be enough. Infinite it may be, but if it does not contain the correct representations, the ones needed for a given task, that does no good. The set of even integers is infinite, but that is of no help if what is needed is an odd integer, or real, or a color or a food, and so on. The possible interactions of a simple finite automaton can also be infinite, but that does not imply that finite automata theory suffices relative to Turing machine theory.

A Focus on Process Instead of Elements. A natural perspective on representation from within an encoding perspective is to focus on the set of possible combinations of basic encoding elements -- on the set of possible encoding representations. This is the perspective in which the above considerations of formal inadequacies of encodingism are presented. A different perspective on representation and encodingism, one more compatible with interactivism, emphasizes the processes involved rather than the sets to which those processes are computationally competent -- competent as enumerators or detectors, and so on. The combinatoricism of encodingism, as a form of process, is clearly drastically inadequate to the formal processes of Turing machine theory. The inadequacy of Turing machine theory, in turn, to be able to capture interactive representation provides still another perspective on the fundamental inadequacy of encodingism. That is, in the interactive view, the potentialities of representation are an aspect of the potentialities of (certain forms of) process -- unlike the diremption of representation from process in the juxtaposition of Tarskian model theory and Turing machine theory, of semantics and computation. Furthermore, interactive representation is an aspect of a form of process that cannot be captured by Turing machine theory, and certainly not by any simple encodingist combinatorialism. The process weakness of encodingism, therefore, constitutes a representational inadequacy relative to interactive representation.

As a set, then, the free space of an encodingism is intrinsically too small, and, as a process, the combinatorialism of an encodingism is inherently too weak. Any attempt to capture representation in an encodingism, then, is doomed to the futile chase of ever more not-yet-included representational contents, is doomed to an inevitable proliferation of basic encoding elements in an attempt to capture representations not included in the prior space. An encoding space is always too small and the combinatoric process is always too weak to be adequate to all possible representations.

Slot "Metaphor" versus Genuine Metaphor. There are yet other problems with Lenat's project. One is that he proposes a model of metaphor as a mapping of slots to slots between frames. This is probably about as much as can be done within a slot-and-frame encoding model, but why it should be taken to be adequate to the creativity of genuine metaphor is not clear and is not argued. Furthermore, the onion analogy itself is rendered in terms of metaphors built on metaphors, etc. down to a presumed core of metaphors. Whatever plausibility this might seem to have derives from the inherent constructive creativity of genuine metaphor: mappings of slots, on the other hand, require slots (and frames) that are already present. The onion, therefore, is not only an analogy in lieu of an argument, it is an analogy built on a fundamental equivocation between genuine metaphor and the impoverished notion of "metaphor" as slot to slot mappings.
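The limitation of slot-to-slot mapping can be seen in a minimal sketch. The frames, slot names, and function below are hypothetical illustrations of ours, not CYC's representation language. The mapping can only carry fillers between slots that both frames already have.

```python
def map_metaphor(source, target, slot_map):
    """Carry fillers from source-frame slots to target-frame slots.

    Fails unless every slot named on either side already exists.
    """
    result = dict(target)
    for src_slot, tgt_slot in slot_map.items():
        if src_slot not in source or tgt_slot not in target:
            raise KeyError(f"missing slot: {src_slot!r} -> {tgt_slot!r}")
        result[tgt_slot] = source[src_slot]
    return result

# Assumed pregiven frames.
JOURNEY = {"traveler": "person", "destination": "place", "obstacle": "roadblock"}
LIFE = {"agent": None, "goal": None, "difficulty": None}

if __name__ == "__main__":
    print(map_metaphor(JOURNEY, LIFE, {"traveler": "agent",
                                       "destination": "goal",
                                       "obstacle": "difficulty"}))
```

A mapping that would require a slot neither frame contains, say a hypothetical "crossroads" slot, raises an error rather than creating that slot. Whatever creativity the mapping appears to have is already present in the frames supplied to it.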

This is an implicit circularity in Lenat's onion: the onion analogy is used to make the scope claims of Lenat's encodingism project seem plausible, but the onion model itself is made plausible only in terms of the layering of metaphors, and Lenat's encoding model for those metaphors, in turn, -- slot-to-slot mappings between frames -- presupposes the validity of the general encoding approach that the onion metaphor was supposed to make plausible in the first place. Lenat's onion is hard to swallow.

Contradiction: Does Pushing Tokens Around Suffice, or Not? Still another problem in this project is that, although there is much discussion of the fact that the semantics of knowledge bases are in the user, not in the system -- a point we clearly agree with -- there is later a discussion of the massive knowledge base in this project as if it would understand, would have its own semantics. This issue is addressed, though hardly in a satisfactory way. In fact, "Yes, all we're doing is pushing tokens around, but that's all that cognition is." (Lenat & Guha, 1988, p. 11) The basic claim is that somehow by moving to the massive scale of this knowledge base project, the tokens inherently acquire semantics for the system, not just for the user. As with the proliferation problem, sufficient scale is supposed to solve everything in itself. The magic by which this is supposed to happen is not specified.

Accountability? The only discernible consequence of these incantations of massive scale is that the project cannot be accountable for any of its claimed goals till sufficient scale (itself only vaguely specified) has been reached, which means, of course, until massive amounts of time and money have already been spent. Here, as elsewhere in the CYC Project documents, the glib and breezy style seems to have dazzled and confused not just the intended readers, but the authors as well.

Claim: Dis-embodied, Un-situated, Un-connected Intelligence. Lenat and Feigenbaum (1991) provide a somewhat more sober presentation of the general project and strategy. But, although the tone is more sober, the claims are not.[3] There is still the explicit assumption that scale of knowledge base is what is of ultimately fundamental importance. In fact, there is an explicit hypothesis that no as-yet-unknown control structure is required for intelligence (1991, p. 192). Interactive control structures would seem to constitute a counterexample to that hypothesis, but Lenat and Feigenbaum explicitly reject the notion that representation requires epistemic systems that are embedded in their environments. For them, action and interaction are not epistemically important. In later usage, in fact, they employ the notion of control structure as being synonymous with inference procedure (1991, p. 233); they don't have in mind any sort of real connection with the world even here.

Contradiction: The Onion Core is Situated, Embedded, Practices -- It is NOT Dis-embodied Tokens After All. There is still the claim that the layers of analogy and metaphor "bottom out," though the onion per se is absent. The claim, however, is still unsupported (1991, p. 201). Later, in reply to Smith's critique (1991), they offer Lakoff and Johnson (1980) (1991, p. 246) as support for the existence of such a core to a universal conceptual onion. To the extent that this powerful book could be taken as supportive of any such core, however, any such support is dependent not only on the genuinely creative sense of metaphor -- not capturable in mappings among pre-created slots -- but also on the "core" being the sensory-motor practices of situated, embedded, human beings. The "core" that is being offered here in lieu of support for an onion core of combinatorically adequate representational atoms is instead a "core" of practices and forms of action and interaction upon which higher level metaphors are and may be created; it is not a "core" of grounding context independent encodings. As in the case of the original onion, only with a slippery inattention to fundamentals is any superficial appearance of support or supportive argument presented.

Butchering Piaget. Although it is a small point in itself, it is also worth pointing out that this inattention and carelessness with claims is manifested in a flagrantly bad construal of Piaget -- e.g., of Piaget's stage theory (pp. 203, 204; cf. Bickhard, 1982, 1988a, 1988b; Bickhard, Cooper, & Mace, 1985; Campbell & Bickhard, 1986; Drescher, 1986, 1991; Kitchener, 1986; Piaget, 1954, 1971, 1977, 1985, 1987). Pragmatic breeziness covers a multitude of sins.

False Programmes are not Falsifiable. Lenat and Feigenbaum (1991, p. 204) claim an advantage of falsifiability for their general approach. They present a commonsensical "let's try it and if it's falsified, then we'll learn something from that too" approach (e.g., p. 211). Unfortunately, they're never clear about what would constitute falsification (and the time horizons for any possible such falsification keep getting pushed back; their 1991 article projects into the next century). More deeply, they seem unaware that research programmes cannot be empirically falsified, even though they may be quite false. If the interactive critique is correct, then this entire project is based on false programmatic presuppositions. But, granting that they might conclude somehow that CYC had been falsified, on what would they place the blame? How would they diagnose the error? Perhaps CYC needs to be still bigger, or have more kinds of slots? Only conceptual level critique can discover and diagnose programmatic failure, but they are quite skeptical and derisive about such "mysticism" and "metaphysical swamp[s]" (p. 244; also, e.g., pp. 236-237).

Inconsistency: Claim Foundational Advances -- Reject Responsibility. Lenat and Feigenbaum claim that a completed CYC system will actually have concepts and know things about the world (many places; e.g., pp. 244, 247), and yet they also reject pursuing the very issue of how concepts relate to the world (pp. 236-237). It seems that that issue is just another part of the metaphysical swamp. But it can't be both ways: Lenat and Feigenbaum cannot consistently claim solutions to foundational problems, and yet reject foundational critique. A careless "whatever works" becomes irresponsible when it both makes basic claims and rejects the very domain of the critique of such basic claims. There are not only a multitude of sins here, but very serious ones too.

Smith's Critique. Smith (1991) presents a strong critique of Lenat's project that has a number of convergences with our own. He, too, notes the absence of argument and of serious consideration of the deep and long-standing problems involved. He also notes the absence of any system semantics, in spite of apparent claims and presumptions to the contrary. Smith offers a number of critiques that, in our view, turn on the naive encodingism of this project, including the exclusive focus on explicit rather than implicit representation, and the absence of contextual, use, agentive, action, situatedness, or embodiment considerations. We wish only to second these criticisms, and to point out that these considerations are intrinsic aspects of interactive representation that are inevitably ruptured in the move to encodingism. They can at best be tacked on to an encodingist model to make an incoherent and ataxic hybrid. Lenat doesn't even attempt the hybrid.

-----------------------------------------------------------

Rodney Brooks: Anti-Representationalist Robotics

Brooks proposes a radical shift in approaches to the construction of artificial intelligence (1990, 1991a, 1991b, 1991c, 1991d). He suggests that the problem of artificial intelligence has been broken down into the wrong subproblems, and in such a manner that it has obscured the basic issues and approaches to solution. He proposes that intelligent systems be constructed incrementally, instead of componentially, starting with simple intelligences and working toward more complex instances, with each step constituting a full intelligent creature "in the real world with real sensing and real action." (1991a, p. 140)

In following this approach, Brooks and colleagues have arrived at an unexpected conclusion and a radical hypothesis:

Conclusion: "When we examine very simple level intelligence we find that explicit representations and models of the world simply get in the way. It turns out to be better to use the world as its own model."

Hypothesis: "Representation is the wrong unit of abstraction in building the bulkiest parts of intelligent systems." (1991a, p. 140)

In fact, Brooks suggests that "Representation has been the central issue in Artificial Intelligence work over the last 15 years only because it has provided an interface between otherwise isolated modules and conference papers." (1991a, p. 2)

Subsumption and Evolution. In the construction of intelligent systems, Brooks advocates a layered approach in which lower layers handle simpler behaviors and higher layers handle more complex behaviors, generally through influencing the activity of the lower layers -- the higher layers subsume the lower layers. By getting the lower layers to work first, debugging and correcting the next layer up is enormously simplified. Brooks is proposing to model his engineering approach on evolution (1991a, p. 141), both in the construction of individual intelligent creatures, and in the design and construction of new intelligent creatures. (He notes that the problems that evolution took the longest to solve -- those of basic real world interaction -- are precisely the ones that standard Artificial Intelligence deliberately ignores.) The successive layering that results recapitulates both evolution and physiological design and maturation (Bickhard, 1992b).
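
The layering described above can be made concrete with a small sketch. It is only an illustration under strong simplifying assumptions: Brooks' own architecture is built from asynchronous finite state machines with suppression and inhibition links, whereas here each layer is a plain function that sees the command produced by the layers beneath it and may override it. The layer names, sensor keys, thresholds, and command strings are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

Sensors = Dict[str, float]
Command = Optional[str]
# A layer sees the sensors and the command coming up from the layers below.
Layer = Callable[[Sensors, Command], Command]


def avoid_obstacles(sensors: Sensors, below: Command) -> Command:
    """Layer 0: a complete (if dull) creature that only avoids collisions."""
    return "turn_away" if sensors["obstacle_distance"] < 0.5 else "halt"


def wander(sensors: Sensors, below: Command) -> Command:
    """Layer 1: replaces idleness with motion, leaves avoidance untouched."""
    return "forward" if below == "halt" else below


def seek_light(sensors: Sensors, below: Command) -> Command:
    """Layer 2: redirects free motion toward a light that is off-axis."""
    if below == "forward" and abs(sensors["light_bearing"]) > 0.1:
        return "turn_toward_light"
    return below


# Hypothetical layer stack, lowest layer first.
LAYERS: List[Layer] = [avoid_obstacles, wander, seek_light]


def act(layers: List[Layer], sensors: Sensors) -> Command:
    """Run the layers bottom-up; each may override the command from below."""
    command: Command = None
    for layer in layers:
        command = layer(sensors, command)
    return command


clear = {"obstacle_distance": 2.0, "light_bearing": 0.4}
blocked = {"obstacle_distance": 0.2, "light_bearing": 0.4}

# Every prefix of the stack is itself a working creature.
for depth in range(1, len(LAYERS) + 1):
    print(depth, act(LAYERS[:depth], clear), act(LAYERS[:depth], blocked))
```

The point of the sketch is the incremental one made in the text: each prefix of the stack already behaves sensibly on its own, so a new layer can be debugged against lower layers that are known to work, and obstacle avoidance keeps functioning however many layers are added above it.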

With the extremely important caveat that "representation" in Brooks' discussion be understood as "symbolically encoded representation," we are in enthusiastic agreement with Brooks. In fact, his analysis of "representation" emerging only because it is needed for the interface between otherwise isolated modules is virtually identical with the interactivist analysis of the emergence of subsidiary encodings and associated processing systems out of underlying interactive systems (Bickhard & Richie, 1983). There is a significant difference in emphasis, however, in that the interactivist analysis is of how and why such differentiated modules and encoding interfaces might evolve and be functional in the context of and in the service of an already functioning non-encoding interactive system. That is, might evolve and emerge in the service of exactly the sort of intelligent systems that Brooks proposes to create.

Robotics. Robotics has at times been seen as "simply" a subdivision of Artificial Intelligence. Brooks' proposals in effect reverse that. He suggests that, by virtue of having to interact with the real world, robotics in fact encounters the fundamental problems of intelligence, and, conversely, that the isolated module approach standard in Artificial Intelligence cannot do so. This is also exactly the point argued in Bickhard (1982). It is clear that robots are necessarily interactive, while standard Artificial Intelligence is deliberately not.

Robotic Representations. There is, of course, one major difference between the interactivist approach and the position taken by Brooks: by interactivist criteria, at least some of Brooks' intelligent creatures already involve primitive representations. He acknowledges that there might be some sense in which representations are involved "implicitly" (1991a, p. 149), but argues that what goes on in such creatures is simply too different from "traditional AI representations" (1991a, p. 149) and "standard representations" (1991a, p. 149) to be considered as representations. This point is quite correct in its premises, but by thereby dismissing the possibility that intelligent creatures do involve representations in some real, but non-standard sense -- an interactivist sense -- Brooks inhibits the exploration of that aspect of his project.[4] Interactivism, in fact, argues that representation first emerges in exactly the "implicit" sense that Brooks reluctantly and indirectly acknowledges, with all instances and forms of explicit representation -- "standard representation" -- emerging from, and remaining subsidiary to and dependent upon, such interactive "implicit" representation.[5]

Brooks (1991d) provides a first step in this interactivist direction when he proposes an "inverse" perspective on the function of sensors. In the standard perspective, in which the focus is on inferring the correct world state on the basis of sensor readings, the designer focuses on "[a] particular world state and then analyz[es] all possible sensor readings that could be generated" (p. 438). In Brooks' inverse approach, the designer proceeds to consider "given a sensor reading, ... which possible worlds could have given rise to that reading" (pp. 438-439).

The relationship between a sensor reading and the "possible worlds [that] could give rise to that reading" is precisely that of interactivism's implicit definition (Bickhard, 1980b, p. 23 -- see also Bickhard, 1992c, 1993a, in preparation-c; Bickhard & Campbell, 1992, in preparation; Campbell & Bickhard, 1992a). Four additional steps are required to arrive at the interactive model of representation:

1) Recognition that this relationship is only implicit from the perspective of the system itself;[6]

2) Generalization of such implicit definition from strictly passive sensory input receptors to interactive (sub)systems;

3) Recognition that this implicit definitional relationship does not provide an emergence of representational content for the system itself; and

4) The addition of the implicit predication of further interactive properties as providing that emergence of content.
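
A toy sketch may help fix the distinction drawn in these steps. It is not a formalization from the interactivist literature; the worlds, the sensor, the action, and all names are assumptions made up for illustration. The function `possible_worlds` computes Brooks' inverse relation from a reading to the worlds that could have produced it, which is available only to an observer who can enumerate worlds. The system itself has only its internal reading plus an indication of what a further interaction should yield, and it is the failure of such an indication that the system can detect for itself.

```python
from typing import Dict, Set, Tuple

# Hypothetical world states: (distance to a surface in cells, surface is a door).
World = Tuple[int, bool]
WORLDS = [(d, door) for d in (1, 2, 3) for door in (False, True)]


def sonar_reading(world: World) -> str:
    """Forward sensor model: coarse range only, blind to what the surface is."""
    distance, _is_door = world
    return "near" if distance == 1 else "far"


def possible_worlds(reading: str) -> Set[World]:
    """Observer's view: the worlds implicitly defined by a given reading."""
    return {w for w in WORLDS if sonar_reading(w) == reading}


def step_forward(world: World) -> str:
    """Outcome of a further interaction; doors swing open instead of blocking."""
    distance, is_door = world
    return "bump" if distance == 1 and not is_door else "no_bump"


# System's view: for each internal reading, an indicated further interaction
# and the outcome it anticipates. Nothing here mentions worlds at all.
INDICATIONS: Dict[str, str] = {"near": "bump", "far": "no_bump"}


def indication_fails(world: World) -> bool:
    """True when the anticipated outcome does not occur in this world."""
    anticipated = INDICATIONS[sonar_reading(world)]
    return step_forward(world) != anticipated


print(sorted(possible_worlds("near")))
print([w for w in WORLDS if indication_fails(w)])
```

In this toy setting the reading "near" differentiates two worlds without telling the system anything about them (steps 1 and 3), while the indication attached to that reading is something that can turn out false, and turn out false for the system, when the surface is a door (step 4).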

-----------------------------------------------------

Dynamic Systems Approaches

There has been a growing appreciation of dynamic systems approaches in recent years (Beer, in press-a, in press-b; Hooker, in preparation; Horgan & Tienson, 1992, 1993, 1994; Maes, 1990a, 1991, 1992, 1993, 1994; Malcolm, Smithers, & Hallam, 1989; Steels, 1994; Port & van Gelder, in press). The possibilities of analysis in terms of phase space dynamics, of open systems, of integrated system-environment models, and of non-linear dynamics -- such as attractors, bifurcations, chaotic phenomena, emergent behavior, emergent self-organization, and so on -- have excited a number of people with their potential modeling power. Dynamic systems approaches offer promise of being able to model -- and to model naturalistically -- many phenomena, including emergent phenomena, that have remained inexplicable within alternative approaches. Dynamic systems approaches can also model processes that are uncomputable in standard frameworks (Horgan & Tienson, 1992, 1993, 1994). The interactive model shares the goal of a thorough and consistent naturalism, and arises in the framework of dynamic open systems analysis (Bickhard, 1973, 1980a, 1980b, 1992c, 1993a; Bickhard & D. Campbell, in preparation). There are, therefore, important convergences with much of the work in this area. There are also, as might be expected, differences -- especially with respect to issues of representation per se.

Cliff Hooker. Hooker (in preparation) proposes a naturalist philosophy of the natural sciences of intelligent systems. He advocates a process theoretical approach -- specifically, the organizations of processes that constitute complex adaptive self-organizing systems -- toward a naturalized theory of intelligence. Hooker, Penfold, & Evans (1992) present a novel architectural approach to control theory -- local vector (LV) control -- and explore issues of problem solving, concepts, and conceptual structure from within that framework. Their aim is to show how LV control "may be able to illuminate some aspects of cognitive science" (p. 71).

Hooker (1994) develops a model of rational thought within a dynamic systems framework. He advocates a consistent naturalism -- including a naturalism of rationality -- and explores the contributions toward, and errors with respect to, that goal in literature ranging through Popper, Rescher, and Piaget. In the course of this analysis, Hooker develops a model of a non-foundationalist evolutionary epistemology (D. Campbell, 1974). His discussion convincingly demonstrates the plausibility of a naturalism of rationality. Hooker's is a truly kindred programme.

Concerning naturalism, for example: If mind is not a natural part of the natural world, then Artificial Intelligence and Cognitive Science have set themselves an impossible task. The commonalities regarding dynamic, interactive, systems as the proper framework to explore are clear. Although rationality per se is not much discussed in this book, the interactive model of rationality emphasizes knowledge of possible errors (Bickhard, 1991d, in preparation-a), while Hooker emphasizes successful autoregulations. Ultimately, both error knowledge and autoregulation knowledge must function together; in many ways, these are complementary projects.

Tim van Gelder. Van Gelder (in press-a) points out that, from a dynamic systems perspective, standard computationalism consists of a severely limited subspace of possible kinds of dynamics -- and a space of limited power relative to the whole space. He advocates moving from these limited visions of possible systems to a consideration of how input-output relationships, and feedback relationships, might be best modeled or best accomplished within the entire space of dynamic possibilities. The emphasis on general system dynamics and on feedback is strongly compatible with the interactive framework, but van Gelder, like Brooks (1991a) and others, avoids issues of representation. Any complex interacting system, however, will have to contend with multiple possible next interactions, and will have to select among such possibilities on the basis of indicated outcomes. Appropriate forms of such indications of interactive possibilities constitute interactive representation. We argue that ignoring such emergent possibilities within system organization can only weaken the overall approach: it ignores the power of representation.

Randall Beer. Beer (1990, in press-a, in press-b; Beer, Chiel, & Sterling, 1990) proposes adaptive systems as a framework for analysis and design of intelligent systems. He proposes, in fact, the notion that intelligence is adaptive behavior. If so, then the design and understanding of intelligent systems must recognize them as being intrinsically embodied as some sort of agent that, in turn, is intrinsically embedded in an environment. Embodiedness is required for a system to be even potentially adaptive in its behavior, and environmental embeddedness is required for the notion of adaptiveness to make any sense.

Autonomous symbol manipulation is not an appropriate framework for such analyses. The symbol manipulation framework for Artificial Intelligence and Cognitive Science is neither embodied nor embedded. In fact, as discussed above, standard frameworks cannot account for any epistemic contact with an external world at all (Bickhard, 1993a; Fodor, 1981a; Shanon, 1993). Beer, accordingly, eschews standard representations.

Instead, he begins with relatively simple versions of adaptive systems -- insects, in this case -- and attempts to capture some of their interesting and adaptive behavior in artificial creatures. Specifically, investigations have addressed such phenomena as locomotion, exploration, feeding, and selections among behaviors. The focus is on the physical structure of the simulated insect agent, and on the artificial nervous system that underlies its dynamics of interaction. Beer makes "direct use of behavioral and neurological ideas from simpler animals to construct artificial nervous systems for controlling the behavior of autonomous agents." (1990, p. xvi). He has dubbed this approach "computational neuroethology" (see also Cliff, 1991). The rationale for computational neuroethology is a strong one: Nature has usually been smarter than theorists.

In further development of the computational neuroethology approach, Beer & Gallagher (1992; Gallagher & Beer, 1993) have demonstrated the evolution in neural nets, via a genetic algorithm, of the ability to engage in chemotaxis -- movement toward a chemical concentration -- and to control a six-legged insect walker. Note, these are not just nets that can control chemotaxis and six-legged walking per se, but nets that develop that ability. Yamauchi & Beer (1994) present a similar ability for a net to learn to control sequences of outputs, and to switch between appropriate sequences when conditions of adaptiveness change.

On a more programmatic level, Beer (in press-a) argues against computationalism, including against correspondence notions of representation. He urges dynamic systems as an alternative programme, but with no focus on issues of representation in dynamic systems. Beer (in press-b) addresses a wide range of programmatic issues. He discusses the importance of embodiment for understanding and designing agents, in contrast to the disembodied agent perspective that is typical of Artificial Intelligence, and of the inherent dynamic coupling of an embodied system with its environment (Bickhard, 1973, 1980a). He provides an introduction to the notions of dynamic systems analysis; provides an overview of the insect walker; and argues again against the notion that internal states, even internal states correlated with something in the environment, constitute representations.

Beer (in press-b) also raises an issue that we consider to be fundamental, but that is seldom addressed. What constitutes system survival in dynamic systems terms? The question is intimately connected to issues of how to model adaptiveness, and, therefore, of how to model the emergence of function and representation -- notions that derive from survival and adaptiveness respectively (Bickhard, 1993a; Bickhard & D. Campbell, in preparation). Beer's answer is, to a first approximation, that persistence -- survival -- of a system is equivalent to the persistence of crucial limit cycles in its dynamics. He then proposes that this conception might be generalized in terms of the notion of autopoiesis (Maturana & Varela, 1980, 1987). These conceptualizations of survival and adaptiveness are strikingly similar to those in Bickhard (1973, 1980a), where a notion of system stability is defined in terms of the persistence of a condition of the reachability of system states. It is not clear that the conceptualizations are fully equivalent, but the first focus should be on the importance of the questions rather than on the details of the proposed answers.

Try Simpler Problems First. Complex problems, such as that of intelligence, are often broken down into simpler problems, or are addressed in terms of simpler versions of the problem, in order to facilitate investigation. The notion of "simpler problem," however, is quite different in a dynamic systems approach than in standard frameworks for investigating intelligence. Micro-worlds, restricted knowledge domains, chess playing, and so on, are examples of what counts as simpler problems from a symbol manipulation perspective. They are "simpler" in the sense that they do not commit to the presumed full symbol store and manipulation rules of a presumed fully intelligent entity. They are partial symbol manipulation "intelligences" in the sense that they involve partial symbol domains and partial sets of manipulation rules.

But none of these are agents at all, in any sense of the notion of agent. And most certainly they are not adaptive agents. These approaches suppose that they are cutting nature at its joints -- joints of symbol encoding domains -- but, if intelligence cannot be understood except in adaptive agent terms, then such attempts have utterly missed the natural joints of intelligence. They are putting a giant buzz-saw through the house of intelligence, severing a piece of the living room together with a piece of the kitchen, and taking bits of chairs, tables, and other furniture along with it (or worse, a buzz-saw through the wood pile out back). That is no way to understand houses -- or intelligence.

More carefully, encodingism is studying the placement, configuration, and color patterns of -- and how to operate -- the switches and controls on all the appliances and fixtures in the house. It misses entirely the manner in which stoves, faucets, air-conditioners, television sets, incandescent bulbs, and so on function internally, and at best hints at the more general functions such as cooking and eating, plumbing, lighting, and climate control that are being served. Encodingism provides not a clue about the existence or nature of electricity or water flow or air currents and properties, and so on. Encodingism is a static model of representation. Encodings can be manipulated in time, but an encoding relationship per se is atemporal -- just like a light switch can be manipulated in time, but a switching relationship per se is atemporal. The crucial aspects are the functional dynamic aspects, and encodingism does not even address them. This is still no way to understand houses, or representation -- or intelligence.

If intelligence is adaptive behavior, then "simpler" means simpler versions of adaptive agents and their behavior (Brooks, 1991c; Maes, 1993). The natural joints do not cut across the interactive dynamics between agent and environment, but, instead, divide out simpler versions of such dynamics -- simpler versions such as Beer's artificial insects, or Brooks' robots, or Kuipers' critters.

Representation? Rather clearly, we are in fundamental agreement with Beer's dynamic and adaptive systems approach (Bickhard, 1973, 1980a, 1993a; Bickhard & D. Campbell, in preparation). For many reasons, this is a necessary framework for understanding mind and intelligence. We are also in agreement with his rejection of symbol manipulation architectures as forming a satisfactory encompassing framework for that understanding. Beer also notes that the internal dynamics of his insect do not match the connectionist notions of representations. Although we regard connectionism as a potential, though currently somewhat misguided, ally, the misguidedness is precisely in the notion of representation that connectionists have adopted (see discussion below), so, again, we are in full agreement with Beer.

In focusing on and correctly rejecting contemporary notions of representation, however, Beer has overlooked the possibility that there might be a more acceptable, and compatible, model of representation. We argue that the interactive model of representation emerges naturally from dynamic adaptive systems perspectives (see below, and Bickhard, 1993a; Bickhard & D. Campbell, in preparation), and is not subject to the myriad fatal problems of encodingism -- whether in symbol manipulation or in connectionist versions. If so, then representation -- properly understood -- has a central role to play in analyzing and designing intelligent systems.

Robotic Agents. The move of roboticists toward systems dynamics more general than the classic input-processor-output architectures is becoming more and more widespread (Brooks, 1991b; Maes, 1990a, 1993; Malcolm, Smithers, & Hallam, 1989). We have discussed Brooks' subsumption architectures, and his associated rejection of the notion of representation, and Beer's insects and neuroethology. The possibility -- and the possible usefulness -- of representation, however, remains a point of difference among roboticists.

Adaptive Agents. In general, approaches to the design of adaptive autonomous agents de-emphasize representation (Maes, 1993, 1994). There is "little emphasis on modeling the environment" and "no central representation shared by the several [functional] modules" (Maes, 1994, p. 142). In fact, there tends to be a de-centralization of control into multiple interacting functional modules, with at best very local "representations" resident in each module sufficient to the tasks of each module. Brooks' theme of "the environment is its own best model" is emphasized, and what "representations" do occur are much less likely to have the "propositional, objective, and declarative nature" typical of classical Artificial Intelligence. Maes (1993, 1994) provide overviews of the field; Maes (1990a) provides a number of examples.

One major approach emphasizes the close relationships between adaptive agents in the broad sense and living systems. This emphasis is often in terms of the behavioral problems that such agents encounter, whether living or designed, and in terms of inspiration from solutions found in living systems for subsequent artificial design (Beer, 1990; Cliff, 1991; Steels, 1994). In this view, general models of adaptive autonomous agents constitute a framework for approaching the design of artificial intelligence via the design of artificial life (Steels, 1994). As is by now familiar, issues of representation are usually either rejected or ignored, though almost always in terms of correspondence notions of representation. But questions concerning the adequacy of dynamic systems frameworks to issues of cognition and representation cannot be postponed forever (Steels, 1994).

Exceptions do exist in which standard notions of representation are rejected, but, instead of concluding thereby that representation plays no role, alternative conceptions of representation are adumbrated. In keeping with the focus on interactive robots and open living systems, these alternative notions of representation tend to have a strong interactive flavor. Nevertheless, specifics of the distinction between epistemic contact and representational content, implicit definition and predication, emergent system detectable truth value, and so on, are absent. Intuitions that knowledge is fundamentally a matter of ability to maintain system integrity can be strong (Patel & Schnepf, 1992; Stewart, 1992 -- see Bickhard, 1973, 1980a, 1993; Bickhard & D. Campbell, in preparation), but where to go from there is not so clear. Proposals can end up with some more complex version of input processing correspondences, in spite of the basic intuitions.

Tim Smithers. Malcolm & Smithers (1990) actively explore novel and hybrid architectures, with a dynamic systems framework as an underlying inspiration. Notions of representation are based on interactive robotic submodules, but interpretation of representations is still primarily from the perspective of an observer -- a designer or user. Smithers (1992) rejects folk psychology models of intelligence as being not useful in the design of intelligent systems -- notions of belief and desire have demonstrated themselves to be unhelpful. He advocates a dynamical systems approach instead. Smithers (1994) refines such a dynamical systems approach, proposing that agent-environment systems be approached within the conceptual framework of nonlinear dissipative dynamical systems theory. Smithers points out the necessity for considering situated, embodied agents -- and agents for which history counts -- and outlines a systems approach for doing just that.

Nehmzow & Smithers (1991, 1992) present robots that self-organize maps based on their sensory-motor experience. Under certain conditions, these maps will begin to capture the basic physical layout of the environment in terms of its interactive potentialities. But to do so requires progressively finer differentiations with respect to the robot's interactions so that those differentiations succeed in discriminating differing physical locations. With inadequately fine differentiations, for example, two corners that "look alike" in immediate sensory-motor interactions will not be discriminated. This process of differentiating the environment in terms of interaction paths and histories, and the future "histories" that those past histories indicate as being possible, is the fundamental form of interactive representation. (See also Kuipers, 1988; Kuipers & Byun, 1991; Mataric, 1991.)
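
The role of finer differentiation can be illustrated with a deliberately small sketch. It does not reproduce Nehmzow and Smithers' self-organizing maps; it assumes a hypothetical robot that follows the walls of a closed room and registers only whether each corner it meets is convex or concave. Corners that look alike in the immediate interaction become distinguishable once the signature includes the history of interactions that led up to them.

```python
from typing import Optional, Tuple

# Hypothetical room: the corner types met, in order, on one lap along the walls.
CORNERS = ["convex", "concave", "convex", "convex"]
N = len(CORNERS)


def history(i: int, k: int) -> Tuple[str, ...]:
    """The last k corner types experienced on arriving at corner i."""
    return tuple(CORNERS[(i - k + 1 + j) % N] for j in range(k))


def distinguishable(k: int) -> bool:
    """True when histories of length k discriminate every physical corner."""
    return len({history(i, k) for i in range(N)}) == N


def minimal_history_length() -> Optional[int]:
    """Shortest history that differentiates all corners, if one exists."""
    for k in range(1, N + 1):
        if distinguishable(k):
            return k
    return None


for k in range(1, N + 1):
    print(k, distinguishable(k))
print("minimal history length:", minimal_history_length())
```

With immediate signatures alone (k = 1) three of the four corners collapse together; with this particular layout, histories of length three are needed before every corner is discriminated. A layout with rotational symmetry would never be fully differentiated by this means, which is why the function may return `None`.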

Emergent Action Selection. One of the important themes of this orientation is the emergence of behavior from multiple loci of control (Steels, 1991, 1994; Maes, 1993, 1994). There need not be any central controller or planner for adaptive behavior to occur (Beer, 1990; Brooks, 1991a; Cherian & Troxell, 1994a, 1994b, in press; Cliff, 1992; Pfeifer & Verschure, 1992). Multiple interacting and competing sources of behavioral control can generate complex adaptive behavior emergent from those interactions and competitions.

Maes (1991, 1992) describes an architecture that can emergently select actions and action sequences appropriate to various goals, and that can learn such selections. This moves into the critical realms of motivation and learning. As discussed in the section on Benny Shanon, motivation as action selection is an intrinsic aspect of the interactive model -- issues of motivational selections and issues of representation cannot be separated except as aspects of one underlying process organization.

Wade Troxell and Sunil Cherian. Again, in such multiple-module action selection, representation, at least in its classical symbol manipulation sense, is de-emphasized or rejected. "No explicit, isomorphic mapping is performed from objects and events in some real external world to internal system states. The view that `intelligence is the process of representation manipulation' is rejected in favor of strategies that place more emphasis on the dynamics of task-directed agent-environment interactions." (Cherian & Troxell, 1994a).

If representation is an emergent of interactive system organization, however, as the interactive model argues, then robotics, and dynamic systems investigations more broadly, are the natural fields in which issues of representation ultimately cannot be avoided, and in which genuine representations will be naturally emergent. Representational design issues are robotics issues, not computational issues more narrowly (Bickhard, 1982). Questions of representation can be raised, and have been addressed voluminously, within a computational or a connectionist framework, but they cannot be answered without moving to a full interactive system framework.

Cherian & Troxell (in press) propose just such an interactive framework for exploring issues of knowledge and representation in autonomous systems. They reject a structural encoding model of knowledge in favor of a notion of knowledge as the ability to engage in successful interaction with an environment. In this interactive sense, knowledge is constituted in the organization of the system's control structure, not in static encoding structures. Knowledge as constituted in interactive control structures does not require an interpreter of encodings; therefore, it does not encounter the fatal aporia of encodingism. They offer a formal description for interactive control structures, and an example application.

Modeling knowledge and representation as properties of interactive competence is inherently natural for robotics, once the mesmerizing seductions of encodingism are overcome. We argue that it is the only viable approach not only for robotics per se, but for Artificial Intelligence and Cognitive Science more broadly -- and for psychology and philosophy. Embodied autonomous interactive dynamic systems are not just an application aspiration of Artificial Intelligence, they are the essential locus of emergence, understanding, and design, of representation, knowledge, and intelligence -- of mind. They are the locus of resolution of the fundamental programmatic aspirations of Artificial Intelligence and Cognitive Science.

Representation versus Non-representation. The debate about the role of representation in dynamic systems approaches has construed representation in the classical encodingist sense. The issues debated are whether dynamic systems do, or need to, involve internal states that track external objects, events, properties, and so on, such that that tracking is an important aspect of the explanation of the functioning of the system (Cherian & Troxell, in press, and Hooker, in press, are rare exceptions). Some argue that the dynamic couplings between system and environment are such, or can be such, that representational construals are not useful. Others argue that they are useful (e.g., Horgan & Tienson, 1992, 1993, 1994). Clark and Toribio (in press) point out that representational construals are to be expected to be important primarily in "representation hungry" problem domains, by which is meant:

1) The problem involves reasoning about absent, non-existent, or counterfactual states of affairs.

2) The problem requires the agent to be selectively sensitive to parameters whose ambient physical manifestations are complex and unruly (for example, open-endedly disjunctive). (p. 16, manuscript)

These conditions virtually require that something in the system track the relevant states of affairs or parameters. We agree that such "representation hungry" problem domains exist, and that they will generally require environmental tracking of the sort that Clark and Toribio argue for.

Such tracking, however, can be accomplished in ways that do not look much like standard notions of elements-in-representational-correspondences. The tracking of correspondences can be accomplished not only by states and limit cycles, but also by hyperplanes or even more general manifolds in the dynamic phase space. The system entering one of several alternative such manifolds could constitute the tracking or "recording" of some condition -- a correspondence between dynamic manifold and condition rather than state and condition. It is not clear that such "tracking by manifold" satisfies standard intuitions about representation.[7]

More deeply, however, we disagree that any such tracking will constitute representation. Interestingly, this disagreement extends to both sides of the issue, because those arguing against the usefulness of representation in dynamic systems approaches use much the same notion of representation.

At times, additional restrictions are assumed by those arguing against representation, such as that the representations be some version of standard explicit syntactic symbols. With such a notion of representation, it is correct that dynamic systems approaches need not necessarily involve symbolic representation, at least in most cases. Caveats might be necessary for the arguments, for example, concerning systematicity in thought (Fodor & Pylyshyn, 1988; see, however, Clark, 1993; Hooker, in preparation; Niklasson & van Gelder, 1994; Port & van Gelder, in press; and the discussions of connectionism below). This stance, exemplified by many of the researchers in dynamic systems approaches, that symbol manipulation approaches are not needed or are even detrimental, constitutes a clear rejection of the classical symbol manipulation framework -- a rejection with which we are strongly in agreement. The tracking notions of representation, however, are weaker models of representation, and, therefore, stronger models upon which to base a claim of representation in dynamic systems (cf. Touretzky & Pomerleau, 1994; Vera & Simon, 1994).

Representations as Correspondences? The reasons for our disagreement with this assumption (common to both sides of the representational versus non-representational argument), that representations are encoding correspondences, are by now obvious. We argue that such tracking states do not constitute representations, so, even if they are required in some conditions, that still does not amount to a requirement for representation. Clark and Toribio (in press) and Clark (1994) argue for such tracking notions of representation as constituting the best model of representation available. They acknowledge that this provides at best an "external" notion of representation, dependent on the interpretations and explanations of observers and analyzers, but claim that "internal" notions of representation fall to homunculus problems -- they require internal homunculi to interpret the internal representations. We agree with this criticism in general, but not with the assumption that there is no alternative notion of internal representation that is not subject to this criticism.

Observer Representations. "External" notions of representation become rather unsatisfactory -- the intrinsic circularity or infinite regress becomes obvious -- when considering the representations of those external observers themselves. "External" encodingism avoids internal interpretive homunculi only at the cost of external interpretive homunculi, and the promissory note issued by those homunculi cannot be cashed externally any more than internally. The fact that there do exist some external interpretive homunculi -- people, for example -- unlike the non-existence of internal interpretive homunculi, does nothing toward accounting for those external interpreters. Accounting for human intentionality and intelligence was the problem to be addressed in the first place, and it cannot be addressed simply by reference to other -- "external" -- intentional and intelligent agents.

Conditions that Require Representations. On the other hand, we would argue that there are problem conditions and system conditions that do require representations -- even in dynamical systems, even in the full interactive sense of representation. These conditions follow directly from the interactive model itself. If a system faces more than one possible next action or course of interaction, and must select among them on the basis of indicated internal outcomes of those interactions, then those indications constitute interactive representation -- capable of being false and falsifiable internal to the system itself.

Not all interaction selections will necessarily be based on indications of the internal outcomes of the to-be-selected interactions. If the environment is sufficiently anticipatible, and the system requirements sufficiently stable, then interaction selections might be simple evocations -- e.g., triggerings -- by internal system state: In certain internal system conditions this interaction is selected, while in other conditions that interaction is selected, with no need for indications of subsequent internal outcomes. If the selections of interactions can be strictly feed-forward and informationally ballistic (Bickhard, 1980b) in that sense, then interactive representation need not be present.

This is so even if those selections are based on internal states that factually track external conditions. Such tracking is a functional and factual state of affairs, one that may even be necessary to the survival of the system, but note -- once again -- that nothing about the existence or success or failure of that tracking per se is available to the system itself. If there is no possibility of system representational error, then there is no system representation.

Interactive representation is required, then, when the processing in the system must be potentially controllable, at least in part, by system error in achieving its indicated internal outcomes. This will occur when the conditions and interactions that will yield success or failure are uncertain, and therefore the possibility of such error must be functionally taken into account -- for example, in goal-directed systems.
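The contrast between ballistic selection and selection guided by indicated outcomes can be sketched in a few lines. The sketch assumes, for illustration only, that interactions and internal outcome states are plain strings and that the environment is a function from an interaction to the internal state in which that interaction actually ends. The class `IndicatingSystem`, the function `ballistic`, and all state names are hypothetical.

```python
def ballistic(state, triggers):
    """Feed-forward selection: internal state simply evokes an interaction.
    Nothing is anticipated, so nothing can turn out to be in error for the system."""
    return triggers[state]

class IndicatingSystem:
    """Selects interactions together with indications of their internal outcomes."""

    def __init__(self, indications):
        # interaction name -> internal outcome state that the system anticipates
        self.indications = dict(indications)
        self.errors = []  # record of failed anticipations, available to the system

    def engage(self, interaction, environment):
        indicated = self.indications[interaction]
        actual = environment(interaction)  # internal state actually reached
        if actual != indicated:
            # System-detectable error: the indicated outcome did not obtain.
            self.errors.append(interaction)
            return False
        return True

# Hypothetical environment in which pushing fails but pulling succeeds.
environment = {"push": "resisted", "pull": "moved"}.get
system = IndicatingSystem({"push": "moved", "pull": "moved"})
print(system.engage("push", environment), system.errors)
```

Only the second kind of system has anything that can be false from its own perspective: the failed indication is registered internally and can then be used to control further processing.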

Such uncertainty of outcome, in turn, could result from a randomness of environmental response, or a complexity of relevant environmental conditions that is too great to be detectable in reasonable time (and other resource costs). Another source of such uncertainty would be a system that is sufficiently complex that many of its interactions are novel for the system. For such a system, even if the environment were stable and simple enough that ballistic actions with no error feedback would be possible in principle, the system would nevertheless not be able to anticipate its interaction outcomes because it would not have had sufficient prior experience with its novel interactions, and would, therefore, have to be able to take error into account. In other words, the uncertainty that would require outcome indications -- interactive representation -- can be a property of the environment, or a property of the system's interactive knowledge about that environment.

System generated error is required when system implicit anticipations of the courses and outcomes of interactions cannot be assured. Of course, system generated error, once available, might be useful and used for many conditions in which it is not strictly necessary, but is less costly than alternatives, as well as in conditions in which it is required.

One critically important version of system error guided processes, of course, is that of goal-directed interactions (Bickhard, 1980b). Another is that of general learning processes. Learning cannot be fully successfully anticipatory -- if it were, there would be nothing to be learned. Learning must involve the possibility of error, and such error must be functionally detectable by the system itself so that the learning can be guided by it (Bickhard, 1980a, 1992a, 1993a, in preparation-c; Campbell & Bickhard, 1986; Drescher, 1991 -- see the discussion of learning below).

So, in problem domains that involve sufficient uncertainty of environmental response to -- and, therefore, of system outcome of -- system-environment interactions, system detectable error can be necessary. System detectable error of anticipated (indicated) system internal outcomes is interactive representation. Dynamic system approaches to system-environment couplings that involve uncertainty of the course of the interactions, then, can require interactive representation in the system in order to be competent to functionally respond to the inevitable error.

The Emergence of Function. This discussion, of course, relies on a notion of something counting as error for a system. Internal indications of the internal outcomes of interactions provide a system detectable condition of whether or not those indicated conditions obtain, but what is to count as an emergence of success and failure here? Why would achievement of indicated conditions, for example, count as success rather than failure? Or, under what conditions would such achievement count as success and under what conditions as failure?

The notion of failure here is that of functional failure. The representational emergence involved is out of a functional level of analysis. But functional failure too must be naturalized -- it too must be naturally emergent, with no dependence on the external interpretations of observers, users, and so on. There is a rich literature of approaches and problems concerning the naturalization of such notions of function and dysfunction (e.g., Bechtel, 1986; Bigelow & Pargetter, 1987; Boorse, 1976; Cummins, 1975; Dretske, 1988; Millikan, 1984, 1993; Neander, 1991; Wimsatt, 1972; Wright, 1973).

We propose to derive a model of emergent function in a framework of open dynamic systems. Open systems require interaction with their environments in order to continue to exist; they require continuing interchange of matter and energy. The activities of complex open systems, especially where those activities involve selections among alternatives, can contribute toward or harm the continued existence of the system by contributing toward or harming the environmental conditions or system-environment relationships upon which that existence depends.

A too simple example is a flame, which contributes to the maintenance of the threshold temperature necessary for the flame's continued existence -- and thereby also to the maintenance of the supply of oxygen via convection of air. This example is too simple in that there are no selections on the part of the flame among alternative manners of contributing toward its self-maintenance.

Even the simplest living systems, however, manifest internal homeostasis maintained by selections among alternative processes in those systems. A bacterium's selection of tumbling or swimming, in "getting worse" and "getting better" environments respectively, is one such example (D. Campbell, 1974, 1990). Here, swimming will halt and tumbling will ensue if, for example, the bacterium is swimming down a sugar gradient, thus making things worse, but swimming will continue if swimming up a sugar gradient, thus making things better. The selections of tumbling or swimming constitute environmentally sensitive selections among alternative actions -- selections for those actions that contribute to the survival of the system in those respective environments.
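The bacterium's selection rule can be sketched as a small simulation. The sketch makes several simplifying assumptions that are not in the text: the world is one-dimensional, the sugar concentration has a single peak at a hypothetical position 10, and tumbling is replaced by a deterministic reversal of direction (real tumbling is a random reorientation). The functions `sugar`, `reverse`, and `run_and_tumble` are illustrative names only.

```python
def sugar(x, peak=10.0):
    """Hypothetical sugar concentration, highest at `peak`."""
    return -abs(x - peak)

def reverse(direction):
    """Deterministic stand-in for tumbling in one dimension."""
    return -direction

def run_and_tumble(x, direction, steps, tumble=reverse):
    """Keep swimming while things get better; tumble when they get worse."""
    previous = sugar(x)
    path = [x]
    for _ in range(steps):
        x += direction                 # swim one step
        current = sugar(x)
        if current < previous:         # "getting worse": halt swimming, tumble
            direction = tumble(direction)
        previous = current
        path.append(x)
    return path

# Start at 0 heading down the gradient; the peak is at 10.
path = run_and_tumble(0, -1, 30)
print(path)
```

Under these assumptions the simulated bacterium takes one step in the wrong direction, turns, climbs the gradient, and then hovers within one step of the peak. The selection between the two actions depends only on the comparison of successive concentrations, not on any encoding of where the sugar is.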

The contributions of such selections toward system survival constitute functions of those selections in a sense of function that does not require any external observer to notice them. The natural reality of such functions is manifested in the continued existence, or failure of existence, of the system itself -- and of the natural, causal, consequences of that existence or dissolution. Such contributions of selections toward system existence ground a naturalistic emergent notion of function, upon which a full framework of functional analysis can be developed -- in particular, a framework supporting the modeling of emergent interactive representation. A more detailed elaboration of this analysis of the emergence of function is presented elsewhere (Bickhard, 1993a).

The interactive model, then, connects deeply with dynamic systems approaches: open interactive dynamic systems are required for the naturalistic emergence of the critical phenomena of function and representation. The interactive model of the nature of such representation, in turn, requires representation in conditions of uncertainty in system-environment interactions, thus taking a stance in the representation versus non-representation debate concerning dynamic system approaches. The interactive position that representation is required in dynamic systems approaches, however, involves a rejection of the notions of representation that are held in common to both sides of this argument. Strictly, then, we end up with a rejection of the terms of the dispute and a claim to transcend that dispute.

The emphasis, however, should remain on the sense in which dynamic systems approaches are not only very powerful modeling tools, but are necessary for understanding function and representation in naturalistic terms. It is also worth re-emphasizing that the interactive dynamic systems approach is not only necessary for understanding representation, but that it has strong claims even in the heartland of the symbol manipulation approach -- language (Bickhard, 1980b, 1987, 1992a, 1992c, in press-a; Bickhard & Campbell, 1992; Campbell & Bickhard, 1992a). Interactive dynamic systems approaches are the frontier for future exploration of cognition.

---------------------------------------------------

Connectionism

Connectionism (Waltz & Feldman, 1988b) and Parallel Distributed Processing (Rumelhart & McClelland, 1986; McClelland & Rumelhart, 1986) are variants of an approach to cognitive phenomena that has, with good reason, stirred much excitement and controversy. For our purposes, the differences between the two variants are not material. The general distributed approach has a number of distinct differences from standard compositional or symbol manipulational approaches to cognitive phenomena, some of which constitute definite advances and strengths, and some of which may be relative weaknesses (Clark, 1993; Norman, 1986). We will explore both. The central point to be made, however, is that connectionism and PDP approaches are just as committed to, and limited by, encodingism as are compositional, or symbol manipulational, approaches (though in somewhat differing ways). The "representations," the symbols, of a connectionist system are just as empty for the system as are those of any standard data structure (Christiansen & Chater, 1992; Van Gulick, 1982); a connectionist approach does not escape the necessity of a user semantics.

OVERVIEW

A connectionist system is, first of all, a network of nodes. Various pairs of nodes are connected to each other by a directional and weighted path, and some nodes are connected directly to sources of input from the environment. For a given system, the topology of the connections among the nodes and from the environment is fixed. Activations are received along the connections from the environment, which activate the directly connected nodes, which in turn activate further nodes via the weighted paths among the nodes. A given node receives activations from all upstream nodes in accordance with the activation levels of those nodes and the weights of the paths of transmission involved, and acquires a degree of activation resulting from those inputs in accordance with various built-in rules. It then sends its own activation down further connections in accordance with their weights. Often the nodes are organized in layers, with each layer sending activation levels along connections to the next layer. Most commonly, there are three layers of nodes: an input layer, a "hidden" layer, and an output layer whose ultimate activation levels constitute the intended output of the net of nodes -- a three layer feed-forward net. Loops of connection-and-node paths are possible, however. Activation thus flows through the network, possibly with feedback loops, until, in some circumstances, a stable pattern of activation of the nodes is achieved.
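The flow of activation through a three layer feed-forward net can be sketched as follows. The sketch assumes a logistic function as the built-in activation rule and represents each layer's weights as a list of rows, one row per downstream node; both choices, and the function names `logistic`, `layer`, and `feed_forward`, are illustrative assumptions rather than anything the description above fixes.

```python
import math

def logistic(x):
    """One common choice of activation rule (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(activations, weights, rule=logistic):
    """Activate one layer of nodes.

    weights[j][i] is the weight on the path from upstream node i to node j.
    """
    return [rule(sum(w * a for w, a in zip(row, activations))) for row in weights]

def feed_forward(inputs, hidden_weights, output_weights):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_weights)
    return layer(hidden, output_weights)

# Two input nodes, two hidden nodes, one output node, with arbitrary weights.
print(feed_forward([1.0, 0.0], [[2.0, -1.0], [-1.0, 2.0]], [[1.5, -1.5]]))
```

Each node's activation depends only on the activations of its upstream nodes and the weights on the paths from them, which is all that the three sorts of specifying information amount to in this simple case.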

A fixed connectionist system is, therefore, specified by three sorts of information. The first is the graph of the nodes and directed connections among them and from the environment. The second is the set of weights on those connections. The third is the rules by which the nodes determine their resultant activations given their input activations. The graph and the weights, together with the relevant rules for setting node activations from inputs, determine a space of possible patterns of activation of the nodes, and a dynamics of the possible changes in activation patterns, forming paths or trajectories through the space of possible activation patterns. In general, this space will be a space of vectors of possible node activation levels, and the dynamics will determine trajectories of possible movement through paths of activation vectors.

In important cases, those dynamics will determine a set of regions or points in that space of possible activation patterns in which the activation patterns will remain stable -- the trajectories of activation patterns that exist in or enter such a stable region do not exit. Furthermore, there will be a tendency for initial activation patterns, upon receipt of environmental inputs, to "settle" into those determinate stable points or regions. In other words, the areas of stability will each have their own "catch basin," "drainage basin," or "region of attraction" such that any patterns of activation in one of those basins will dynamically move through the space of possible such patterns into the corresponding stability. Ideally, the regions of attraction for the points of stability will correspond to desired categories of inputs; they will differentiate the space of possible inputs into those categories that yield one particular pattern of resultant stable activation versus another.
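Settling into regions of attraction can be illustrated with a small recurrent network of the Hopfield type. This is a sketch under stated assumptions: nodes take the values +1 or -1, nodes are updated one at a time, and the weights are obtained from a Hebbian outer-product construction, used here only as a convenient way to get a weight set with known stable points. The two stored patterns and the function names `make_weights` and `settle` are hypothetical.

```python
def make_weights(patterns):
    """Outer-product weights with no self-connections (an illustrative choice)."""
    n = len(patterns[0])
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns)
             for j in range(n)] for i in range(n)]

def settle(weights, activation, max_sweeps=20):
    """Update nodes one at a time until the activation pattern stops changing."""
    state = list(activation)
    for _ in range(max_sweeps):
        changed = False
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            new = state[i] if field == 0 else (1 if field > 0 else -1)
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break  # a stable pattern of activation has been reached
    return state

STABLE_A = [1, 1, 1, 1, -1, -1, -1, -1]
STABLE_B = [1, 1, -1, -1, 1, 1, -1, -1]
WEIGHTS = make_weights([STABLE_A, STABLE_B])

# Two inputs, each lying in the basin of a different stable pattern.
print(settle(WEIGHTS, [-1, 1, 1, 1, -1, -1, -1, -1]))
print(settle(WEIGHTS, [1, 1, -1, -1, 1, 1, -1, 1]))
```

The first input settles into the first stable pattern and the second into the other, so the two basins differentiate the inputs into two categories without any search among alternatives.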

The directed graph of a particular connectionist network is, in general, fixed, as are the various rules of functioning of the system. That graph and those rules together with the set of weights determine the space and dynamics of the activation patterns of the system. They determine the trajectories of the activation patterns within the overall space of possible such patterns, and they determine, as an aspect of that dynamics, which regions of activation patterns, if any, are stable, and what the regions of attraction to those stabilities, the "catch basins," will be. They determine which input patterns will stabilize, if at all, at which points of stability.

As described thus far, a connectionist system is a differentiator of input activation patterns in terms of the resultant stable activation patterns. The excitement concerning connectionism and PDP derives from the fact that the weights of the system are themselves not fixed, and that it proves possible in some cases to adjust the weights according to well defined rules and error correction experiences so that particular desired differentiations of input patterns are obtained. Such adjustment of the weights is then interpreted as learning -- learning of the desired differentiations, categorizations, of the input patterns.
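Weight adjustment by error correction can be sketched with the classical perceptron rule on a single threshold node. The rule is used here as a simple stand-in for the family of "learning" rules; it is not backpropagation, and the text does not single it out. The training set, the learning rate, and the function names `classify` and `train` are assumptions of the sketch.

```python
def classify(weights, bias, pattern):
    """Threshold node: output 1 if the weighted input exceeds zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, pattern))
    return 1 if total > 0 else 0

def train(examples, epochs=10, rate=1.0):
    """Adjust weights only when the node's output differs from the desired one."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pattern, target in examples:
            error = target - classify(weights, bias, pattern)  # -1, 0, or +1
            weights = [w + rate * error * x for w, x in zip(weights, pattern)]
            bias += rate * error
    return weights, bias

# Desired differentiation (designer specified): inputs with any active node -> 1.
EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(EXAMPLES)
print([classify(weights, bias, p) for p, _ in EXAMPLES])
```

Note that the "correct" outputs are supplied in the training examples: the differentiation that the weights come to support is the one the designer specified.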

The space of the (vectors of) possible weights of a connectionist system is a second-order space of the dynamics of the system. It is the space in which the dynamics of the learning processes occur. Each point in this weight space determines its own corresponding dynamics (and stabilities, and differentiations) of the activation space -- that is, each point in the weight space determines an entire dynamics of the activation space. In effect, each point in the second order dynamic space of weights determines, and has associated with it, an entire first order dynamic space of activations.[8] The dynamics of this weight space, the second order space, in turn, are determined by the directed graph of the system together with the "learning" rules and experiences. Movement of the system in this second order "learning" space of weights, thus, constitutes movement in a space of possible activation dynamic spaces, each with its own particular dynamic attractors and attracting regions, and, therefore, with its own particular differentiating properties. Much attention is given to designing system graphs, learning rules, and tutoring experiences that can yield activation dynamic spaces with interesting "categorizations" of inputs. It is this possibility of learning, of the dynamic acquisition of designer specified categorizations of inputs, that has sparked so much excitement within and about the field.
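The sense in which each point in weight space determines an entire activation dynamics can be shown with the smallest possible case. The sketch assumes a single node with one self-connection and a hyperbolic tangent activation rule, so that "weight space" is one-dimensional; these choices, and the names `settle` and `stable_points`, are illustrative assumptions.

```python
import math

def settle(weight, activation, steps=200):
    """Iterate the activation rule a <- tanh(weight * a) for a fixed weight."""
    for _ in range(steps):
        activation = math.tanh(weight * activation)
    return activation

def stable_points(weight, starts=(-1.0, -0.5, 0.5, 1.0)):
    """Stable activations reached from several starting activations."""
    return {round(settle(weight, a), 2) for a in starts}

print(stable_points(0.5))  # one attractor: every start settles to zero
print(stable_points(2.0))  # two attractors, one per sign of the start
```

With the weight at 0.5 the activation space has a single attractor and differentiates nothing; with the weight at 2.0 it has two attractors whose basins separate positive from negative starting activations. Moving the weight thus moves the system from one activation dynamics to a different one, which is what a "learning" trajectory in the second order space amounts to.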

STRENGTHS

The most important advantage of PDP "symbols" is that they are "emergent." They emerge as stabilities of attraction regions in the spaces of activation patterns and dynamics, and "correct" such stabilities and dynamics are generated by appropriate points in the space of possible connection weights. Such appropriate points in the weight space are sought by the "learning" rules in response to the instructional experiences. This possibility of emergence is explicitly not present in the typical "symbols by syntactic definition only" approach, in which the best that can be attained are new combinations of already present "symbols."

A second advantage that is claimed for the PDP approach is that the computations of the new activations of the nodes are logically parallel -- each new node value can, in principle, be computed at the same time as each of the other node values. This provides the power of parallel computation in that potentially large networks could engage in massive computation in relatively short elapsed time. Massive parallelism and relatively short computational sequences are taken to capture similar properties and powers of processes in the brain (Churchland, 1989).

A related advantage of the connectionist approach is the sense in which the differentiating stable patterns of activation are intrinsically distributed over the entire set of (output) nodes. This distributivity of "representation" originates in the same aspect of connectionist net architecture as does the parallelism of computation mentioned above -- each of the multiple activation nodes can in principle also be a node of parallel computation -- but the distributivity yields its own distinct advantages. These are argued to include: 1) as with the parallelism, the distributed nature of the "representations" is at least reminiscent of the apparently distributed nature of the brain, and 2) such distributivity is held to provide "graceful degradation" in which damage to the network, or to its connection weights, yields only a gradual degradation of the differentiating abilities of the network, instead of the catastrophic failure that would be expected from the typical computer program with damaged code. This is reminiscent of, and argued to be for reasons similar to, the gradual degradation of hologram images when the holograms are physically damaged.

The phase spaces of the PDP approach introduce a general approach to doing science that, even when abstracted away from the particulars of PDP, is extremely important, and all too rare in studies of mental phenomena. The spaces are spaces of the potentialities of the overall conditions of the network -- the space of possible activation vectors, in this case -- and, thereby, contain the possible dynamics of the system. Understanding the system, then, is constituted as understanding the relevant phase space and the dynamically possible trajectories of the system's functioning within that space. This is the almost universal approach in physics, but explicit consideration of spaces (or other organizations) of potentialities within which dynamics can be modeled is rare in psychology, AI, and related disciplines (Bickhard & D. Campbell, in preparation; van Gelder & Port, in press).

This phase space approach highlights several of the powerful characteristics of PDP models. The space of activation patterns of a PDP network is a space of the intrinsic dynamics of the system, not a space of (encoded) information that the system in some way makes use of. It is like an automaton in which the states form a smooth surface (differentiable manifold), the state transitions are continuous on that manifold, and the state transitions intrinsically move "downward" into local attraction basins in the overall manifold. The space, then, does not have to be searched in any of the usual senses -- the system dynamics intrinsically move toward associated differentiating regions of stability.

Viewing connectionist systems in terms of modeling their (weight space adjustable) intrinsic dynamics, instead of in terms of the classical programmed informational manipulations and usages, is an additional perspective on both their distributed and their parallel nature. Because the activation space is the space of the possibilities and possible dynamics of the entire system, and because nothing restricts those dynamics to any simply isolable subspaces (such as would be equivalent to changing just one symbol in a data structure), then any properties of that space, such as input-differentiating dynamically-attracting regions of stability, will necessarily be properties distributed over the whole system and dynamically parallel with respect to all "parts" of the system.

Such an overall system perspective could be taken on standard symbol manipulation systems, but it would not be necessary to do so in order to understand the relevant properties, and, in fact, it would lose the supposed information separated out into the "pieces," the "particles," of the system that are taken to be "symbols" -- the "symbols" would all be absorbed into the overall state of the system, and the distinction between program and representation would be lost. (Furthermore, the representations could not be recovered. Their internal functional properties could be recovered, say, in an equivalent register machine, but the "aboutness," the representational properties, would have to be re-defined by the user or designer.) The most interesting properties of a PDP system, on the other hand, cannot be analyzed at any more particulate level.

The connectionist approach in terms of the phase space of the entire system also gives rise to several other possibilities. In particular, the sense in which the system dynamics intrinsically arrive at the relevant activation stabilities, the sense in which the system dynamics constitute the movement into the differentiating stabilities -- instead of the system dynamics being distinct from those differentiators and, therefore, the dynamics having to engage in a search for or construction of them -- provides for the possibility of several different interpretations of that dynamic "movement toward dynamic stability."

If the input pattern is construed as a subpattern of its corresponding overall stable pattern, then that system can be interpreted as being engaged in pattern completion. If the input pattern is interpreted as being a component pattern of the attracting stability, then the system can be interpreted as engaging in content addressing of the stable pattern. If the input pattern is interpreted as being one of a pair of joint patterns instantiated in the stable pattern, then the system can be interpreted as engaging in pattern association. Note that if the input pattern and the stable pattern, whether interpreted as subpattern-full pattern, component pattern-subsuming pattern, or paired pattern-joint pattern, were separate components of the system -- e.g., separate "symbols" or "symbol structures" -- then any such completions, content addressing, or associating would have to be accomplished via standard searches, manipulations, and look ups.

The power of the PDP approach here is precisely that the input patterns and the stable patterns are both simply points in the space of the system dynamics, and that the system dynamics intrinsically move into the attracting regions -- the system does not have to explore, reject, manipulate, draw inferences from, combine, or in any other way consider any alternative paths or dynamics, it simply "flows downhill," "relaxes," into the low point of the catch basin that it is already in. That PDP systems can apparently do so much with such a natural and simple principle of dynamics is still another of its appeals.

A further power of PDP differentiators relative to symbol manipulation encodings is that, in standard cases, since the dynamics of the system are bound to ultimately enter some attractor or another, the system will ultimately differentiate all possible input patterns. These differentiations have an intrinsically open aspect to them in that, although the system may be trained to "correctly" differentiate certain paradigmatic training input patterns, its response to novel patterns is not necessarily easily predicted from its classifications of trained patterns. Its differentiations will depend on particulars of its organization that have neither been specifically designed nor specifically trained. This yields an intrinsic power of generalization to the differentiations of a PDP network. Such generalizations are not intrinsic to standard symbols, and are difficult to program. Specific such generalizations may, of course, count against specific PDP systems on the ground that the observed system generalizations are not those of the learner which is intended to be modeled, e.g., Pinker's critique of a PDP past tense "learner" (Pinker, 1988). The general power of the potentiality for such generalizations, however, is a distinct advance in this respect over predefined encodings. The openness of such differentiations provides, in fact, some of the power of a selection function or of indexical representation, and, therefore, some of the power of variables and quantifiers with respect to issues of generalization (Agre, 1988; Bickhard & Campbell, 1992; Slater, 1988).

WEAKNESSES

The PDP approach, however, is not without its own weaknesses. Most fundamentally, the primary advantages of PDP systems are simultaneously the source of their primary weaknesses. On one hand, the emergent nature of connectionist differentiations transcends the combinatoric restrictions of standard symbol manipulation approaches. Any model that is restricted to combinations of any sort of atom, whether they be presumed representational atoms or any other kind, intrinsically cannot model the emergence of those atoms: combinations have to make use of already available atoms (Bickhard, 1991b). The classical approach, then, cannot capture the emergence of input pattern differentiators, while PDP approaches can.

On the other hand, while the combinatoricism of the standard approach is a fatal limitation for many purposes requiring fundamentally new, emergent, representations -- e.g., learning, creativity, etc. -- it has its own strengths, some of which the PDP approach arguably cannot capture (Pinker & Mehler, 1988; Fodor & Pylyshyn, 1988). In particular, the combinatoric atomism of the standard symbol manipulation approach allows precisely what its usual name implies: the manipulation of separable "symbols." This creates the possibility of (combinatoric) generativity, componentiality, high context and condition specificity of further system actions and constructions, a differentiation of representational functions from general system activity, a "lifting" of representational issues out of the basic flow of system activity into specialized subsystems, and so on. All of these can be of vital importance, if not necessary, to the modeling of various sorts of cognitive activity, and all of them are beyond the capabilities of connectionist approaches as understood today, or at least can be approximated only with inefficient and inflexible kludges. A restriction to combinatorics dooms a model to be unable to address the problem of the emergence of representations, but an inability to do combinatorics dooms a system to minimal representational processing.

There is, however, a conceivable "third way" for connectionism: the processing of connectionist "representations" that is in some way sensitive to "representational" constituents without there being any syntactic tokens for those constituents. This would be akin to the processing of logical inferences directly on the Gödel numbers of predicate calculus sentences -- without unpacking them into predicate calculus strings -- or some more general methodology of distributed representation (van Gelder, in press-b). Whether such a strategy is sufficiently powerful, however, remains to be seen -- but initial indications are quite interesting (Bechtel, 1993; Clark, 1993; Niklasson & van Gelder, 1994; Pollack, 1990; van Gelder, 1990; van Gelder & Port, 1994; see also Goschke & Koppelberg, 1991).[9] The space of possible "other ways" for connectionism to handle such problems has not been exhaustively explored (Clark, 1989, 1993; Bechtel, 1993).
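The Gödel-number analogy can be made concrete with a toy sketch. It assumes a prime-power encoding of three-symbol sentences of the form "X -> Y"; the symbol codes, the function names, and the restriction to single-symbol antecedents and consequents are all hypothetical simplifications. The point is only that a modus ponens step can be carried out by arithmetic on the numbers, without ever building a symbol string.

```python
# Illustrative sketch only: constituent-sensitive processing on Goedel numbers.
# A sequence of symbols is encoded as 2**c1 * 3**c2 * 5**c3 * ...

PRIMES = [2, 3, 5, 7, 11, 13]
CODES = {"P": 1, "Q": 2, "->": 3}  # hypothetical symbol codes

def godel_number(symbols):
    """Encode a short symbol sequence as a single integer."""
    n = 1
    for prime, symbol in zip(PRIMES, symbols):
        n *= prime ** CODES[symbol]
    return n

def has_symbol_at(n, position, symbol):
    """Divisibility test: the exponent of the position's prime equals the code."""
    prime = PRIMES[position]
    code = CODES[symbol]
    return n % (prime ** code) == 0 and n % (prime ** (code + 1)) != 0

def modus_ponens(implication, premise):
    """From the numbers for 'X -> Y' and 'X', compute the number for 'Y'.

    Returns None when the inference does not apply. Assumes X and Y are
    single symbols, so 'X' is encoded as a power of 2.
    """
    if not has_symbol_at(implication, 1, "->"):
        return None
    if premise not in {2 ** code for code in CODES.values()}:
        return None
    # The antecedent matches only if the powers of 2 agree exactly.
    if implication % premise != 0 or (implication // premise) % 2 == 0:
        return None
    consequent_part = implication // (premise * 3 ** CODES["->"])  # a power of 5
    exponent = 0
    while consequent_part % 5 == 0:
        consequent_part //= 5
        exponent += 1
    return 2 ** exponent

p_implies_q = godel_number(["P", "->", "Q"])
conclusion = modus_ponens(p_implies_q, godel_number(["P"]))
```

Here the constituents are never present as separate tokens; they are recoverable only through arithmetic relations on one number. Whether connectionist distributed representations can support anything comparable is exactly the open question raised above.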

In general, the freedom of combination of symbolic atoms that provides such apparent computational advantages to classical approaches does not logically require that the relevant "representations" be constructed out of such atoms: it only requires that the space of possible constructions have in some manner the relevantly independent dimensions of possible construction, and of possible influence on further processing. Invoking constructive atoms is one way to obtain this independence of constructive dimensions -- separate dimensions for each atom, or each possible location for an atom -- but the critical factor is the independence, not how it is implemented (Bickhard, 1992c). Again, however, it is not yet clear whether connectionist systems per se can make good on this.

Absent any such "third way," however, connectionist systems do have a distinct disadvantage with respect to the systemic constructions and manipulations of their "representations." Put simply, symbol manipulation approaches have no way to get new "representations" (atoms), while connectionist approaches have no way of doing much with the "representations" that they can create. Of course, in neither case is there any possibility of real representations, for the systems themselves.

There are other weaknesses that are of less importance to our purposes, even though they could constitute severe restrictions in their own right. Perhaps the most important one is that the learning rules and corresponding necessary tutoring experiences that have been explored so far tend to be highly artificial and inefficient (e.g., back-propagation, Rumelhart & McClelland, 1986; McClelland & Rumelhart, 1986; Rich & Knight, 1991). It is not clear that they will suffice for many practical problems, and it is clear that they are not similar to the way the brain functions. It is also clear that they cannot be applied in a naturalistic learning environment -- one without a deliberate tutor (or built-in equivalent; see the discussion of learning in passive systems above). (There are "learning" algorithms that do not involve tutors -- instead the system simply "relaxes" into one of several possible appropriate weight vectors -- but these require that all the necessary organization and specification of the appropriate possible weight vectors be built in from the beginning; this is even further from a general learning procedure.)

PDP research thus far has focused largely on idealized, simplified problems. Even in such a simplified domain, however, the learning processes have at times demonstrated impressive inefficiencies, requiring very large numbers of training experiences. This limitation of experience with PDP networks to simplified problems, plus the clear inefficiencies even at this level, combined with the general experience of standard symbol manipulation approaches that scaling up to more realistic problems is usually impossible, at least at a practical level, has yielded the criticism that PDP and connectionism too will find that, even when models work for toy problems, they do not scale to real problems (Rumelhart, 1989; Horgan & Tienson, 1988). This is a criticism in potentia, or in expectation, since the relevant experience with PDP systems is not yet at hand.

A limitation that will not usually be relevant for practical considerations, but is deeply relevant for ultimate programmatic aspirations, is that the network topology of a PDP system is fixed. From a practical design perspective, this is simply what would be expected. From a scientific perspective, however, concerning the purported evolutionary, the embryological, and, most mysteriously, the developmental and learning origins of such differentiators, this fixedness of the network topology is at best a severe incompleteness. There is no homunculus that could serve as a network designer in any of these constructive domains (see, however, Quartz, 1993).

ENCODINGISM

The deepest problem with PDP and connectionist approaches is simply that, in spite of their deep and powerful differences from symbol manipulation approaches, neither one can ever create genuine representations. They are both intrinsically committed to encodingism.

The encodingist commitments of standard approaches have been analyzed extensively in earlier chapters. The encodingist commitments of PDP approaches follow readily from their characterization above: PDP systems can generate novel systems of input pattern differentiators, but to take these differentiating activation patterns as representations of the differentiated input patterns is to take them as encodings.

Note that these encodings do not look like the "symbols" of the standard "symbol" manipulation approach, but they are encodings nevertheless in the fundamental sense that they are taken to be representations by virtue of their "known" correspondences with what is taken to be represented. Standard "symbols" are encodings, but not all encodings are standard "symbols" (Touretzky & Pomerleau, 1994; Vera & Simon, 1994). Both model representation as atemporal correspondences -- however much they might change over time and be manipulated over time -- with what is represented: the presumed representationality of the correspondences is not dependent on temporal properties or temporal extension (Shanon, 1993).

The PDP system has no epistemic relationship whatsoever with the categories of input patterns that its stable conditions can be seen to differentiate -- seen by the user or designer, not by the system. PDP systems do not typically interact with their differentiated environments (Cliff, 1991), and they perforce have no goals with respect to those environments. Their environmental differentiations, therefore, cannot serve any further selection functions within the system, and there would be no criteria of correctness or incorrectness even if there were some such further selections.

A major and somewhat ironic consequence of the fact that PDP systems are not interactive and do not have goals is that these deficiencies make it impossible for connectionist networks to make good on the promise of constituting real learning systems -- systems that learn from the environment, not just from an omniscient teacher with artificial access to an ad hoc weight manipulation procedure. The basic point is that, without output and goals, there is no way for the system to functionally -- internally -- recognize error, and, without error, there is no way to appropriately invoke any learning procedure. In a purely passive network, any inputs that might, from an observer perspective, be considered to be error-signals will be just more inputs in the general flow of inputs to the network -- will just be more of the pattern(s) to be "recognized." Even for back-propagation to work, there must be output to the teacher or tutor -- and for competitive "learning," which can occur without outputs, all the relevant information is predesigned into the competitive relationships within the network.

In other words, connectionist networks are caught in exactly the same skepticism-solipsism impossibility of learning that confounds any other encodingist system. No strictly passive system can generate internally functional error, and, therefore, no strictly passive system can learn. Furthermore, even an interactive system with goals, that therefore might be able to learn something, will not be legitimately understood to have genuine "first person" representations so long as representation is construed in epistemically passive terms -- as merely the product of input processing -- such that the interactions become based on the supposed already generated input encodings rather than the interactions being epistemically essential to the constitution of the representations. It is no accident that all "learning" that has been adduced requires designer-provided foreknowledge of what constitutes error with regard to the processing of the inputs, and generally also requires designer variation and selection constructions and designer-determined errors (or else already available designer foreknowledge of relevant design criteria) within the space of possible network topological designs to find one that "works."

This point connects with the interactive identification of representational content with indicated potential interactions and their internal outcomes -- connectionism simply provides a particular instance of the general issues regarding learning that were discussed earlier. It is only with respect to such strictly internal functional "expectations," such contents, that error for the system can be defined, and, therefore, only with respect to such contents that learning can occur, and, therefore, only out of such functional "expectations" that representation can emerge. Representation must be emergent out of some sort of functional relationships that are capable of being found in error by the system itself, and the only candidate for that is output-to-input potentialities, or, more generally, interactive potentialities. Representation must be constructable, whether by evolution or development or learning; construction requires error-for-the-system; and the possibility of error-for-the-system requires indications of interactive potentialities. In short, representation must be constructed out of, and emergent out of, indications of interactive potentialities.

In a passive network, however, or any passive system, any classification is just as good as any other. There is no, and can be no, error for a system with no outputs. It is only via the interactions of the deus ex machina of the back-propagation (or other "learning" system) with the omniscient teacher that connectionist nets have given any appearance of being capable of learning in the first place.
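The dependence on the teacher can be made explicit with a minimal sketch. It assumes a single threshold unit trained with the perceptron rule, a much simpler stand-in for back-propagation, and every name and training sample here is hypothetical. What matters is where the error term comes from.

```python
# Illustrative sketch only: the "error" exists only relative to a tutor's target.

def classify(weights, inputs):
    """Threshold unit: a differentiation of input patterns into 1 or 0."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= 0 else 0

def tutored_update(weights, inputs, target, rate=1):
    """Perceptron rule. The target is supplied from outside the network."""
    error = target - classify(weights, inputs)
    return [w + rate * error * x for w, x in zip(weights, inputs)]

def passive_update(weights, inputs):
    """Without a target there is nothing to compare the classification with,
    so no error-driven change can be defined; the weights are returned as is."""
    return list(weights)

# Inputs are (bias, x1, x2); the tutor's labels happen to be logical AND.
TUTOR_SAMPLES = [
    ((1, 0, 0), 0),
    ((1, 0, 1), 0),
    ((1, 1, 0), 0),
    ((1, 1, 1), 1),
]

weights = [0, 0, 0]
for _ in range(50):
    mistakes = 0
    for inputs, target in TUTOR_SAMPLES:
        if classify(weights, inputs) != target:
            mistakes += 1
        weights = tutored_update(weights, inputs, target)
    if mistakes == 0:
        break

untutored = [0, 0, 0]
for inputs, _ in TUTOR_SAMPLES:
    untutored = passive_update(untutored, inputs)
```

The tutored unit comes to agree with the tutor's labels, but the only place a mistake is registered is in the comparison with `target`, which the network does not produce. Feeding the targets in as further inputs would not change this: they would simply be more of the pattern to be classified.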

PDP systems are, in effect, models of the emergence of logical transducers: transducers of input categories into activation patterns. But the complexity and the emergent character of the "transduction" relationship in a PDP network does not alter the basic fact that the system itself does not know what has been "transduced," nor even that anything like transduction, or categorization, has occurred. All relevant representational information is in the user or designer, not in the system.

A significant step forward in this regard is found in Jordan & Rumelhart (1992). In particular, they generalize the connectionist architecture to an interactive architecture, with genuine input and output interaction with an environment, and a number of interesting capabilities for learning about that environment -- learning how to control interactions in that environment -- and their model contains goals. They have also built in the necessary topology for the system to be able to generalize from previous experience to new situations. However, the logical necessity of such interactivity is not addressed; the goal-directedness is an adjunct to the focus on learning to control interactions and interaction paths, and is therefore also not elaborated with respect to its logical necessity; the topology involved is generated by the use of continuous variables in the fixed architecture of the system, and does not allow, for example, the construction of new spaces with new topologies (Bickhard & Campbell, in preparation); and there is no consideration of the problem of representation. Given these unrecognized and unexplored potentials, we conclude, similarly to our position with respect to Brooks and his robots, that Jordan and Rumelhart have accomplished more than they realize.

The emergent character, the phase space dynamics, the distributivity and parallelism, are all genuine advances of the PDP approach, but they do not solve the basic problem of representation. They do not avoid the incoherence problem: basic "encoding" atoms in an encodingist approach, even if they are emergent in a connectionist system, still have no representational content for the system itself, and there is no way to provide such content within the bounds of the modeling assumptions. The emergent regularities of connection between differentiating activation patterns and the input categories that are differentiated are dynamic regularities of the non-representational, non-differentiator functioning of the overall system; they are not epistemic regularities nor epistemic connections. If factual regularities between inputs and resultant system conditions were all that were required for representation, then every instance of every physical law (not to mention chemical, biological, ecological, meteorological, physiological, and social laws, and so on -- and even accidental such factual regularities) would constitute an instance of representation, and representation would be ubiquitous throughout the universe. More than correspondence is required for representation, and correspondence is not only not sufficient for representation, it is not even necessary (e.g., "unicorn" or "Sherlock Holmes" or a hallucination). From the perspective of interactivism and its critique of encodingism, for all their differences and comparative strengths and weaknesses, PDP and connectionist approaches are still equally committed to an unviable encodingist perspective on the nature of representation.

CRITIQUING CONNECTIONISM AND AI LANGUAGE APPROACHES

Like connectionism, approaches to language are dominated by encodingist presuppositions. Our discussion of language models, however, did not emphasize this encodingism as much as our discussion of connectionism has, and the question arises why. Approaches to modeling language have proceeded on the basis of encodingist presuppositions, and thereby, in our view, fail from the beginning. The development of the field, however, has manifested more and more insights concerning language that are convergent with the interactive view -- in spite of those encodingist presuppositions. Our discussion of language models, therefore, focused most on the major developments in this field, which do tend to converge with interactivism, though we also pointed out the problematic assumptions involved.

Connectionism, on the other hand, does not just presuppose an encodingism; it makes fundamental claims to have solved representational problems that standard computer models cannot solve. The claim to fame of connectionism rests on, among other things, its claims regarding the ability to learn new representations. The fact that those representations are still encodings, and are therefore subject to the encodingism critiques, is far more central to the way in which connectionism presents itself and is known than is the case for typical language models. It seems appropriate, then, to give this encodingism critique more emphasis when assessing connectionism.

On the other hand, connectionism offers deeper similarities to the interactive approach -- but these similarities are in terms of the dynamic possibilities of recursive connectionist nets (see below), not in terms of the representational claims made by connectionism. Such dynamic possibilities are being explored, but no one in the connectionist field is at this point making a "connection" between such dynamic possibilities and the nature of representation. The closest to be found are some approaches proposing net models that learn dynamic interactions, but there is no representational claim at all in these, and some other approaches that want to make use of the net dynamics to capture dynamic processing of representations, but the alleged representations supposedly being processed in these are standard encodingist "representations."

We find the connectionist movement, then, to be a convergent ally in many respects -- particularly with respect to its explorations of parallelism, distributivity, and phase space dynamics -- though seriously misguided in its conceptions of representation. We explore below some of the most intriguing convergences in connectionist research. These convergences emphasize dynamics, and modify or eschew standard connectionist representational assumptions.

-------------------------------------------------------

THE CENTRAL NERVOUS SYSTEM

Oscillations and Modulations

An architecture supporting oscillations and modulations makes very strong connection with a number of aspects of brain functioning that both connectionist models and standard models ignore (Bickhard, 1991e). Consider first the oscillatory nature of neural functioning. Endogenous oscillations -- non-zero baselines -- are intrinsic to many neural processes at both the individual cell and network levels. That is, there is intrinsic ongoing oscillatory activity, not just in response to any other "outside" activity (Gallistel, 1980; Kalat, 1984; Kandel & Schwartz, 1985; Thatcher & John, 1977). This is incompatible with the strict reactivity of the switches or commands in classical symbol manipulation AI and with the strict reactivity of the nodes in connectionism. There is no architecturally local endogenous activity in these perspectives, only reactions to inputs. Furthermore, the potential temporal complexity of oscillatory processes stands in stark contrast to the simplicity and reactive temporal fixedness of switches or "levels of activation." Such endogenous oscillatory activity might seem to make sense for motor control (Gallistel, 1980), but should be irrelevant to cognition according to contemporary approaches. Yet oscillatory activity is ubiquitous in perception and in action (MacKay, 1987), and endogenous oscillatory activity is ubiquitous throughout the central nervous system.

Similarly, modulations among such oscillatory processes are ubiquitous, and there are multifarious such modulatory relationships. Oscillations of neurons or networks modulate those of other cells or networks (Dowling, 1992; Thatcher & John, 1977). Various messenger chemicals -- peptides, among others -- modulate the effects of other transmitters. In fact, such modulation is ubiquitous (Bloom & Lazerson, 1988; Cooper, Bloom, & Roth, 1986; Dowling, 1992; Hall, 1992; Siegelbaum & Tsien, 1985; Fuxe & Agnati, 1987). Furthermore, a significant population of neurons rarely or never "fire" (Bullock, 1981; Roberts & Bush, 1981). Instead, they propagate slow-wave graded ionic potentials. Such variations in potential will affect ionic concentrations in nearby extra-cellular spaces, or modulate the graded release of neurotransmitters, and, thus, affect the modulations of (modulatory) activity that are occurring via synaptic junctions -- a modulation of modulations (Bullock, 1981; Fuxe & Agnati, 1991b).
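The contrast between strictly reactive nodes and endogenously active, modulated oscillators can be sketched in a few lines. This is a toy phase oscillator under stated assumptions, not a neural model: the frequency values, the gain, the time step, and all function names are hypothetical.

```python
import math

# Illustrative sketch only: a reactive node versus an endogenous oscillator.

def reactive_unit(inputs, weights):
    """Standard node: activity is purely a function of the current input."""
    return max(0.0, sum(w * x for w, x in zip(weights, inputs)))

def oscillator_trace(modulation, base_frequency=1.0, gain=0.5, dt=0.01):
    """Phase oscillator with ongoing activity of its own.

    The modulating signal does not switch the oscillator on; it only
    shifts the frequency of activity that is already under way.
    """
    phase = 0.0
    trace = []
    for m in modulation:
        phase += 2 * math.pi * (base_frequency + gain * m) * dt
        trace.append(math.sin(phase))
    return trace

def sign_changes(trace):
    """Count zero crossings as a rough measure of oscillation rate."""
    return sum(1 for a, b in zip(trace, trace[1:]) if (a >= 0) != (b >= 0))

STEPS = 175
unmodulated = oscillator_trace([0.0] * STEPS)  # no outside influence at all
modulated = oscillator_trace([1.0] * STEPS)    # steady modulatory influence

resting_node = reactive_unit([0.0, 0.0], [1.0, 1.0])
```

With no input the reactive node is silent, while the oscillator keeps cycling; a modulating influence changes how it cycles and does not start it cycling. The temporal structure lives in the unit itself, which is the sense of "endogenous" used above.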

In fact, there is a broad class of non-synaptic modulatory influences (Adey, 1966; Fuster, 1989; Fuxe & Agnati, 1991a; Nedergaard, 1994; Vizi, 1984). A central example is that of volume transmitters, which affect the activity of local volumes (local groups) of neurons, not just the single neuron on the other side of a synapse (Agnati, Fuxe, Pich, Zoli, Zini, Benfenati, Härfstrand, & Goldstein, 1987; Fuxe & Agnati, 1991a; Hansson, 1991; Herkenham, 1991; Matteoli, Reetz, & De Camilli, 1991; Vizi, 1984). For example, the administration of L-dopa, a precursor of the transmitter dopamine, for Parkinson's disease makes little sense within the strict synaptic cleft model. There is a loss of neurons that normally produce dopamine, and L-dopa increases the dopamine production of those neurons that remain. If this increased production of dopamine affected only the neurons to which the remaining reduced number of dopamine-producing neurons were synapsed, then it would simply hyperactivate -- hypermodulate -- the reduced neural network. Instead, the dopamine seems to act as a volume transmitter and increases and modulates the activity of whole local populations of neurons, and, therefore, their networks (Vizi, 1984; Changeux, 1991; Fuxe & Agnati, 1991b; Herkenham, 1991). Similarly, dopamine-producing neural grafts can have positive effects even without specific innervations, via general regulatory functions of elevated dopamine rather than patterned input (Changeux, 1991; Dunnett, Björklund, & Stenevi, 1985; Herkenham, 1991).

Chemical Processing and Communication

One conceptual framework for understanding such influences, a framework that is broader than the classical threshold switching model of the neuron, derives from recognizing that synaptic neurotransmitters are strongly related to hormones (Acher, 1985; Emson, 1985; Scharrer, 1987; Vizi, 1984, 1991). Hormones may be viewed as general information transmitting molecules -- modulatory molecules -- that exhibit an evolution from intracellular messengers to neurohumors to neurohormones to endocrine hormones (Hille, 1987; Turner & Bagnara, 1976). A number of molecules, in fact, serve all such levels of function in varying parts of the body. In an important sense, paradigm neurotransmitters are "just" extremely local hormones -- local to the synaptic cleft -- that affect ion balance processes (among others). But not all of them are so localized. There appears to be a continuum ranging from the extreme synaptic localization of some transmitters to the whole body circulation and efficacy of some hormones, with local hormones and "volume" neurotransmitters in between.

Consider first the "slow" end of this continuum. Here we note that the larger volume effects are clearly also longer time scale modulations. In the brain, such slower and larger volume modulations can constitute modulations of the already ongoing modulations among neural oscillations and within neural networks, even to the point of reconfiguring the effective neural circuitry (Dowling, 1992; Hall, 1992; Iverson & Goodman, 1986; Koch & Poggio, 1987). Consider now the "fast" end of the spectrum. At this extreme of space and time considerations we find gap junctions which transmit electrical changes from cell to cell virtually instantaneously, with no mediating transmitters (Dowling, 1992; Hall, 1992; Nauta & Feirtag, 1986). Slow wave potential oscillations can also function via direct ionic influences, without mediating neurotransmitters, but with larger volume and longer time scale effects. In addition, these may control the graded release of transmitters rather than all-or-none release patterns (Bullock, 1981; Fuxe & Agnati, 1991a).

Modulatory "Computations"

It should be recognized, in fact, that insofar as the classical nerve impulse train serves primarily the function of transmitting the results of more local graded interactions (graded and volume transmitter and ionic interactions) over long distances (Shepard, 1981) -- that is, insofar as neural impulse trains, oscillations, are the carriers of the results of the local graded and volume processes -- it is these non-classical processes that prove crucial. We might also expect impulse oscillations to participate in such volume processes (Bullock, 1981). Neither the "switches" nor the "gates" of standard paradigms can make any sense of such processes. This perspective of multiple forms of modulation, multiple spatial characteristics and temporal characteristics of modulations, differential affinities, varying layers of modulation and meta-modulation, and so on, quickly becomes extremely complex, but it remains extremely natural in principle from the oscillatory-modulations perspective of the interactive architecture.

Oscillatory and modulatory phenomena form a major framework for the architectural organization of the central nervous system. In particular, the multiple sorts of modulatory relationships, with their wide variations in volume and temporal modulation effects, permeate the entire brain. Some systems propagate influences primarily via axonal impulses, e.g., the visual tract (Koch & Poggio, 1987), while others influence via slow wave potential movements. Local and volume transmitters seem to coexist throughout many, if not most, areas of the brain (Fuxe & Agnati, 1991a; Vizi, 1984). Some synapses have been found to release multiple chemicals that serve multiple levels of modulatory function from the same synapse. The key differentiations, again, are those of 1) time and volume and 2) meta-modulations on underlying modulations. Superimposed on them are the differentiations of modulatory influences even within the same local volume via differentiations in synaptic and volume transmitter molecules. Differential sensitivities to differing transmitters yield differential sensitivities to differing modulating influences even for neighboring, even for intertwined, neural networks (Koch & Poggio, 1987).

The Irrelevance of Standard Architectures

None of this makes any sense -- it is all utterly superfluous -- from standard contemporary perspectives. Neither classical symbol manipulation nor connectionist approaches provide any guidance in understanding such phenomena. This irrelevance of standard approaches has led to the recognition that a major conceptual shift is needed in order to be able to even begin to understand the functioning of the brain (Bullock, 1981; Freeman & Skarda, 1990; Pellionisz, 1991). A shift is required not merely from sequential to parallel processing, but from local processing to volume, or geometric, processing. These geometric architectural variations in kinds of modulatory influences available within the overall system -- with their range in both spatial and temporal characteristics and in transmitting chemicals -- are exactly what should be expected from within the interactive oscillatory-modulatory architectural perspective.

Standard approaches, both connectionist and symbol manipulation, simply have nothing to say about such phenomena. They can at best be interpreted away, in the familiar manner, as implementational issues -- beneath the important levels of functional analysis, which are supposedly captured by symbol manipulations or connectionist nets. We have argued that such architectures are not, and cannot be, mere issues of implementation. Such oscillatory and modulation phenomena are -- and logically must be -- the basic form of architecture for any viable, intelligent, or intentional system.

A Summary of the Argument

In summary of the architecture arguments to this point, we have:


* Contemporary conceptions of and approaches to representation are not only wrong, they are at root logically incoherent. They universally assume that representation is some form of correspondence (isomorphism) between a representing element or structure and the represented, but they do not and cannot account for how any such correspondence is supposed to provide representational content, "aboutness," for the animal or agent itself. They are universally analyses from the perspective of some observer of the animal or agent, and, therefore, are intrinsically incapable of accounting for the cognitive processes and capacities of such an observer per se.


* Representation is emergent from action, in roughly Piagetian or Peircean senses, rather than out of the processing of inputs, as in the dominant forms of contemporary Cognitive Science. Representation is constituted as organizations of indications of potentialities for interaction. Epistemic agents must be active and interactive; it is not possible for passive systems to be epistemic systems.


* Representation is constituted in several forms of implicit definition. The easy unboundedness of implicit definition relative to standard explicit and discrete conceptions of representational elements and structures, as in the frame problems, is another fundamental inadequacy of standard approaches and corresponding superiority of the interactive model.


* Action and interaction require timing. This is not just "speed," since timing can be either too late or too early, but rather the coordinative timing of actions and interactions. This is in contrast, again, to standard Cognitive Science in which certain forms of correspondences are supposed to constitute representation, and such correspondences are logically atemporal in their representational nature, no matter that they are created, destroyed, and processed in time.


* Timing cannot be modeled within Turing machine theory, nor, therefore, within any model or architecture that is equivalent to Turing machine theory. Turing machine theory captures temporal sequences of operations in formal processes, but nothing about the steps in those sequences constrains their timing. The first step could take a year, the second a microsecond, the third a century, and so on without affecting the mathematical or formal properties of Turing machines. We are interested in our physical instantiations of Turing machines, computers, being as fast as possible, of course, but this is a practical concern of elapsed time, not a concern of timing.


* Timing requires clocks, and clocks are essentially oscillators.


* A single clock driving everything is not a viable architecture from an evolutionary perspective, since any changes in overall architecture would have to involve simultaneous changes in the clock connections in order to be viable. Such simultaneity, repeated in myriad instances throughout the evolution of the nervous system, is, for all practical purposes, of measure zero.


* An alternative architecture is to use clocks -- oscillators -- as the basic functional unit, and to render functional relationships among them in terms of modulations among oscillatory processes.


* Any change in basic functional architecture in this framework is intrinsically also a change in the timing architecture -- the two are identical.


* This is at least as powerful as standard Turing machine architectures: a limiting case of the modulation relationship is to modulate On or Off -- that is, a switch is a trivial limiting case of modulation, and a switch is sufficient for the construction of a Turing machine.


* It is in fact more powerful than Turing machines in that timing is now intrinsic to the functional nature of the architecture and its processes.


* Modulation is more general than discrete functional relationships, such as switches, and can at best be asymptotically approximated by unbounded numbers of such discrete relationships.


* Oscillations and modulations as the basic forms of functioning are much closer to the actual processes in the central nervous system, and account for many properties that must remain at best matters of mere implementation from either standard symbol manipulation or connectionist perspectives.


* Examples of oscillatory and modulatory properties include:

1) The basic endogenous oscillatory properties of both single neurons and of neural circuits;

2) The large populations of neurons that never "fire," but that do modulate the activity of neighboring neural circuits;

3) The wide temporal and spatial range of variations in forms of modulations in the nervous system, such as gap junctions at the fast end and, at the slow end, volume transmitters, which diffuse through intercellular fluid rather than across a synapse;

4) The deep functional, embryological, and evolutionary connections between the nervous system and the hormonal system.


* The nervous system is, in fact, exquisitely designed for multifarious forms of modulatory and meta-modulatory relationships among its oscillatory processes, and this is precisely what is shown to be necessary by the "representation emerges from interaction, which requires timing" argument.
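The claim in the summary that a switch is a trivial limiting case of modulation can be illustrated with a minimal sketch. The code is an illustration only, under stated assumptions: the function names (`oscillator`, `graded_gain`, `switch_gain`, `modulate`), the choice of simple amplitude modulation, and the frequencies are all hypothetical and are not part of the interactive model itself. One oscillator serves as a carrier, a second and slower oscillator serves as a graded modulator, and the switch appears when the modulator is saturated to exactly On or Off.

```python
import math


def oscillator(freq_hz, t):
    """Carrier process: a unit-amplitude sine wave sampled at time t (seconds)."""
    return math.sin(2.0 * math.pi * freq_hz * t)


def graded_gain(t, mod_freq_hz=1.0):
    """A slower oscillator used as a graded modulator, mapped into [0, 1].

    Assumption: amplitude modulation stands in for modulation in general.
    """
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * mod_freq_hz * t))


def switch_gain(t, mod_freq_hz=1.0):
    """Limiting case: the same modulator saturated to exactly 0.0 (Off) or 1.0 (On)."""
    return 1.0 if graded_gain(t, mod_freq_hz) >= 0.5 else 0.0


def modulate(carrier_value, gain):
    """Scale the carrier by the modulating gain."""
    return gain * carrier_value


if __name__ == "__main__":
    # Compare graded modulation with its on/off limiting case over one second.
    for k in range(0, 100, 10):
        t = k / 100.0
        c = oscillator(10.0, t + 0.013)  # small offset avoids sampling only at zeros
        print(t, modulate(c, graded_gain(t)), modulate(c, switch_gain(t)))
```

With the gain restricted to 0 or 1, the modulated output is either the carrier unchanged or nothing at all, which is the behavior of a switch. The graded case covers everything in between, and discrete switching reaches that range only by approximation.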

Both information processing and connectionist perspectives are used to model the architecture and functioning of the nervous system, and both make claims about its nature, yet neither can account for ubiquitous basic functional properties of the nervous system. The interactive model does account for those properties -- predicts them as being necessary, in fact -- and, thus, is a much richer and more powerful framework within which to explore and model the nervous system than any other framework currently available.

Interactivism, then, offers a candidate for the "major conceptual shift" that is needed for understanding the functioning of the brain (e.g., Bullock, 1981; Freeman & Skarda, 1990; Pellionisz, 1991). Noting the ubiquitous and necessary oscillatory and modulatory functional relationships in the brain, however, does not constitute a model of brain functioning -- we have not mentioned anything about brain anatomy, for example -- but it does pose a set of architectural constraints on any adequate model of brain functioning, and a set of constraints that is not assimilable within symbol manipulation or connectionist approaches. The interactive model is the only model available that makes sense of what we know of the oscillatory and modulatory manner in which the nervous system actually works and, thus, provides guidance for further explorations.

[1] When did those encodings get inserted into the genes? And how could they have been inserted? Where did they come from? If they were somehow emergently constructed by evolution, then why is it impossible for learning and development in human beings to emergently -- non-combinatorically, non-onion-like -- construct them? If they cannot have been emergently constructed, then evolution is just as helpless as learning and development, and representations become impossible: they did not exist at the Big Bang, so they could not have emerged since then (Bickhard, 1991c, 1993a).

[2] For related issues, see the discussion of the Frame Problems below.

[3] Lenat, Guha, Pittman, Pratt, & Shephard (1990), in contrast, claim only a chance at surpassing a "brittleness threshold" -- e.g., an inability to handle novel conditions -- in knowledge bases. The problem of brittleness is, in a familiar way, claimed to be a scale problem -- thus its construal in terms of a threshold -- with the CYC project as the attempt to surpass the scale threshold. Again there is little argument for such characterizations. Nevertheless, with no claims to be creating genuine cognition, this is a relatively cautious presentation of the project.

[4] There are technical niceties involved here concerning the exact boundary requirements for something to be a representation. The basic point, however, is that the domain of 1) functional indications of possible interactions, 2) functional indications of subsequent internal outcomes, 3) selections of interactions on the basis of such indications, 4) influence or control over subsequent process on the basis of the success or failure of such indication of internal outcome, and so on, is the domain of the emergence of interactive representation, and that such functional organizations are present in, and are easily compatible with, Brooks' robots (e.g., Mataric, 1991). In rejecting and disclaiming representation, Brooks is quite correct in standard senses, but is overlooking such interactive possibilities.

[5] Notice that Brooks' creatures constitute a concrete counterexample to the "all or none" presupposition about mental phenomena. They illustrate real representation in a form that is decidedly not at the level of human intelligence. This separation may be another reason why Brooks looks so radical to standard AI.

A recognition of primitive representation, of course, is inherent in any evolutionary perspective that recognizes any kinds of mental properties in simpler organisms. Brooks shares this with a number of other researchers who have a biological orientation, such as Krall (1992) or many people within the dynamic systems approaches.

[6] This recognition is already "implicit" in Brooks' discussion: He is writing from within the perspective of analysis by a designer, not from the system's perspective.

[7] This is the continuous phase space equivalent of state-splitting in finite state machines (Hartmanis & Stearns, 1966; Booth, 1968; Ginzburg, 1968; Brainerd & Landweber, 1974; Bickhard, 1980b; Bickhard & Richie, 1983). In either case, the functional consequences of any purported "representations" are absorbed into the dynamics of the system, leaving little left to "be" the alleged representations.

[8] A fiber bundle in differential geometry -- see Chapter 15.

[9] There are also interesting questions about the supposed naturalness with which classical symbol systems can capture the systematicity that human beings in fact manifest (van Gelder & Niklasson, 1994).