HERBERT SIMON, PAUL THAGARD, PAT LANGLEY AND OTHERS ON DISCOVERY SYSTEMS
BOOK VIII - Page 3
Human Problem Solving, Cognitive Psychology and Heuristics
Simon’s theory of human problem solving is his theory of procedural rationality, and it is elaborately set forth in his Human Problem Solving (1972) co-authored with Allen Newell. This nine-hundred-page magnum opus took fourteen years to write. During this period Simon also wrote a briefer statement, The Sciences of the Artificial (1969), and several articles since reprinted in his Models of Discovery (1977), an anthology of many of his previously published papers. Much of Human Problem Solving consists of detailed descriptions of problem-solving computer programs, none of which pertain to scientific discovery. Nonetheless his views on human problem solving are relevant to the methodology of science, because he considers scientific discovery to be a special case of human problem solving.
At the outset of Human Problem Solving the two collaborating authors state that their aim is to advance understanding of how humans think by setting forth a theory of human problem solving. The concluding section of the book sets forth a general statement of their theory, which is based on the computer programs described in the body of the book and presented as empirical evidence relevant to their theory. They state that the specific opportunity that has set the course for their book is the development of a science of information processing. Their central thesis is that explanation of thinking can be accomplished by means of an information theory, and that their theory views a human as a processor of information, an information processing system. They say that such a description of the human is not just metaphorical, because an abstract concept has been developed of an information processor, a concept that abstracts from the distinctively mechanical aspects of the computer. The authors compare the explanations in information science to the use of differential equations in other sciences such as classical physics. An information theory consisting of computer programs is dynamic like differential equations, because it describes change in a system through time. Such a theory describes the time course of behavior, characterizing each new act as a function of the immediately preceding state of the system and its environment. Given at any time the memory contents characterizing the system’s state at that moment, the program determines how the memory contents will change during the next computing cycle and what the contents will be at the end of the cycle.
The fundamental methodological problems of theory construction and theory testing are the same in both the mathematical and computational types of theory. The theory is tested by providing a specific set of initial and boundary conditions for the system, by using the equations or computer program to predict the resulting time path, and then by comparing this predicted path with the actual path of the system. The advantage of an information-processing language over the mathematical languages for a theory of thinking is that an information processing language takes symbolic structures rather than numbers for the values of the variables.
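The idea of a program as a dynamic theory can be illustrated with a minimal sketch. It is only an illustration under stated assumptions and is not taken from Newell and Simon: the names run, remember_last_two and agreement, and the toy transition rule, are hypothetical. Each computing cycle maps the current memory contents and the next input to new memory contents, which yields a predicted time path that can then be compared with an observed one.

```python
from typing import Callable, List, Tuple

State = Tuple[str, ...]  # memory contents as a tuple of symbols (illustrative)


def run(step: Callable[[State, str], State], state: State,
        inputs: List[str]) -> List[State]:
    """Return the time path: the initial state and the state after each cycle."""
    path = [state]
    for symbol in inputs:
        # The next state depends only on the preceding state and the input.
        state = step(state, symbol)
        path.append(state)
    return path


def remember_last_two(state: State, symbol: str) -> State:
    """Toy transition rule: memory keeps the two most recent input symbols."""
    return (state + (symbol,))[-2:]


def agreement(predicted: List[State], observed: List[State]) -> float:
    """Fraction of time steps at which predicted and observed states coincide."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / max(len(predicted), len(observed))


# Initial condition (empty memory) plus a sequence of inputs gives a prediction.
path = run(remember_last_two, (), ["A", "B", "C"])
print(path)  # [(), ('A',), ('A', 'B'), ('B', 'C')]

# A hypothetical observed path that departs from the prediction at the last step.
observed = [(), ("A",), ("A", "B"), ("A", "C")]
print(agreement(path, observed))  # 0.75
```

The states here are symbolic structures rather than numbers, which is the advantage the authors claim for an information-processing language.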
The information theory about human thinking and problem solving is a theory in cognitive psychology. Newell and Simon note that their cognitive theory is concerned with performance, specifically with the performance of intelligent adults in our own culture, while psychologists have traditionally been more concerned with learning. In his autobiography as well as elsewhere Simon distinguishes cognitive psychology from both the gestalt and the behavioristic approaches to psychology. He rejects the black-box approach of the behaviorists and especially that of B.F. Skinner, who maintains that the black box is empty. Simon also rejects the reductionist version of behaviorism, according to which complex behavior must be explained in terms of neurological processes. And he furthermore rejects the neurological modeling approach of the psychologists who use parallel connectionist networks or neural nets for computerized explanations. Newell and Simon propose a theory of symbols located midway, as it were, between complex behavioral processes and neurological processes. Simon acknowledges a debt to the Gestaltists and their allies, who also recognize a layer of constructs between behavior and neurology, but Simon rejects the Gestaltists’ holistic approach to these constructs. Simon proposes an explicitly mechanistic type of explanation of human thinking and problem solving in terms of information processing.
Simon defines human problem-solving thinking as a system of elementary information processes, organized hierarchically and executed serially, and consisting of procedures that exhibit large amounts of highly selective trial-and-error search based on rules of thumb or “heuristics”. Simon relies on the concept of hierarchy as a strategy for managing complexity. He defines a hierarchical system as one that is composed of interrelated subsystems, each of which in turn is hierarchical in structure down to a lowest level consisting of an elementary subsystem. In human problem solving hierarchy is determined by the organization of subgoals, which is the second idea that Simon said in his autobiography is basic to his entire scientific output. Hierarchical organization is common in computer systems. Applications programs are written in compiler and interpreter languages such as FORTRAN and LISP, and these languages in turn contain reserved words that are names for macros, which are subsystems in the compiler library, which in turn contain lower level subsystems, and so on down to a basic level consisting of elementary systems in binary code.
For the specifically problem-solving type of human thinking Simon has analyzed information processing into a few basic concepts. Firstly there is the “task environment”, by which he means the problem-solving processor’s outer environment. Secondly the task environment as viewed by the problem solver produces a “problem space”, together with the goal that orients the problem solver to his task environment. The problem space is the inner environment consisting of the processor’s internal representation of the outer task environment, and in which the problem solving activities take place. Simon maintains that there is no objective representation of the task environment independently of some processor’s problem space. Furthermore there is a task or goal that defines the “point of view” about the problem-solving processor’s outer environment, and that therefore defines the problem space. Simon calls this defining process an “input translation process.” Thirdly in addition to task environment and problem space, Simon introduces the concept of “method”. A “method” is a process that bears some “rational” relation to attaining a problem solution, as formulated and seen in terms of the internal representation, which is the problem space. Here the term “rational” is understood as satisficing in the sense that a satisfactory as opposed to an optimal solution is achieved. In the mechanical processor, the method is the computer program, and most of Simon’s theory of problem solving pertains to the method.
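The satisficing sense of “rational” can be sketched in a few lines. This is an illustrative assumption and not Simon’s own formulation: the function satisfice, the aspiration level and the list of offers are all hypothetical. The procedure stops at the first candidate that is good enough, instead of examining every candidate to find the optimum.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def satisfice(candidates: Iterable[T], score: Callable[[T], float],
              aspiration: float) -> Optional[T]:
    """Return the first candidate whose score meets the aspiration level."""
    for candidate in candidates:
        if score(candidate) >= aspiration:
            return candidate  # satisfactory: stop searching here
    return None  # no candidate met the aspiration level


offers = [40, 55, 72, 90, 65]  # hypothetical values examined in this order

first_good_enough = satisfice(offers, float, aspiration=70)
best = max(offers)  # the optimizing alternative must inspect every offer

print(first_good_enough)  # 72
print(best)               # 90
```

The satisficer examines three offers and accepts 72; the optimizer examines all five to find 90.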
Simon distinguishes three types of method. The first type is the recognition method, which can be used when the solution is already in the processor’s memory, and artificial-intelligence systems using this method rely on large stores of specific information. Computer programs using this type of method contain a conditional form of statement, which Simon calls a “production”. In a production whenever the initial conditions are satisfied, the consequent action is taken. And when the conditions of several alternative productions are satisfied, the conflicts between them are resolved by priority rules. In his autobiography Simon notes that productions have become widely accepted to explain how human experts make their decisions by recognizing familiar cues directly, and that productions have been used for the “expert systems” in artificial intelligence. Experts, both human and mechanical, do much of their problem solving not by searching selectively, but simply by recognizing the relevant cues in situations similar to those experienced before. It is their wealth of experience that makes them experts.
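A production can be sketched as a condition-action pair, with conflicts between matching productions resolved by priority. The following is a minimal sketch under stated assumptions: the class Production, the function fire_once, the chess-like cues and the numeric priorities are hypothetical illustrations and are not drawn from any of Simon’s programs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Production:
    name: str
    priority: int                           # higher number wins a conflict
    condition: Callable[[Set[str]], bool]   # tested against memory contents
    action: Callable[[Set[str]], None]      # modifies memory when fired


def fire_once(rules: List[Production], memory: Set[str]) -> Optional[str]:
    """Fire the highest-priority production whose condition is satisfied."""
    matched = [rule for rule in rules if rule.condition(memory)]
    if not matched:
        return None  # no familiar cue was recognized
    winner = max(matched, key=lambda rule: rule.priority)
    winner.action(memory)
    return winner.name


rules = [
    Production("capture-queen", 5,
               lambda m: "queen-attackable" in m,
               lambda m: m.add("move:capture-queen")),
    Production("escape-check", 10,
               lambda m: "in-check" in m,
               lambda m: m.add("move:escape-check")),
]

# Both conditions are satisfied, so the priority rule decides.
memory = {"in-check", "queen-attackable"}
fired = fire_once(rules, memory)
print(fired)  # escape-check
```

No search occurs here: the system simply recognizes cues already represented in memory, which is the sense in which recognition depends on a large store of specific knowledge.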
The second type of method is what Simon calls the generate-and-test method. In this method the computer system generates a problem space, and has as its goal to find or to produce a member in a subspace identified as a solution by a test. The generality and weakness of this method lie in the fact that the generation and test procedures are independent, so that the amount of search is very large. Simon typically portrays this method as requiring a search that is so large that it cannot be carried out completely, and so must proceed in a random manner. To address this problem of innumerable possibilities the pragmatist philosopher C.S. Peirce had advanced his logic of abduction, which postulates a natural light or instinctive genius for producing correct theories.
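The independence of generator and test can be seen in a minimal sketch. The example is hypothetical: the function generate_and_test and the three-digit combination are assumptions chosen for illustration, and the enumeration is systematic rather than random. The generator knows nothing about the test, and the test says only yes or no.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple

Candidate = Tuple[int, ...]


def generate_and_test(generator: Iterable[Candidate],
                      test: Callable[[Candidate], bool]
                      ) -> Tuple[Optional[Candidate], int]:
    """Try candidates in generator order; return the first that passes."""
    tried = 0
    for candidate in generator:
        tried += 1
        if test(candidate):
            return candidate, tried
    return None, tried


secret = (7, 3, 9)  # hypothetical solution, known only to the test

# The generator enumerates all 10**3 combinations without any guidance.
solution, tried = generate_and_test(product(range(10), repeat=3),
                                    lambda c: c == secret)
print(solution, tried)  # (7, 3, 9) 740
```

Each added digit multiplies the space by ten, so the number of candidates grows exponentially with problem size while the method itself offers nothing to prune them.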
The third type of method is Simon’s theory of heuristics, which exploits the information in the task environment as that task environment is represented internally in the processor by the problem space. In the heuristic search, unlike the generate-and-test method, there is a dependence of the search process upon the nature of the object being sought in the problem space and the progress made toward it. This dependence functions as a feedback that guides the search process with controlling information acquired in the process of the search itself, as the search explores the internalized task environment. This method is much more efficient than the generate-and-test method, and Simon believes that it explains how complex problems are solved with both human and mechanical bounded rationality.
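The feedback that distinguishes heuristic search can be sketched as follows. This is an illustration under a strong assumption, namely that the task environment reports partial progress (here, how many digits of a three-digit combination are correct); the function heuristic_search and the scoring rule are hypothetical. The search keeps any change that increases the score, so information gained during the search steers what is tried next.

```python
from typing import Callable, List, Tuple


def heuristic_search(length: int,
                     score: Callable[[List[int]], int]) -> Tuple[Tuple[int, ...], int]:
    """Hill-climb one position at a time, keeping changes that raise the score."""
    current = [0] * length
    best = score(current)
    evaluations = 1
    for position in range(length):
        for digit in range(1, 10):
            trial = current.copy()
            trial[position] = digit
            evaluations += 1
            trial_score = score(trial)
            if trial_score > best:
                # Progress toward the goal: keep it and move to the next position.
                current, best = trial, trial_score
                break
    return tuple(current), evaluations


secret = (7, 3, 9)  # hypothetical solution


def correct_digits(candidate: List[int]) -> int:
    """Assumed feedback: how many positions already match the solution."""
    return sum(a == b for a, b in zip(candidate, secret))


solution, evaluations = heuristic_search(len(secret), correct_digits)
print(solution, evaluations)  # (7, 3, 9) 20
```

With this feedback at most 1 + 9 × 3 = 28 evaluations are needed, whereas enumerating the same space without feedback can require up to 1,000. The gain depends entirely on the assumed availability of progress information in the problem space.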
These alternative methods represent different artificial-intelligence research programmes, which may be characterized as software development versus hardware development, or as knowledge versus speed. The generate-and-test method is dependent on fast hardware; the heuristic-search method is dependent on efficient software design. Developments in hardware technology, as well as the magnitude of the problems that researchers select, affect researcher preferences for one or another of the methods. The hardware preference has been called the “brute force” approach, and as the technology has advanced, it has enabled the implementation of artificial-intelligence systems that offer little new software but greatly improved performance for the extensive searching of very large problem spaces. It has often been implemented in supercomputers.
For example the Wall Street Journal (30 April 1990) reported that a group of five Carnegie-Mellon University graduate students with IBM Corporation funding has developed a multiprocessor chess-playing system named “Deep Thought”, which exhibits grand-master performance with superhuman speed. It was reported that this system does not represent any noteworthy software development either in chess-playing search heuristics or in expert chess-playing strategies. Instead it explores the huge chess-playing problem space more quickly and extensively than can the human grand master, who is limited by human bounds to his rationality. Developments such as quantum-computing technology promise to enable the combinatorial generate-and-test method with effectively minimal hardware constraint.
On Scientific Discovery and Philosophy of Science
Before Simon and his colleagues at Carnegie-Mellon University had developed functioning computerized discovery systems simulating historic scientific discoveries, Simon had written articles claiming that scientific discovery is a special case of human problem solving. In these articles he related his human problem-solving approach for discovery to views published by various philosophers of science. The articles are reprinted in his Models of Discovery, where he insightfully comments in his “Introduction” that dense mists of romanticism and downright know-nothingness have surrounded the subject of scientific discovery and of creativity. And in his “Scientific Discovery and the Psychology of Problem Solving” (1966) Simon states his thesis that scientific discovery is a form of problem solving, i.e., that the processes whereby science is carried on can be explained in terms that he used to explain the processes of problem solving. Problem-solving thinking uses a collection of elementary information processes organized hierarchically and executed serially, and consists of processes that exhibit large amounts of highly selective trial-and-error search based on rules of thumb or heuristics. The theory of scientific discovery is a system that has these characteristics and that behaves like a scientist.
Superior problem-solving scientists have more powerful heuristics, and therefore produce either adequate solutions with less search or better solutions with equivalent search, as compared with less competent scientists. Science is satisficing, and to explain scientific discovery is to describe a set of processes that is sufficient to account for the degrees and directions of scientific progress that have actually occurred. Furthermore, for every great success in scientific discovery there are many failures. Curiously Simon also says that a theory explaining scientific discovery must predict innumerable failures for every success.
In this same 1966 article Simon also takes occasion to criticize the philosophy-of-science literature. He maintains that the philosophy literature tends to address the normative rather than the descriptive aspects of scientific methodology, and that philosophers are more concerned with how scientists ought to proceed to conform to certain conceptions of logic than with how scientists do in fact proceed. And he adds that their notions of how scientists ought to proceed have focused primarily on the problem of induction. He concludes that the philosophy-of-science literature has little relevance to the actual behavior of scientists, and has less normative value than has been supposed.
But he finds two exceptions in the philosophy of science literature: Norwood Russell Hanson and Thomas S. Kuhn. He says that both of these authors have made significant contributions to the psychology and sociology of scientific discovery, and that they have been quite explicit in distinguishing the process of discovery from philosophers’ traditional canons of “sound” scientific method. He also says that he has made much use of the views of both of these philosophers. Simon’s principal commentary on the philosophy of Hanson is his defense of Hanson against the view of Popper in “Does Scientific Discovery Have a Logic?” (1973). He notes that Popper rejects the existence of a logic of scientific discovery in Popper’s ironically titled Logic of Scientific Discovery (1934), and he says that Popper’s view is opposed by Hanson in the latter’s Patterns of Discovery (1958) and is also opposed by Peirce. Peirce used the term “abduction”, which Simon says is the main subject of the theory of problem solving in both its normative and positive forms. Simon observes that Hanson made his case by historical examples of scientific discovery, and that he placed great emphasis on discovery of perceptual patterns.
In this 1973 article as well as in his earlier “The Logic of Rational Decision” (1965) Simon distinguishes heuristic search from induction, and defends the idea of a logic of scientific discovery in the sense that norms can be derived from the goals of scientific activity. Simon defines the logic of scientific discovery as a set of normative standards for judging the processes used to discover or test scientific theories, where the goal from which the norms are derived is that of discovering valid scientific laws. Simon emphasizes that the heuristic strategy does not guarantee success. He states that discovering a pattern does not involve induction or extrapolation. Induction arises only if one wishes to predict and to test whether or not the same pattern will continue to obtain when it is extrapolated. Law discovery only means finding patterns in the data that have been observed; whether or not the pattern will continue to hold for new data that are observed subsequently will be decided in the course of predictive testing of the law, and not in discovering it. He therefore argues that he has not smuggled any philosophical induction axiom into his formulation of a logic of discovery, and that such a logic does not depend on the solution of the problem of induction. It may be noted that after Simon’s colleagues had created functioning discovery systems based on heuristic search, Simon often described some of those systems as using “inductive search”. However, in his Scientific Discovery he explicitly rejects the search for certainty associated with attempts to justify inductivism. He subscribes to Popper’s falsificationist thesis of criticism.
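The separation of pattern finding from predictive testing can be sketched with a small example. It is a hedged illustration and not a reproduction of any discovery system by Simon or his colleagues: the function find_invariant, the tolerance and the restriction to small integer exponents are assumptions, and the planetary distances (in astronomical units) and periods (in years) are rounded textbook values. The search looks only for a combination that is nearly constant over the observed data; whether it holds for a new observation is checked afterward as a separate step.

```python
from itertools import product
from typing import List, Optional, Tuple

# (distance D in AU, period P in years): rounded illustrative values
observed = [(0.387, 0.241), (0.723, 0.615), (1.000, 1.000),
            (1.524, 1.881), (5.203, 11.862)]


def spread(values: List[float]) -> float:
    """Relative spread of the values: (max - min) / mean."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


def find_invariant(data: List[Tuple[float, float]],
                   tolerance: float = 0.01) -> Optional[Tuple[int, int]]:
    """Search small exponents (a, b) for which D**a / P**b is nearly constant."""
    for a, b in product(range(1, 4), repeat=2):
        terms = [d ** a / p ** b for d, p in data]
        if spread(terms) < tolerance:
            return a, b
    return None


# Discovery: a pattern in the data already observed.
law = find_invariant(observed)
print(law)  # (3, 2)

# Predictive test: a separate step applied to a new observation (Saturn).
a, b = law
constant = sum(d ** a / p ** b for d, p in observed) / len(observed)
new_d, new_p = 9.537, 29.457
holds = abs(new_d ** a / new_p ** b - constant) / constant < 0.01
print(holds)  # True
```

The discovery step makes no claim about unobserved planets; only the second step extrapolates, and it could have failed.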
Simon’s comments on Kuhn’s philosophy are principally concerned with Kuhn’s distinction between normal and revolutionary science. Kuhn maintained that the revolutionary transition is a gestalt switch, while Simon defends his own view that heuristic-search procedures apply to revolutionary changes as well as to normal science. In his “Scientific Discovery and the Psychology of Problem Solving” Simon says that his theory of scientific discovery rests on the hypothesis that there are no qualitative differences between the processes of revolutionary science and those of normal science, between work of high creativity and journeyman work respectively. Simon points to the fact that trial and error occurs in both types of work. He argues that trial and error are most prominent in those areas of problem solving where the heuristics are least powerful, that is, are least adequate to narrow down the problem space, such that the paths of thought leading to discoveries often regarded as creative might be expected to provide even more visible evidence of trial and error than those leading to relatively routine discoveries. Later in his Scientific Discovery Simon develops the idea of the amount of trial-and-error search into the distinction between “strong” methods, which he says resemble normal science, and “weak” methods, which resemble revolutionary science. He identifies expert systems based principally on productions, where there may be almost no search needed for problem solving, as paradigmatic cases of strong methods exemplifying normal science. Simon’s argument that trial and error is used in all types of discovery is his defense of the heuristic method.
But method is only one aspect of his theory of problem solving; there is also the definition of the problem space. He acknowledges that scientific work involves not only solving problems but also posing them, that correct question asking is as important as correct question answering. And he notes that Kuhn’s distinction between normal and revolutionary science is relevant to the relation of question asking to question answering. In the 1966 article Simon identifies the problem space, which is the problem solver’s point of view of the outer environment, with Kuhn’s idea of paradigm, and he identifies defining the problem space with the process of problem formation. Firstly Simon notes that normal science need not pose its own questions, because its questions have already been formulated for it by the current paradigm produced by the most recent scientific revolution. The problem space is thus given by the current state of the science. The problematic case is the scientific revolution, which establishes the new paradigm. Simon argues that it is not necessary to adduce entirely new mechanisms to account for problem formulation in revolutionary science, because, as Kuhn says, the paradigms of any given revolution arise out of the normal science of the previous period. Normal-science research leads to the discovery of anomalies, which are new problems that the prospective revolutionaries address.
Therefore Simon argues that there is no need for a separate theory of problem formulation for scientific revolutions. He states that a theory of scientific discovery adequate to explain revolutionary as well as normal science must account not only for the origin of the problems, but also for the origins of representations, namely the problem spaces or paradigms. Representations arise by modification and development of previous representations, as problems arise by modification and development of previous problems. A system intended to explain human problem solving and scientific discovery need not incorporate a highly powerful mechanism for inventing completely novel representations. Even in revolutionary science the problems and representations are rooted in the past, and are not cut out of whole cloth.
Later in his “Ramsey Eliminability and the Testability of Scientific Theories” (1973) reprinted in his Models of Discovery Simon considers another objection pertaining to the problem space in revolutionary developments. The objection is that in revolutionary science the range of alternative hypotheses that constitute the problem space or representation cannot be delimited in advance. He states that this objection rests on a commonly drawn distinction between well-defined problems, which are amenable to orderly analysis such as those in normal science, and ill-defined problems, which are thought to be the exclusive domain of creativity, such as those in revolutionary science. Simon argues that the force of the objection depends on the distinctions being qualitative and not just matters of degree. He replies that there is no reason to deny that revolutionary hypotheses can be the result of some kind of recursively applicable rule of generation. He cites as an example of a revolutionary discovery Mendeleev’s periodic table, which does not involve a notion of pattern more complex than that required to handle patterned letter sequences. The problem space of possible patterns in which Mendeleev was searching was of modest size, and at least half a dozen of Mendeleev’s contemporaries had noticed the pattern independently of him, although they had not exploited it as systematically or as vigorously as he did. Simon concludes that before one accepts the hypothesis that revolutionary science is not subject to laws of effective search, one should await more microscopic studies than have generally been made to date of the histories of revolutionary discoveries.
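What a pattern in a letter sequence amounts to can be made concrete with a short sketch. It is an illustrative assumption and not Simon’s own program: the functions find_pattern and extend, and the restriction to periodic patterns with a constant alphabet shift at each position in the cycle, are hypothetical simplifications. The rule found is recursively applicable, because each new letter is generated from a letter one period earlier.

```python
from typing import List, Optional, Tuple

A = ord("A")


def find_pattern(seq: str) -> Optional[Tuple[int, List[int]]]:
    """Find the smallest period and the per-position alphabet shifts, if any."""
    codes = [ord(c) - A for c in seq]
    for period in range(1, len(seq) // 2 + 1):
        shifts = {}
        consistent = True
        for i in range(period, len(codes)):
            shift = (codes[i] - codes[i - period]) % 26
            # Every position in the cycle must always shift by the same amount.
            if shifts.setdefault(i % period, shift) != shift:
                consistent = False
                break
        if consistent:
            return period, [shifts[j] for j in range(period)]
    return None


def extend(seq: str, count: int = 1) -> str:
    """Apply the discovered rule recursively to generate further letters."""
    pattern = find_pattern(seq)
    if pattern is None:
        raise ValueError("no periodic shift pattern found")
    period, shifts = pattern
    letters = list(seq)
    for _ in range(count):
        i = len(letters)
        previous = ord(letters[i - period]) - A
        letters.append(chr((previous + shifts[i % period]) % 26 + A))
    return "".join(letters)


print(find_pattern("URTUSTUTTU"))  # (3, [0, 1, 0])
print(extend("URTUSTUTTU", 2))     # URTUSTUTTUUT
```

The space of candidate patterns searched here is small, a handful of periods with one shift per position, which parallels the claim that the space of possible patterns in Mendeleev’s case was of modest size.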
Later in “Artificial Intelligence Research Strategies in the Light of AI Models of Scientific Discovery” in Proceedings of the Sixth International Joint Conference on Artificial Intelligence (1979) Simon can refer to operational discovery systems. He states that discovery systems are distinguished from most other problem-solving systems in the vagueness of the tasks presented to them and of the heuristic criteria that guide the search and account for selectivity. And he adds that because their goals are very general, it is unusual for them to use the means-ends analysis commonly used for well-structured tasks, or to work backward from a desired result. The discovery system solves ill-structured tasks and works forward from the givens of the problem and then from the new concepts and variables generated from the givens. He does not reference Kuhn in this context, but the implication is that discovery systems can routinely produce revolutionary science. Then still later in his Scientific Discovery (1987) he reconsiders his earlier correlation of well-structured problems with normal science and ill-structured problems with revolutionary science. He notes that normal science is described by Kuhn as involving no development of new laws but simply the application of known laws or the development of subsidiary laws that fill in the dominant paradigm. He concludes that all discovery systems that develop new laws directly from data and not from a dominant paradigm must be productive of revolutionary science.
Simon’s difficulties in relating his ideas to Kuhn’s originate with Kuhn’s ideas, not with Simon’s. The most frequent criticism of Kuhn’s Structure of Scientific Revolutions in the philosophy-of-science literature is that his distinction between normal and revolutionary science is so vague that, with the exception of a few paradigmatic cases, his readers could not apply the distinction to specific episodes in the history of science, unless Kuhn himself had identified a particular episode as revolutionary. The attractiveness of Kuhn’s book at the time of its appearance was not its conceptual clarity, which was unimpressive; it was its welcome redirection of the philosophy profession’s interest to the history of science at a time when many philosophers of science had come to regard the logical positivist philosophy with hardly any less cynicism than Ovid had shown toward the ancient Greek and Roman pagan religion in his Metamorphoses. Simon’s discovery systems offer analytical clarity that Kuhn could not provide, with or without the Olympian irrelevance of the Russellian symbolic logic used by the logical positivists.
Nonetheless Simon’s psychological approach shares difficulties with Kuhn’s sociological approach. Philosophers’ reaction against Kuhn’s sociological approach was often due to the recognition that conformity to and deviance from a consensus paradigm may explain the behavior of scientists without explaining the success of science. Turn next to the discovery systems developed by Simon and his colleagues at Carnegie-Mellon University.
The Theory of Discovery Systems
Simon’s principal work on discovery systems for science is his Scientific Discovery: Computational Explorations of the Creative Processes (1987) co-authored with several colleagues including notably Pat Langley. Simon is the editor of the book. In the introductory section he says that the central hypothesis of the theory of scientific discovery is that the mechanisms of scientific discovery are not peculiar to that activity, but can be subsumed as special cases of the general mechanisms of problem solving. The theory of scientific discovery is therefore a theory in cognitive psychology. Simon seeks to investigate the psychology of discovery processes, and to provide an empirically tested theory of the information-processing mechanisms that are implicated in that process. The book exhibits a series of computer systems capable of making nontrivial scientific discoveries, which are actually replicated discoveries of historic scientific laws and theories including but not limited to empirical generalizations. The computer systems described in his book incorporate heuristic-search procedures to perform the kinds of selective processes that he believes scientists use to guide them in their search for regularities in data.
Simon states that an empirical test of the systems as psychological theories of human discovery processes would involve presenting the computer programs and some human subjects with identical problems, and then comparing their behaviors. The computer system can generate a “trace” of its operations, and the human subjects can report a verbal and written protocol of their behavior, while they are solving the same problem. Then the system can be tested as a psychological theory of cognitive behavior by comparing the trace with the protocol. But Simon also admits that his book supplies no detailed comparisons with human performance. And in discussions of particular applications involving particular discoveries, he notes that in some cases the historical discoveries were actually performed differently than his systems performed the rediscoveries. The interest in this book therefore is actually system design and performance rather than psychological testing and reporting.
Simon states that he wishes to provide some foundations for a normative theory of discovery, which is to say, to write a how-to-make-discoveries book. He explains that by this he does not mean a set of rules for deriving theories conclusively from observations. Instead, he wishes to propose a set of criteria for judging the efficacy and efficiency of the processes used to discover scientific theories. Accordingly Simon sets forth a satisficing rationality postulate for the scientist: to use the best means he has available – the best heuristics – for narrowing the search down to manageable proportions, even though this effort may result in excluding some good solution candidates. If the novelty of the scientific problem requires much search, this large amount of search is rational if it employs all the heuristics that are known to be applicable to the domain of the problem. Thus, his rationality postulate for the scientist is a bounded-rationality postulate, not only due to the limits imposed by the computer’s memory capacity and computational speed, but also due to the limit imposed by the inventory of available heuristics.