HERBERT SIMON, PAUL THAGARD, PAT LANGLEY AND OTHERS ON DISCOVERY SYSTEMS
BOOK VIII - Page 4
Langley’s BACON and Other Discovery Systems
Pat Langley is presently Professor of Computer Science at the University of Auckland, New Zealand, Director of the Institute for the Study of Learning and Expertise, and Professor of Computing and Informatics and Head of the Computing Learning Laboratory at Arizona State University. He is also Consulting Professor of Symbolic Systems and Computational Mathematics and Engineering at Stanford University. On his web site he reports that his research interests revolve around computational learning and discovery, and especially their rôle in constructing scientific models.
In his Novum Organum (Book I, Ch. LXI) Francis Bacon had expressed the view that with a few easily learned rules or methods it may be possible for anyone undertaking scientific research to be successful. And he proposed a method of discovery in the sciences, which will leave little to the sharpness and strength of men’s wits, but will instead bring all wits and intellects nearly to a level. For as in drawing a straight line or in inscribing an accurate circle by the unassisted hand, much depends on its steadiness and practice, but if a rule or pair of compasses be applied, little or nothing depends upon skill, so exactly is it with his method. Computer discovery systems do not quite warrant Bacon’s optimism, but they are a huge improvement over the inexplicable and mysterious intuition so dear to romantics. Today Bacon’s agenda is called proceduralization for mechanization, and it is appropriate therefore that Pat Langley’s early and successful discovery system should be named BACON.
The BACON discovery system is a set of successive and increasingly sophisticated discovery systems that make quantitative theories from data. Given sets of observation measurements for several variables, BACON searches for functional relations among the variables. The search heuristics in earlier versions of each BACON computer program are carried forward into later ones, and the later versions contain new heuristics that are more sophisticated than those in earlier versions. In the literature describing the BACON systems each successive version is identified by a numerical suffix, such as BACON.1. The original version, BACON.1, was designed and implemented by Langley in 1979 for his Ph.D. dissertation written in the Carnegie-Mellon department of psychology under the direction of Simon. The dissertation is titled Descriptive Discovery Processes: Experiments in Baconian Science. Langley published descriptions of the system in “Bacon.1: A General Discovery System” in The Proceedings of the Second National Conference of the Canadian Society for Computational Studies in Intelligence (1978) and as a co-author in Simon’s Scientific Discovery (1987).
The BACON programs are implemented in a list-processing computer language called LISP, and their discovery heuristics are implemented in a production-system language called PRISM. The system internally orders the observed measurement data monotonically according to the values of one of the variables, and then determines whether the values of some other variables follow the same (or the inverse) ordering. Picking one of these other variables it searches for an invariant by considering the ratio (or the product) of this variable with the original one. If the ratio or product is not constant, it is introduced as a new variable, and the process repeats the search for invariants. Examples of some of the simpler search heuristics expressed in the conditional form of a production are as follows: (1) If the values of a variable are constant, then infer that the variable always has that value. (2) If the values of two numerical variables increase together, then examine their ratio. (3) If the values of one variable increase as those of another decrease, then examine their product. The general strategy used with these heuristics is to create variables that are ratios or products, and then to treat them as data from which still other terms are created, until a constant is identified by the first heuristic.
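The three heuristics above can be sketched in a few lines of Python. This is only an illustrative reconstruction, not Langley's LISP/PRISM code: the function names, the representation of each derived term as a tuple of integer exponents, the tolerance, and the idealized planetary data (period equal to distance raised to the power 1.5) are all assumptions made for the example.

```python
from itertools import combinations


def direction(values):
    """Return +1 if strictly increasing, -1 if strictly decreasing, else 0."""
    steps = list(zip(values, values[1:]))
    if all(a < b for a, b in steps):
        return 1
    if all(a > b for a, b in steps):
        return -1
    return 0


def is_constant(values, tol=1e-6):
    """Heuristic (1): values agree within a relative tolerance."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tol * abs(mean) for v in values)


def evaluate(exponents, columns):
    """Value of the term prod(variable ** exponent) for each observation."""
    result = []
    for row in zip(*columns):
        value = 1.0
        for x, e in zip(row, exponents):
            value *= x ** e
        result.append(value)
    return result


def find_invariant(columns, max_rounds=5):
    """Search for a constant term; returns its exponents or None."""
    n = len(columns)
    # Start with the raw variables: exponent 1 on one variable, 0 elsewhere.
    terms = [tuple(int(i == j) for j in range(n)) for i in range(n)]
    for term in terms:
        if is_constant(evaluate(term, columns)):
            return term
    for _ in range(max_rounds):
        new_terms = []
        for a, b in combinations(terms, 2):
            sign = direction(evaluate(a, columns)) * direction(evaluate(b, columns))
            if sign == 0:
                continue
            # Heuristic (2): same ordering -> ratio a/b (subtract exponents).
            # Heuristic (3): inverse ordering -> product a*b (add exponents).
            candidate = tuple(x - sign * y for x, y in zip(a, b))
            if not any(candidate) or candidate in terms or candidate in new_terms:
                continue  # skip the trivial term and terms already defined
            if is_constant(evaluate(candidate, columns)):
                return candidate
            new_terms.append(candidate)
        if not new_terms:
            return None
        terms.extend(new_terms)
    return None


# Idealized (hypothetical) data, already ordered by distance.
distance = [1.0, 4.0, 9.0, 16.0]
period = [1.0, 8.0, 27.0, 64.0]
kepler = find_invariant([distance, period])   # exponents of (distance, period)

pressure = [1.0, 2.0, 4.0]
volume = [8.0, 4.0, 2.0]
boyle = find_invariant([pressure, volume])
```

On these data the search passes through the intermediate terms distance/period and distance²/period before reaching the constant distance³/period², and it reaches the constant product of pressure and volume in a single step.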
BACON.1 has replicated the discoveries of several historically significant empirical laws including Boyle’s law of gases, Kepler’s third planetary law, Galileo’s law of motion of objects on inclined planes, and Ohm’s law of electrical current. A later version named BACON.3 has rediscovered Coulomb’s law of electrostatic force. For making these discovery replications Simon and his associates used measurement data actually used by the original discoverers. Simon’s book cites W.F. Magie’s A Source Book in Physics (1935) as the source.
BACON.4 is a significant improvement over earlier versions. It was developed and first described by Gary Bradshaw, Pat Langley, and Herbert Simon in “The Discovery of Intrinsic Properties” in The Proceedings of the Third National Conference of the Canadian Society for Computational Studies in Intelligence (1980), and it is also described in their 1987 book Scientific Discovery. The improvement is the ability to use nominal or symbolic variables that take only names or labels as values. For example the nominal variable “material” may take on values such as “lead”, “silver”, or “water.” Values for numerical properties may be associated with the values of the nominal variables, such as the density of lead, which is 11.34 grams per cubic centimeter. BACON.4 has heuristics for discovering laws involving nominal variables by postulating associated values called “intrinsic properties”, first by inferring a set of numerical values for the intrinsic properties for each of the postulated nominal values, and then by retrieving the numerical values when applying its numerical heuristics to discover new laws involving these nominal variables.
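The two stages just described, postulating intrinsic values and then retrieving them, can be illustrated with a small Python sketch. It is not BACON.4’s actual procedure: the battery and wire labels, the current readings, and the function names are hypothetical, and the only numerical heuristic applied is the search for a constant ratio.

```python
# Hypothetical current readings for every (battery, wire) combination.
# Both variables are nominal: their values are only labels.
currents = {
    ("A", "X"): 2.0, ("A", "Y"): 4.0, ("A", "Z"): 8.0,
    ("B", "X"): 3.0, ("B", "Y"): 6.0, ("B", "Z"): 12.0,
}


def postulate_intrinsic(readings, reference_battery):
    """Stage 1: with one battery held fixed, assign each wire an intrinsic
    numerical value equal to the current observed for that wire."""
    return {wire: value
            for (battery, wire), value in readings.items()
            if battery == reference_battery}


def constant_ratios(readings, intrinsic, tol=1e-9):
    """Stage 2: retrieve the stored intrinsic values and test, battery by
    battery, whether current / intrinsic value is constant across wires."""
    ratios = {}
    for (battery, wire), value in readings.items():
        ratios.setdefault(battery, []).append(value / intrinsic[wire])
    return {battery: rs[0]
            for battery, rs in ratios.items()
            if max(rs) - min(rs) <= tol * abs(rs[0])}


wire_property = postulate_intrinsic(currents, "A")
battery_property = constant_ratios(currents, wire_property)
```

In this toy case the value stored for each wire behaves like a conductance, and the constant ratio found for each battery behaves like a voltage measured relative to battery A.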
The discoveries of laws replicated by BACON.4 include: (1) Ohm’s law of electrical circuits, where the intrinsic properties are voltage and resistance, (2) Archimedes’ law of displacement, where the intrinsic properties are density and the volume of an irregular object, (3) Black’s law of specific heat, where specific heat is the intrinsic property, (4) Newton’s law of gravitation, where gravitational mass is the intrinsic property, and (5) the law of conservation of momentum, where the inertial mass of objects is the intrinsic property. BACON.4 was further enhanced so that it could rediscover the laws describing chemical reactions formulated by Dalton, Gay-Lussac, and Cannizzaro. For example it replicated the discovery of Gay-Lussac’s principle that the relative densities of elements in their gaseous form are proportionate to their corresponding molecular weights. Replicating discovery of these laws in quantitative chemistry involved more than postulating intrinsic properties and noting recurring values. These chemists found that a set of values could be expressed as small integer multiples of one another. This procedure required a new heuristic that finds common divisors. A common divisor is a number which, when divided into a set of values, generates a set of integers. BACON.4 uses this method of finding common divisors, whenever a new set of dependent values is assigned to an intrinsic property.
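A common-divisor heuristic of this kind might be sketched as follows. The search strategy (dividing the smallest value by successive small integers), the limit on the multiple, the tolerance, and the sample values are assumptions for illustration and are not taken from the BACON.4 program.

```python
def common_divisor(values, max_multiple=10, tol=1e-6):
    """Return (divisor, integer multiples) such that every value is a small
    integer multiple of the divisor, or None if no such divisor is found."""
    smallest = min(values)
    for k in range(1, max_multiple + 1):
        # Try the largest candidate first: smallest, smallest/2, smallest/3, ...
        divisor = smallest / k
        multiples = [v / divisor for v in values]
        if all(abs(m - round(m)) <= tol for m in multiples):
            return divisor, [round(m) for m in multiples]
    return None


# Hypothetical values that are whole multiples of 1.5.
first = common_divisor([1.5, 3.0, 4.5, 7.5])
second = common_divisor([3.0, 4.5])
```

Trying the larger candidates first yields the greatest divisor that works, and hence the smallest set of integers.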
BACON.5 is the next noteworthy improvement. It uses analogical reasoning for scientific discovery. BACON.1 through BACON.4 are driven by data in search for regularities in measurement data. Furthermore the heuristics in these previous BACON systems are almost entirely free from theoretical presuppositions about domains from which the data are drawn. BACON.5 incorporates a heuristic for reducing the amount of search for laws, where the system is given very general theoretical postulates. Then it reasons analogically by postulating symmetries between the unknown law and a theoretical postulate given to the system as input. The general theoretical postulate given to BACON.5 is the law of conservation. The laws rediscovered by BACON.5 using analogy with the conservation law include the law of conservation of momentum, Black’s law of specific heat, and Joule’s law of energy conservation.
The BACON discovery system was not the first system developed around Simon’s principles of human problem solving with heuristics. In 1976 Douglas B. Lenat, presently CEO of Cycorp, Inc. of Austin, Texas, published his Ph.D. dissertation titled AM: An Artificial Intelligence Approach to Discovery in Mathematics as Heuristic Search written at Stanford University. Allen Newell was one of his dissertation advisors, and Lenat acknowledges that he got his ideas from Herbert Simon. Lenat has since accepted a faculty position in the computer science department of Carnegie-Mellon University.
In 1977 Lenat published “The Ubiquity of Discovery” in The Proceedings of the Fifth International Joint Conference on Artificial Intelligence (IJCAI), in which he describes Simon’s theory of heuristic problem solving in science and the specific heuristics in his AM discovery system. While Lenat’s article includes discussion of artificial intelligence in empirical science, his AM computer system is not for empirical science, but develops new mathematical concepts and conjectures with the heuristic strategy. He also published “Automated Theory Formation in Mathematics” in the 1977 IJCAI Proceedings. This paper offers a more detailed description of the system’s two hundred fifty heuristics, and also discusses his application of the AM system in elementary mathematics. He reports that in one hour of processing time AM rediscovered hundreds of common mathematical concepts including singleton sets, natural numbers, arithmetic, and unique factorization.
In 1979 Simon published “Artificial Intelligence Research Strategies in the Light of AI Models of Scientific Discovery” in The Proceedings of the Sixth International Joint Conference on Artificial Intelligence. In this paper he considers Lenat’s AM system and Langley’s BACON systems as useful for illuminating the history of the discovery process in the domain of artificial intelligence (AI) itself, and for providing some insight into the ways to proceed in future research and development aimed at new discoveries in that field. He says that AI will proceed as an empirical inquiry rather than as a theoretically deductive one, and that principles for the discipline will be inferred from the computer programs constituting the discovery systems. Interestingly he notes that in a scientific profession the community members’ work is in parallel, whereas in the machines the work proceeds serially.
BACON created quantitative empirical laws by examination of measurement data. Simon and his associates also designed and implemented discovery systems that are capable of creating qualitative laws from empirical data. Three such systems named GLAUBER, STAHL and DALTON are described in Scientific Discovery (1987). The GLAUBER discovery system developed by Langley in 1983 is named after the seventeenth-century chemist, Johann Rudolph Glauber, who contributed to the development of the acid-base theory. For its historical reconstruction of the acid-base theory GLAUBER was given facts known to eighteenth-century chemists, before they formulated the theory of acids and bases. These facts consist of information about the tastes of various substances and the reactions in which they take part. The tastes are “sour”, “bitter”, and “salty.” The substances are “acids”, “alkalis” and “salts” labeled with common names, which for purposes of convenience are the contemporary chemical names of these substances, but GLAUBER makes no use of the analytical information in the modern chemical symbols. Associated with these common names for chemical substances are argument names, such as “input” and “output”, that describe the rôles of the chemical substances in the chemical reactions in which the substances partake. Finally the system is given names for the three abstract classes: “acid”, “alkali”, and “salt.” When the system is executed with these inputs, it examines the chemical substances and their reactions, and then correlates the tastes to the abstract classes, and describes the reactions in a general law that states that acids and alkalis react to produce salts.
The second discovery system is STAHL developed by Jan Zytkow. From 1982 to 1984 he was a visiting professor at Carnegie-Mellon and worked with Simon and Langley. STAHL creates a type of qualitative law that Simon calls “componential”, because it describes the hidden structural components of substances. System STAHL is named after the German chemist, Georg Ernst Stahl, who developed the phlogiston theory of combustion. STAHL replicates the development of both the phlogiston and the oxygen theories of combustion. Simon states that discovery systems should be able to arrive at laws that have later been rejected in favor of newer theories in the history of science. And he says that since a discovery system’s historical reconstruction aims at grasping the main currents of reasoning in a given epoch, then reproducing the errors that were typical of that epoch is diagnostic. Like GLAUBER, STAHL accepts qualitative facts as inputs, and generates qualitative statements as outputs. The input is a list of chemical reactions, and its initial state consists of a set of chemical substances and their reactions represented by common names and argument names, as they are in GLAUBER.
When executed, the system generates a list of chemical elements and of the compounds in which the elements are components. The intermediate states of STAHL’s computation consist of transformed versions of initial reactions and of inferences about the components of some of the substances. When the system begins running, it is driven by data, but after it has made conjectures about the hidden structures, it is also driven by these conjectures, which is to say, by theory. Simon concludes from the rediscovery of the phlogiston and oxygen theories by STAHL, that the proponents of the two theories reasoned in essentially the same ways, and that they differed mainly in their assumptions. He also applied STAHL to the rediscovery of Black’s analysis of magnesia alba, and he says that the same principles of inference were widely used by chemists in their search for componential explanations of chemical substances and their reactions. Thus he claims that the procedures in STAHL are not ad hoc, and that STAHL is a general system.
The third discovery system that creates qualitative laws is DALTON, which is named after John Dalton. Like Dalton the chemist, the DALTON system does not invent the atomic theory of matter; it employs a representation that embodies the hypothesis, and that incorporates the distinction between atoms and molecules invented earlier by Amedeo Avogadro. DALTON is a theory-driven system for reaching the conclusions about atomic weights that BACON.4 derived in a data-driven manner. And DALTON creates structural laws in contrast to STAHL, which creates componential laws. DALTON is given information that is similar to what was available to chemists in 1800. The input includes a set of reactions and knowledge of the components of the chemical substances involved in each reaction. This is the type of information outputted by STAHL, and DALTON uses the same common-name/argument-name scheme of representation used by STAHL. DALTON is also told which of the substances are elements having no components other than themselves. And it knows that the number of molecules in each chemical substance is important in the simplest form of a reaction, and that the number of atoms of each element in a given molecule is also important. DALTON’s goal is to use this input to develop a structural model for each reaction and for each of the substances involved in each reaction, subject to two constraints. The first constraint is that the model of a molecule of a substance must be the same for all reactions in which it is present. The second constraint is that the models of the reactions display the conservation of particles. Simon applied DALTON to the reaction involving the combination of hydrogen and oxygen to form water, and the system outputted a model giving a modern account of the reaction.
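The effect of the two constraints can be illustrated with a brute-force Python sketch. It is not the DALTON program: the molecule counts per reaction are assumed to be given in advance, every input substance is assumed to be an element, the second reaction (hydrogen with chlorine) is added here only to show the first constraint at work, and "simplest model" is taken to mean the fewest atoms per elemental molecule.

```python
from itertools import product

# Assumed input: molecule counts for the inputs and for the single output.
REACTIONS = [
    ({"hydrogen": 2, "oxygen": 1}, ("water", 2)),
    ({"hydrogen": 1, "chlorine": 1}, ("hydrogen chloride", 2)),
]
ELEMENTS = ["hydrogen", "oxygen", "chlorine"]


def derive_compounds(atoms_per_molecule, reactions):
    """Apply conservation of particles to each reaction. Return the atoms per
    molecule of every compound, or None if a constraint is violated."""
    compounds = {}
    for inputs, (compound, n_out) in reactions:
        formula = {}
        for element, n_in in inputs.items():
            total_atoms = n_in * atoms_per_molecule[element]
            if total_atoms % n_out:
                return None  # atoms cannot be shared evenly among the outputs
            formula[element] = total_atoms // n_out
        # A substance must have the same model in every reaction it appears in.
        if compounds.setdefault(compound, formula) != formula:
            return None
    return compounds


def simplest_model(reactions, elements, max_atoms=4):
    """Try elemental molecules with few atoms first; return the first model
    that satisfies both constraints."""
    candidates = sorted(product(range(1, max_atoms + 1), repeat=len(elements)),
                        key=sum)
    for atoms in candidates:
        atoms_per_molecule = dict(zip(elements, atoms))
        compounds = derive_compounds(atoms_per_molecule, reactions)
        if compounds is not None:
            return atoms_per_molecule, compounds
    return None


element_models, compound_models = simplest_model(REACTIONS, ELEMENTS)
```

Because the hydrogen molecule must have the same model in both reactions, the search is forced to a diatomic hydrogen molecule, and water then comes out with two hydrogen atoms and one oxygen atom.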
Since the publication of Scientific Discovery Simon and his associates have continued their work on discovery systems and have pursued their work in new directions. While BACON and the other systems described in the 1987 book are concerned mainly with the ways in which theories can be generated from empirical data, the question of where the data come from has largely been left unanswered. In “The Process of Scientific Discovery: The Strategy of Experimentation” (1988) in Models of Thought Simon and Deepak Kulkarni describe their KEKADA discovery system, which examines not only the process of hypothesis formation, but also the process of designing experiments and programs of observation. The KEKADA discovery system is constructed to simulate the sequence of experiments carried out by Hans Krebs and his colleague, Kurt Henseleit, between July 1931 and April 1932, which produced the elucidation of the chemical pathways for synthesis of urea in the liver. This discovery of the ornithine cycle was the first demonstration of the existence of a cycle in metabolic biochemistry. Simon and Kulkarni’s source for this episode is “Hans Krebs and the Discovery of the Ornithine Cycle” in Federation Proceedings (1980) by Frederic L. Holmes of Yale University. Holmes also made himself available to Simon and Kulkarni for consultation in 1986 when their study was in progress.
The organization of KEKADA is based on a two-space model of learning proposed earlier by Simon and Lea in “Problem Solving and Rule Induction: A Unified View” in Knowledge and Cognition (1974). The system searches in an “instance space” and a “rule space”, each having its own set of heuristics. The instance space is defined by the possible experiments and experimental outcomes, and it is searched by performing experiments. The rule space is defined by the hypotheses and other higher level descriptions coupled with associated measures of confidence. The system proceeds through cycles in which it chooses an experiment from the instance space on the basis of the current state of the rule space, and the outcome modifies the hypotheses and confidences in the rule space.
One of the distinctive characteristics of KEKADA is its ability to react to surprising experimental outcomes, and to attempt in response to explain the puzzling phenomenon. Prior to carrying out any experiment, expectations are formed by expectations setters, which are a type of heuristic for searching the rule space, and the expectations are associated with the experiment. The expectations consist of expected output substances of a reaction, and expected upper and lower bounds on the quantities or the rates of the outputs. If the result of the experiment violates these bounds, it is noted as a surprise. Comparison of the course of the work of Krebs as described by Holmes and of the work of KEKADA in its simulation of the discovery reveals only minor differences, which Simon and Kulkarni say can be explained by shifts in the focus of attention and by small differences in the initial knowledge with which Krebs and KEKADA started. The authors also say that a manual simulation of the path that Krebs followed in a second discovery, that of the glutamine synthesis, is wholly consistent with the theory set forth by KEKADA. They therefore conclude that the structure and heuristics in KEKADA constitute a model of discovery that is of wider applicability than the episode used to develop the system, and that the system is therefore not ad hoc.
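The surprise mechanism described above amounts to comparing observed quantities against expected bounds. A minimal Python sketch follows; the class name, the field names, and the numerical bounds are hypothetical and are not drawn from the KEKADA program.

```python
from dataclasses import dataclass


@dataclass
class Expectation:
    """Expected output substance with lower and upper bounds on its quantity."""
    substance: str
    lower: float
    upper: float


def find_surprises(expectations, observed):
    """Return (substance, amount) for every outcome outside its bounds.
    A substance that is absent from the outcome counts as amount 0.0."""
    surprises = []
    for expectation in expectations:
        amount = observed.get(expectation.substance, 0.0)
        if not (expectation.lower <= amount <= expectation.upper):
            surprises.append((expectation.substance, amount))
    return surprises


# Hypothetical expectation attached to an experiment before it is run.
expected = [Expectation("urea", lower=0.0, upper=5.0)]
unsurprising = find_surprises(expected, {"urea": 3.0})
surprising = find_surprises(expected, {"urea": 12.0})
```

In a fuller model each recorded surprise would become a new item on the agenda, prompting the search for hypotheses that explain the puzzling outcome.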
More recently in “Two Kinds of Knowledge in Scientific Discovery” (2010) Langley and Bridewell at the Institute for the Study of Learning and Expertise in Palo Alto, CA, describe a computational approach that carries out search through a problem space for a “reasonable” explanation, i.e. one that is “interpretable”, because it is familiar to scientists. In general their approach models processes with constraints – the processes provide the content from which scientists construct models, while the constraints correspond to theoretical principles about how to combine processes. Their discovery system is called “inductive process modeling”, IPM, which they define as: given (1) observations for a set of continuous variables as they change over time, (2) generic entities that have properties relevant to the observed dynamics, (3) generic processes that specify causal relations among entities using generalized functional forms, and (4) a set of entities present in the modeled system – then find a specific process model that, when given initial values for the modeled variables and values for any exogenous variables, explains the observation data and predicts unseen data accurately. A system that carries out these steps would produce a model that links domain knowledge to scientific data, and importantly the model would explain the measured phenomena in a formalism much like a scientist’s own.
A technical justification for Simon’s heuristic-search approach followed by Langley is the view that the alternative combinatorial generate-and-test approach would require excessive computer resources. The search employs generic processes, which are a form of background knowledge that defines the space of candidate models, and modeling constraints that are another type of scientific knowledge. But the authors also invoke a philosophical justification for the IPM system design: they say that scientists call upon theory-level constraints, in order to exclude “implausible models”. Using theory-level constraints the system searches through a problem space for a “reasonable” explanation that is acceptable to scientists, because it is based on a relevant theory. This discovery strategy implements what Hickey calls “theory extension”. The authors also state that the causal explanatory content of the model stems from its relationship to scientific concepts and not from the equations themselves, and that equations without a theoretical interpretation provide a description of system dynamics, but are not explanations. Thus two kinds of scientific knowledge are distinguished: theory-based “explanations” and data-based “descriptions”. This philosophy is at variance with contemporary pragmatism and its theses of relativized semantics and ontological relativity.
The authors illustrate the system by using it to develop the Lotka–Volterra equations for population dynamics in protists. Stereotypically foxes prey upon rabbits, which in the absence of the predators would overgraze, thus starving the rabbits and then the foxes. The result is a “quantitative process model”, which in “Discovering Ecosystem Models from Time-Series Data” (2003) by Langley et al. is defined as a set of processes, each specifying one or more algebraic or differential equations that denote causal relations among variables along with optional activation conditions. The IPM system uses a nonlinear optimization routine called “beam search” to estimate parameter values in the equations. In computer science beam search is a search algorithm that explores by expanding the most promising nodes or states in a tree diagram. At each level of the tree it generates all successors of the states at the current level, sorts them, and then stores only a predetermined number of best states at each level, called the “beam width”. But it is not an optimizing algorithm, because at the end of the tree the search may or may not have found the optimum state.
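The general beam-search procedure just described can be sketched in Python as follows. This is the generic textbook algorithm, not the IPM code; the tiny tree and its scores are invented to show why the result depends on the beam width.

```python
def beam_search(start, successors, score, beam_width, depth):
    """Expand level by level, keeping only the beam_width best states."""
    beam = [start]
    for _ in range(depth):
        # Generate all successors of the states at the current level.
        candidates = [child for state in beam for child in successors(state)]
        if not candidates:
            break
        # Sort them and store only the best beam_width states.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return max(beam, key=score)


# Hypothetical tree: the best leaf ("b1") lies under the less promising branch.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORE = {"root": 0, "a": 5, "b": 4, "a1": 1, "a2": 2, "b1": 9, "b2": 3}


def children(state):
    return TREE.get(state, [])


narrow = beam_search("root", children, SCORE.get, beam_width=1, depth=2)
wide = beam_search("root", children, SCORE.get, beam_width=2, depth=2)
```

With a beam width of one the search commits to branch "a" and never examines "b1", while a beam width of two retains branch "b" long enough to find it; this is the sense in which beam search may or may not find the optimum state.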
In their Lotka–Volterra demonstration the entities have the types predator and prey, and each type has a variable that stores its population size. The system includes processes and entities related to population dynamics, and it is applied to predator–prey experiments between microscopic species using time-series data collected by Jost and Arditi reported in “Identifying predator-prey processes from time series” in Theoretical Population Biology, 57 (2000) 325-337 and by Veilleux reported in “An analysis of predatory interaction between Paramecium and Didinium” in The Journal of Animal Ecology, 48, (1979) 787–803. The model generated by IPM successfully predicted that, when the predator population is high, the prey population decreases exponentially with predation controlled by multiplicative equations that add predators for each prey that is consumed.
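For readers unfamiliar with the equations, the standard textbook form of the Lotka–Volterra model can be written in a few lines of Python. The parameter names, the parameter values, the starting populations, and the simple Euler integration are assumptions for illustration; they are not the values or the method that IPM fitted to the Jost and Arditi or Veilleux data.

```python
def derivatives(prey, predator, growth, predation, conversion, death):
    """Standard Lotka-Volterra rates of change.
    Prey grow exponentially and are removed by a multiplicative predation term;
    predators are added in proportion to prey consumed and die exponentially."""
    d_prey = growth * prey - predation * prey * predator
    d_predator = conversion * prey * predator - death * predator
    return d_prey, d_predator


def simulate(prey, predator, params, dt=0.01, steps=1000):
    """Integrate the two equations with simple Euler steps."""
    trajectory = [(prey, predator)]
    for _ in range(steps):
        d_prey, d_predator = derivatives(prey, predator, **params)
        prey += dt * d_prey
        predator += dt * d_predator
        trajectory.append((prey, predator))
    return trajectory


# Hypothetical parameter values.
PARAMS = {"growth": 1.0, "predation": 0.5, "conversion": 0.5, "death": 1.0}

# With many predators present, the prey population is falling.
prey_rate, predator_rate = derivatives(2.0, 6.0, **PARAMS)

# At prey = death/conversion and predator = growth/predation nothing changes.
equilibrium_rates = derivatives(2.0, 2.0, **PARAMS)

trajectory = simulate(2.0, 6.0, PARAMS)
```

An inductive process modeling system works in the opposite direction: it is given the time series and must recover both the structure of such equations and their parameter values.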
In their “Integrated Systems for Inducing Spatio-Temporal Process Models” (2010) Pat Langley, Chunki Park, and Will Bridewell describe a more sophisticated system they call SCISM, an “integrated intelligent system”. SCISM addresses the task of inducing process models that account for both spatial and temporal variation, and is furthermore integrated with a constraint learning method to reduce computation during induction. Once provided with background knowledge consisting of spatio-temporal data and the knowledge encoded in a library of generic processes and entities, SCISM has a learning component that searches through the space of possible models. This part of the system integrates an algorithm for exploring the space of model structures with one for estimating the parameters of a particular structure. The combined procedure for model generation has three steps: 1. Generate all possible instantiations of generic processes with specific entities but without parameter values. 2. Combine instantiated processes to form a generic model that satisfies all the structural constraints. 3. Estimate the parameter values and score each model’s fit to the data. After this search the system returns the quantitative process model that best accounts for the data.
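The first two of the three steps can be illustrated with a short Python sketch; the third step, parameter estimation and scoring, is omitted. The process library, the entity names, and the single structural constraint (every entity must take part in at least one process) are hypothetical and are not taken from SCISM.

```python
from itertools import combinations, product

# Hypothetical library: each generic process lists the entity types it relates.
GENERIC_PROCESSES = {
    "growth": ("prey",),
    "death": ("predator",),
    "predation": ("predator", "prey"),
}
# Hypothetical entities present in the modeled system, with their types.
ENTITIES = {"paramecium": "prey", "didinium": "predator"}


def instantiate(generic_processes, entities):
    """Step 1: bind every generic process to all type-compatible entities,
    leaving the parameter values unspecified."""
    instances = []
    for name, roles in generic_processes.items():
        choices = [[e for e, kind in entities.items() if kind == role]
                   for role in roles]
        for binding in product(*choices):
            instances.append((name, binding))
    return instances


def candidate_structures(instances, entities):
    """Step 2: combine instantiated processes into model structures and keep
    those that satisfy the (assumed) structural constraint."""
    structures = []
    for size in range(1, len(instances) + 1):
        for subset in combinations(instances, size):
            involved = {entity for _, binding in subset for entity in binding}
            if involved == set(entities):
                structures.append(subset)
    return structures


instances = instantiate(GENERIC_PROCESSES, ENTITIES)
structures = candidate_structures(instances, ENTITIES)
```

Even in this toy case the constraint prunes the candidate structures from seven to five; with a realistic library the number of combinations grows quickly, which is why the authors add learned constraints to reduce computation.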
Two decades earlier Langley and Shrager had described their philosophy of science more elaborately in Computational Models of Scientific Discovery and Theory Formation (1990). The book reports on a symposium with twenty-four contributors including Simon, Thagard and Langley. In the introductory chapter titled “Computational Approaches to Discovery” the editors affirm the cognitive-psychology conceptualization of the computational approach, and divide scientific behavior into “knowledge structures” and “knowledge processes”.
The knowledge structures include: (1) “observations”, which represent recordings of the environment made by sensors or measuring instruments, (2) “taxonomies”, which define or describe concepts for a domain along with specialization relations among them, (3) “laws”, which are statements that summarize relations among observed variables, objects or events, (4) “theories”, which are hypotheses about the structures or processes in the environment, and which describe unobservable objects or mechanisms, (5) “background knowledge”, which is a set of beliefs or knowledge about the environment aside from those that are specifically under study, (6) “models”, which are descriptions of the environmental conditions for an experimental or observational setting, (7) “explanations”, which are narratives that connect a theory to a law by a chain of inferences appropriate to the field.
Langley proposes that the knowledge processes that use these structures should include the following: (1) the “observation” process inspects the environmental setting by training an instrument, sometimes only the agent’s senses, on that setting to produce a concrete description, (2) “taxonomy formation and revision” involves the generation of empirical laws that cover observed data, (3) “theory formation and revision” generates a theory from which one can derive the laws for a given model by explanation, thereby interconnecting a set of laws into a unified account, (4) “deductive law formation” produces laws from a theory by using an explanatory framework to deduce both a law and an explanation of how that law derives from the theory, (5) the “explanation” process connects a theory to a law by a narrative whose general form is given by the field’s explanatory framework, (6) “experimental design” generates models of settings in which observations are made.
The authors call the above conceptualization the “classical view of science”. To the extent that it contains any systematic philosophy of language, this philosophy of science is unquestionably positivist. It has positivism’s identifying dichotomy between theories and laws on the basis of unobservables that echoes Mach and other earlier positivists, and its characteristic organization of levels consisting of theories explaining laws and of laws explaining observations and data that echoes Duhem and the positivists of the Vienna Circle. On Langley’s version the models function as the empirical laws in Duhem’s schema. Like all positivist views, Langley’s “classical view” is a philosophy of science that is based on the old paradigm of Newtonian physics, not on the newer pragmatic paradigm of contemporary quantum theory. The stereotypic paradigm they seek to imitate is Kepler’s planetary laws of the orbit of Mars explained by Newton’s gravitational theory, while Kepler’s laws are viewed as merely descriptive summaries of the celestial observations of Mars. In other words theories “explain”, while laws or models merely “describe”. They believe that causal relations cannot be extracted from the models. This view is contrary to the contemporary pragmatist thesis, as for example the philosopher of science Norwood Russell Hanson set forth in his Patterns of Discovery.
Langley’s systems are not without interest, because there are problems in basic research that can be addressed effectively with the design of the IPM and SCISM systems. But these systems are not for big-game hunting, as it were, for new contributions to science; they are for hunting hares rather than hippos, because if the user inputs familiar theories, he will get only those familiar theories for output, and will get nothing newer much less fundamentally superior. In general the more old knowledge that is built into a system, the less new knowledge that can come out of it. Familiarity in the output may gain acceptance among the conventionally minded, but familiarity is a high price to pay at the expense of discovery; it is a Faustian bargain. The scientists whose practices are modeled by such theory-driven systems suggest what Thomas Kuhn called “normal science” – the detailing and extension of accepted paradigms, such as what Langley calls “generic processes”. The aim of Kuhn’s “normal” science is the further articulation of the familiar paradigm by a “puzzle-solving” type of research uncritical of the paradigm.
Langley’s IPM strategy applied in economics would amount to automating the Haavelmo agenda: The generic processes that are theory-inspired and deemed “causal” are the concepts of supply and demand, the generic entities are the quantities demanded and supplied, the relative price and the aggregate income constraint, and the observations are the time-series measurements for a specific industry. But historically it was fidelity to these familiar classical ideas that proved to be the biggest obstruction to recognition of a distinctive macroeconomic perspective at the time of Keynes’ General Theory. Today Langley’s positivist ideas also echo the views of the handful of sociologists who attempt sociometric modeling while demanding sociopsychological-causal “explanations”.
It is furthermore ironic that in this book these authors should reference philosophers such as Kuhn and Feyerabend, who truculently rebelled against this “classical” positivist view. As it happens, neither Langley’s “representational structures” nor his “mechanized algorithmic processes” in discovery systems designs need be cast in the positivist context, nor need they be conceptualized in the psychologistic terms of cognitive psychology. The system designs can be conceptualized in pragmatic terms and described as language-processing systems. Furthermore scientists became pragmatic, because their historic and greatest discoveries including notably quantum theory were not what they had expected or found “reasonable”. They accepted the unexpected and “unreasonable”, because the new finding was empirically more adequate.
Indeed, in “Processes and Constraints in Explanatory Scientific Discovery” (2008) Langley and Bridewell appear to depart from the cognitive psychology interpretation of their IPM discovery systems. They state that they have not aimed to “mimic” the detailed behavior of human researchers, but that their systems address the same tasks as ecologists, biologists, and other theory-guided scientists, and that their systems carry out search through similar problem spaces. They have thus taken a step toward pragmatism and away from psychologism.
Simon’s Philosophy of Science
Simon’s literary corpus is rich enough to contain a philosophy of science that addresses all four of the functional topics.
Aim of Science
What philosophers of science call the aim of science may be taken as a rationality postulate for basic scientific research. In an “Afterword” to his autobiography titled “The Scientist as Problem Solver”, Simon explicitly applies to scientific research the thesis of bounded rationality that he developed for economics. This explicit statement would not have been necessary for the attentive reader of his literary corpus. He describes his concept of scientific discovery as a special case of his concept of human problem solving, because both concepts are based on his strategy of heuristic search. And he views his strategy of heuristic search in turn as a special case of his postulate of bounded rationality.
To this metascientific concept one need only add that Simon’s application of his thesis of bounded rationality to scientific discovery amounts to his thesis of the aim of science. The function of heuristics is to search efficiently a problem space of possible satisficing solutions, which is too large to be searched exhaustively. The limited computational ability of the scientist relative to the size of the problem space is the “computational constraint”, an incidental circumstance that bounds the scientist’s rationality and constrains the scientist from global optimization of solution search. The research scientist is therefore a satisficer, and the aim of the scientist is satisficing within both the institutional empirical and the incidental computational constraints.
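The contrast between satisficing and global optimization can be illustrated with a minimal sketch. It is an illustration only and is not taken from any of Simon's discovery systems: the function name satisfice, the aspiration threshold, the evaluation budget standing in for the computational constraint, and the toy scoring function are all assumptions introduced here.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def satisfice(
    candidates: Iterable[T],
    score: Callable[[T], float],
    aspiration: float,
    budget: int,
) -> Tuple[Optional[T], int]:
    """Return the first candidate whose score meets the aspiration level.

    The search stops as soon as a "good enough" candidate is found, or when
    the evaluation budget (a stand-in for the computational constraint) is
    spent. It never tries to prove that the candidate is the global optimum.
    """
    evaluated = 0
    for candidate in candidates:
        if evaluated >= budget:
            break
        evaluated += 1
        if score(candidate) >= aspiration:
            return candidate, evaluated
    return None, evaluated


def toy_score(x: int) -> float:
    # Hypothetical scoring function whose global optimum is at x = 40.
    return 100 - abs(x - 40)


# A problem space of 1000 candidates, searched with room for only 50 evaluations.
found = satisfice(range(1000), toy_score, aspiration=95, budget=50)
# The same space with a budget too small to reach any acceptable candidate.
exhausted = satisfice(range(1000), toy_score, aspiration=95, budget=10)

print(found)      # (35, 36)
print(exhausted)  # (None, 10)
```

In the first call the search halts at candidate 35, which satisfies the aspiration level although the optimum lies at 40; in the second call the budget runs out before any acceptable candidate is reached.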
Simon’s views on explanation and criticism may also be considered in relation to the discovery systems. Consider firstly his statements on scientific explanation including the topic of theoretical terms. The developers of the BACON systems make a pragmatic distinction between observation variables and theoretical variables in their systems. Simon notes that contemporary philosophers of science maintain that observation is theory laden, and his distinction between observational and theoretical terms does not deny this semantical thesis. He calls his distinction “pragmatic”, because he makes it entirely relative to the discovery system. On this distinction, variables that have their associated numeric values assigned before the system is run are considered to be observational variables, while those that receive their values by the operation of the discovery system are considered to be theoretical variables. Thus Langley considers all the values that the BACON programs create by multiplication or division, in finding products or ratios, to be theoretical terms. And Simon accordingly considers the values for nominal variables that are postulated intrinsic properties to be theoretical terms. Simon also states that in any given inquiry we can treat as observable any term whose values are obtained from an instrument that is not itself problematic in the context of that inquiry. This definition is compatible with the contemporary pragmatist sense, in which observation language is merely test-design language given particular logical quantification.
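The system-relative character of this distinction can be made concrete with a small sketch. It is a deliberately simplified illustration and not Langley's implementation: the helper names (direction, define_term, is_constant), the tolerance, and the pressure and volume figures are assumptions chosen for the example. The only point it is meant to show is that a variable is "observational" because its values are supplied before the run, and "theoretical" because its values are computed during the run as a product or ratio.

```python
from typing import Dict, List, Optional, Tuple


def direction(xs: List[float], ys: List[float]) -> int:
    """Return +1 if ys rises as xs rises, -1 if ys falls, 0 otherwise."""
    pairs = sorted(zip(xs, ys))
    diffs = [later[1] - earlier[1] for earlier, later in zip(pairs, pairs[1:])]
    if diffs and all(d > 0 for d in diffs):
        return 1
    if diffs and all(d < 0 for d in diffs):
        return -1
    return 0


def define_term(
    data: Dict[str, List[float]], x: str, y: str
) -> Optional[Tuple[str, List[float]]]:
    """Create a new term: a ratio if x and y rise together, a product if not."""
    trend = direction(data[x], data[y])
    if trend > 0:
        return f"{x}/{y}", [a / b for a, b in zip(data[x], data[y])]
    if trend < 0:
        return f"{x}*{y}", [a * b for a, b in zip(data[x], data[y])]
    return None


def is_constant(values: List[float], tol: float = 0.01) -> bool:
    """True if every value lies within a relative tolerance of the mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tol * abs(mean) for v in values)


# Values assigned BEFORE the run: observational relative to this system.
observed = {
    "pressure": [1.0, 2.0, 4.0, 8.0],  # illustrative numbers only
    "volume": [8.0, 4.0, 2.0, 1.0],
}
status = {name: "observational" for name in observed}

# Values created BY the run: theoretical relative to this system.
new_term = define_term(observed, "pressure", "volume")
if new_term is not None:
    term_name, term_values = new_term
    status[term_name] = "theoretical"
    law_found = is_constant(term_values)

print(status)
print(term_name, term_values, law_found)
```

With these inputs the sketch defines the term pressure*volume, finds its values constant, and labels it theoretical, while pressure and volume remain observational. If the same product were supplied as input to a later run, it would count as observational relative to that run.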
Unfortunately Simon does not follow through with this pragmatic relativizing of semantics to problem-solving discovery systems, but reverts to the positivist concept of explanation. In his exposition of DALTON, which creates structural theories, Simon comments that as an area in science matures, its researchers progress from “descriptions” to “explanations”, and he cites Hempel’s Aspects of Scientific Explanation and Other Essays (1965). Examples of explanations cited by Simon are the kinetic theory of heat, which provides an explanation of both Black’s law and the ideal gas law, and Dalton’s atomic theory, which provides explanations for the law of multiple proportions and Gay-Lussac’s law of combining volumes. He notes that these examples involve a structural model in which macroscopic phenomena are described in terms of their inferred component atoms.
Simon contrasts explanation with the purely phenomenological and descriptive analyses carried out by BACON.4, when it rediscovered the concepts of molecular and atomic weight, and assigned correct weights to many substances in its wholly measurement-data-driven manner. He affirms that BACON.4’s analyses involved no appeal to a particulate model of chemical elements and compounds, and that what took the place of the atomic model were the heuristics that searched for small integer ratios among corresponding properties of substances. Simon’s concept of explanation is a reversion to the three-level hypothetico-deductive concept of explanation, in which theories are said to “explain” empirical laws deductively, and the empirical laws in turn deductively explain observation reports of particular events. In this view theories and empirical generalizations are distinguished by their semantics.
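A heuristic that searches for small integer ratios, with no appeal to a particulate model, can be sketched as follows. This is not BACON.4's actual code: the function name small_integer_ratio, the integer bound, and the tolerance are assumptions introduced for illustration. The sample figures are the approximate masses of oxygen that combine with one gram of carbon in the two oxides of carbon.

```python
from math import gcd
from typing import Optional, Tuple


def small_integer_ratio(
    a: float, b: float, max_int: int = 6, tol: float = 0.02
) -> Optional[Tuple[int, int]]:
    """Find coprime integers (p, q), each at most max_int, with a/b close to p/q.

    Both measurements are assumed to be positive. The closest ratio within
    the relative tolerance is returned, or None when no small ratio fits.
    """
    target = a / b
    best: Optional[Tuple[float, int, int]] = None
    for q in range(1, max_int + 1):
        for p in range(1, max_int + 1):
            if gcd(p, q) != 1:
                continue  # skip unreduced duplicates such as 4/2
            error = abs(target - p / q) / target
            if error <= tol and (best is None or error < best[0]):
                best = (error, p, q)
    if best is None:
        return None
    return best[1], best[2]


# Grams of oxygen per gram of carbon (approximate values).
oxygen_in_dioxide = 2.667
oxygen_in_monoxide = 1.333

print(small_integer_ratio(oxygen_in_dioxide, oxygen_in_monoxide))  # (2, 1)
print(small_integer_ratio(1.0, 1.42))  # None: no small-integer ratio fits
```

The sketch reports the 2:1 regularity purely from the measurements, which is the sense in which such an analysis is descriptive: nothing in the code refers to atoms or molecules.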
On the pragmatist view theory and empirical description are not distinguished semantically, but are distinguished pragmatically by their use in basic-science research. Theory is what is proposed for empirical testing, and description in test design is what is presumed for testing. Explanation employs language that was theory but has been made into law after being empirically tested and not falsified. One who speaks of “theoretical explanation” is thus merely speaking of a proposed explanation, which is an antilogous concept. The pragmatist concept is a functional or research-science view of the language of science instead of the positivist catalogue-science view. Thus given that the discovery systems are problem-solving systems, to define “theory” and “explanation” relative to the discovery system is to define them consistently with the pragmatist philosophy.
In addition to the physical theories that the discovery systems rediscovered, consideration might also be given to the behavioral and social theories that Simon and his colleagues had not attempted to address with their discovery systems. Why did this Nobel-laureate economist never attempt to construct an economic theory with a discovery system? Perhaps one might ask instead: is Simon actually a romantic in his philosophy of social science? One possible answer is that no economic theory embodying his thesis of bounded rationality lends itself to creation by any discovery system like those that he or his colleagues have yet designed. So, there is irony here. Simon’s discovery systems are purportedly explorations in cognitive psychology with his heuristic-search system design exhibiting his thesis of bounded rationality. But the subjects to which his heuristic-search system design should be applicable cannot be the romantic’s subjective mental states such as motives and values or the bounded-rational deliberative processes of human subjects.
Simon’s view of scientific criticism is based on his theory of heuristics and discovery systems. Philosophers of science such as Hanson, whose interests were focused on the topic of scientific discovery, found that the positivist separation of the “context of discovery” and the “context of justification” fails to recognize the interdependence between these two functions in scientific research. Simon also notes this interaction between discovery and justification in Scientific Discovery, because it is integral to the heuristic procedure in his discovery system designs. His principal thesis of problem solving is that the availability of evaluative tests during the successive stages of the discovery process carried out by the heuristics is a major source of the efficiency of the discovery methods. The product of each step of a search is evaluated in terms of the evidence it has produced, and the search process is modified on the basis of the outcome of these evaluations. Yet Simon does not fail to see the need for predictive testing by observation or experiment of the hypotheses generated by the discovery systems, which only find patterns in limited available data.