BOOK VIII - Page 2
 
  HERBERT SIMON, PAUL THAGARD AND OTHERS ON
DISCOVERY SYSTEMS
 
 

 

Thagard on Discovery by Analogy and Systems ACME and ARCS

In Conceptual Revolutions Thagard distinguishes three types or methods of scientific discovery.  They are: 1) data-driven discovery by simple abduction to make empirical generalizations from observations and experimental results, 2) explanation-driven discovery using existential abduction and rule abduction to form theories referencing theoretical entities, and 3) coherence-driven discovery by making new theories due to the need to overcome internal contradictions in existing theories.  To date Thagard has offered no discovery system that creates new theories by the coherence-driven method, but the other two methods have been implemented in his cognitive systems.
Consider firstly generalization.  The central activity of the artificial-intelligence system PI is problem solving with the goal of creating explanations.  The system represents knowledge by concepts and rules: nodes in a network represent the concepts, and the rules linking the nodes represent propositions.  Generalization is the formation of general statements, such as those having the simple form “All X are Y.”  PI implements the creation of such rules by empirical generalization, taking into account both the number of instances supporting a generalization and the background knowledge of the variety in the kinds of instances involved.
             Consider next abduction.  By “abduction” Thagard means inference to a hypothesis that offers a possible explanation of some puzzling phenomenon.  The PI system contains three complex data structures or data-types in named LISP property lists, which are called “messages”, “concepts”, and “rules.”  The messages data-type represents particular results of observations and inferences.  The concept data-type locates a concept in a hierarchical network of kinds and subkinds.  The concepts manage storage for abductive problem solving.  The rules data-type represents laws in an “if…then” form, and also contains a measure of strength.  The system fires rules that lead from the set of starting conditions to the goal of explanation.  Four types of abductive inference accomplish this goal: (1) Simple abduction, which produces hypotheses about individual objects; these hypotheses are laws or empirical generalizations. (2) Existential abduction, which postulates the existence of formerly unknown objects; this type results in theoretical terms referencing theoretical entities, which is discussed in the previous section above.  (3) Rule-forming abduction, which produces rules that explain other rules; these rules are theories that explain laws. Since Thagard retains a version of the doctrine of theoretical terms referencing theoretical entities, he advocates the Positivists’ traditional three-layered view of the structure of scientific knowledge consisting of (a) observations expressed in statements of evidence, (b) laws based on generalization from the observations, and (c) theories, which explain the laws. (4) Analogical abduction, which uses past cases of hypothesis formation to generate hypotheses similar to existing ones.
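The three data-types and the rule firing described above can be pictured with a minimal sketch.  PI itself was written in LISP using property lists; the Python class names, their fields, and the helper functions `fire` and `simple_abduction` below are hypothetical illustrations and are not taken from Thagard's code.  The sketch covers only simple abduction about an individual object, and the example rules about sound are invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """A particular result of observation or inference, e.g. reflects(sound)."""
    predicate: str
    subject: str


@dataclass
class Rule:
    """A law in 'if ... then' form, carrying a measure of strength."""
    condition: str   # predicate that the rule tests for
    action: str      # predicate that the rule concludes
    strength: float


@dataclass
class Concept:
    """A node in the hierarchy of kinds and subkinds; it stores attached rules."""
    name: str
    superordinates: list = field(default_factory=list)
    rules: list = field(default_factory=list)


def fire(rules, messages):
    """Apply every matching rule repeatedly until nothing new is inferred."""
    known = set(messages)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for msg in list(known):
                if msg.predicate == rule.condition:
                    new = Message(rule.action, msg.subject)
                    if new not in known:
                        known.add(new)
                        changed = True
    return known


def simple_abduction(rules, puzzling):
    """Hypothesize the condition of each rule that would explain the puzzling message.

    Candidate hypotheses are returned strongest rule first.
    """
    candidates = sorted(rules, key=lambda r: -r.strength)
    return [Message(r.condition, puzzling.subject)
            for r in candidates if r.action == puzzling.predicate]


# Hypothetical rules attached to a hypothetical concept.
wave = Concept("wave", rules=[Rule("is_wave", "reflects", 0.8)])
particle = Concept("particle", rules=[Rule("is_particle", "reflects", 0.4)])
all_rules = wave.rules + particle.rules

hypotheses = simple_abduction(all_rules, Message("reflects", "sound"))
inferred = fire(all_rules, [Message("is_wave", "sound")])
```

Here `hypotheses` lists the candidate explanations of the puzzling fact that sound reflects, and `inferred` shows that the first candidate, once assumed, leads back by rule firing to the fact to be explained.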
Consider specifically analogy.  This topic is treated at length in Thagard’s Mental Leaps: Analogy in Creative Thought (1995) co-authored with Holyoak.  In this book the authors propose a general theory of analogical thinking, which they illustrate in a variety of applications drawn from a wide spectrum.  Thagard states that analogy is a kind of nondeductive logic, which he calls “analogic.”  Analogic contains two poles, as it were.  They are firstly the “source analogue”, which is the known domain that the investigator already understands in terms of familiar patterns, and secondly the “target analogue”, which is the unfamiliar domain that the investigator is trying to understand.  Analogic then consists of the way the investigator uses analogy to try to understand the targeted domain by seeing it in terms of the source domain, and it involves a mental leap, because the two analogues may initially seem unrelated, but the act of making the analogy creates new connections between them.  Thagard calls his theory of analogy the “multiconstraint theory”, because he identifies three regulating constraints: (1) similarity, (2) structure, and (3) purpose.  Firstly the analogy is guided by a direct similarity between the elements involved.  Secondly it is guided by proposed structural parallels between the roles in the source and target domains.  And thirdly the exploration of the analogy is guided by the investigator’s goals, which provide the purpose for considering the analogy.  Thagard lists four purposes of analogies in science.  They are (1) discovery, (2) development, (3) evaluation, and (4) exposition.  Discovery is the formulation of a new hypothesis.  Development is the theoretical elaboration of the hypothesis.  Evaluation is the giving of arguments for its acceptance.  And exposition is the communication of new ideas by comparing them to the old ones.  
He notes that some would keep evaluation free of analogy, but he maintains that to do so would contravene the practice of several historic scientists.  Each of the three regulating constraints - similarity, structure, and purpose - is operative in four steps that Thagard distinguishes in the process of analogic: (1) selecting, (2) mapping, (3) evaluating, and (4) learning.  Firstly the investigator selects a source analogue, often from memory.  Secondly he maps the source to the target to generate inferences about the target.  Thirdly he evaluates and adapts these inferences to take account of unique aspects of the target.  And finally he learns something more general from the success or failure of the analogy.
Thagard notes two computational approaches for the mechanization of analogic: the “symbolic” approach and the “connectionist” approach.  The symbolic systems represent explicit knowledge, while the connectionist systems can only represent knowledge implicitly as the strengths of weights associated with connected links of neuron-like units in networks.  Thagard says that his multiconstraint theory of analogy is implemented computationally as a kind of hybrid combining symbolic representations of explicit knowledge with connectionist processing.  Thagard and Holyoak have developed two analogic systems: ACME (Analogical Constraint Mapping Engine) and more recently ARCS (Analog Retrieval by Constraint Satisfaction).  Reflecting in 1987 on the interpretation of the Necker cube, a kind of ambiguous drawing, Holyoak and Thagard worked together to develop a procedure whereby a network could be used to perform analogical mapping by simultaneously satisfying the three constraints.  Their result was the ACME system.  This system mechanizes the mapping problem.  It creates a network when given the source and target analogues, and a simple algorithm updates the activation of each unit in parallel, to determine which mapping hypothesis should be accepted.  ARCS deals with the more difficult problem of retrieving an interesting and useful source analogue from memory in response to a novel target analogue, and it must do so without having to consider every potential source analogue in the memory.  The capability of matching a given structure to those stored in memory that have semantic overlap with it is facilitated by information from WORDNET, an electronic thesaurus in which a large part of the English language is encoded.  The output from ARCS is then passed to ACME for mapping. 

Thagard on Criticism by “Explanatory Coherence”

Thagard’s theory of explanatory coherence, set forth in detail in his Conceptual Revolutions, describes the mechanisms whereby scientists choose to abandon an old theory with its conceptual system, and accept a new one.  He sets forth a set of principles that enable the assessment of the global coherence of an explanatory system.  Local “coherence” is a relation between two propositions.  The term “incohere” means more than just that two propositions do not cohere; it means that they resist holding together.  The terms “explanatory” and “analogous” are primitive terms in the theory, and the following principles, paraphrased and summarized below, define the meaning of “coherence” and “incoherence”:

 Symmetry.

If propositions P and Q cohere or incohere, then Q and P cohere or incohere respectively.

Coherence.

The global explanatory coherence of a system of propositions depends on the pairwise local coherence of the propositions in the system.

Explanation.

If a set of explanatory propositions explains proposition Q, then the explanatory propositions in the set cohere with Q, and each of the explanatory propositions coheres with the others. 

Analogy.

If P1 explains Q1, P2 explains Q2, and if the P’s are analogous to each other and the Q’s are analogous to each other, then the P’s cohere with each other, and the Q’s cohere with each other.

Data Priority.

Propositions describing the results of observation are evidence propositions having independent acceptability.

Contradiction.

Mutually contradictory propositions incohere.

Competition.

Two propositions incohere if both explain the same evidence proposition and are not themselves explanatorily connected. 

Acceptability.

The acceptability of a proposition in a system of propositions depends on its coherence with those propositions.  Furthermore the acceptability of a proposition that explains a set of evidence propositions is greater than the acceptability of a proposition that explains only a subset of that set.

Thagard’s theory of explanatory coherence is implemented in a computer system written in the LISP computer language that applies connectionist algorithms to a network of units.  The system is called ECHO (Explanatory Coherence by Harmony Optimization).  Although Thagard mentions a coherence-driven discovery method, his ECHO system is not a discovery system.  Before execution the operator of the system inputs the propositions for the conceptual systems considered by the system, and also inputs instructions identifying which hypothesis propositions explain which other propositions, and which propositions are observation reports and have evidence status.  In ECHO each proposition has an associated activation value, and each link between propositions has an associated weight value.  A positive activation value represents a degree of acceptance of the hypothesis or evidence statement, and a negative value the degree of rejection.  The weight value represents the explanatory strength of the link between the propositions.  When one of the principles of explanatory coherence in the above list says that one proposition coheres with another, an excitatory link is established between the two propositions in the computer network.  And when one of the principles says that two incohere, an inhibitory link is established.  In summary in the ECHO system network: (1) A proposition is a unit in the network.  (2) Coherence is an excitatory link between units having a positive weight, and incoherence is an inhibitory link having a negative weight.  (3) Data priority is an excitatory link from a special evidence unit.  (4) Acceptability of a proposition is activation.  Prior to execution the operator has choices of parameter values that he inputs, which influence the system’s output.  
One of these is the “tolerance” of the system for alternative competing theories, which is measured by the absolute value of the ratio of excitatory weights to inhibitory weights.  If the tolerance parameter is low, winning hypotheses will deactivate losers, and only the most coherent will be output.
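The translation of the operator's input into excitatory and inhibitory links can be sketched as follows.  This is a minimal illustration in Python and not ECHO's LISP code: the function name `build_network`, the unit name "SPECIAL", the three numeric parameter values, and the propositions H1, H2, E1 and E2 are all assumptions made for the example.  Dividing the excitatory weight among cohypotheses is likewise an assumption of this sketch.

```python
from collections import defaultdict
from itertools import combinations

EXCIT = 0.04        # assumed weight for a coherence (excitatory) link
INHIB = -0.06       # assumed weight for an incoherence (inhibitory) link
DATA_EXCIT = 0.05   # assumed weight of the link from the special evidence unit


def build_network(explanations, contradictions, evidence):
    """Return a symmetric weight table of the form {unit: {neighbor: weight}}.

    explanations   -- list of (hypotheses, explained_proposition) pairs
    contradictions -- list of (p, q) pairs that incohere
    evidence       -- propositions having observation status
    """
    weights = defaultdict(lambda: defaultdict(float))

    def link(p, q, w):
        # Symmetry principle: every link is stored in both directions.
        weights[p][q] += w
        weights[q][p] += w

    for hypotheses, explained in explanations:
        # Assumption: the more cohypotheses an explanation needs, the weaker its links.
        w = EXCIT / len(hypotheses)
        for h in hypotheses:
            link(h, explained, w)          # hypothesis coheres with what it explains
        for h1, h2 in combinations(hypotheses, 2):
            link(h1, h2, w)                # cohypotheses cohere with one another

    for p, q in contradictions:
        link(p, q, INHIB)                  # contradictory propositions incohere

    for e in evidence:
        link("SPECIAL", e, DATA_EXCIT)     # data priority
    return weights


# Tolerance as described in the text: |excitation / inhibition|.
tolerance = abs(EXCIT / INHIB)

net = build_network(
    explanations=[(["H1"], "E1"), (["H1"], "E2"), (["H2"], "E1")],
    contradictions=[("H1", "H2")],
    evidence=["E1", "E2"],
)
```

In this toy input hypothesis H1 explains both evidence propositions while its rival H2 explains only one, so H1 receives two excitatory links from the evidence and H2 receives one.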
             When ECHO runs, activation spreads from the special evidence unit to the data represented by evidence propositions, and then to the explanatory hypotheses, preferring those that firstly explain a greater breadth of the evidence than their competitors, and secondly explain with fewer propositions, i.e. are simpler. But the system prefers unified theories to those that explain evidence with special ad hoc hypotheses for each evidence statement explained. Thagard says that by preferring theories that explain more hypotheses, the system demonstrates the kind of conservatism seen in human scientists when selecting theories.  And he says that like human scientists ECHO rejects Popper’s falsificationism, because ECHO does not give up a promising theory just because it has empirical problems, but rather makes rejection a matter of choosing among competing theories.  However, whether scientists reject theories in isolation depends on how one individuates theories, and Thagard offers no criterion for individuating theories.  If theories are individuated semantically, then when a theory makes inaccurate predictions, the response by scientists is to change the theory, thereby ipso facto creating a new theory regardless of whether there are alternatives.  But ECHO is not a discovery system, and therefore is not designed to make this kind of response.  And thirdly the system prefers explanations that are analogous to other previously successful explanations.  In his Computational Philosophy of Science he notes that many philosophers of science would argue that analogy is at best relevant to the discovery of theories but has no bearing on their justification.  But he maintains that the historical record, such as Darwin’s defense of natural selection, shows the need to include analogy as one of the criteria for the best explanation among competing hypotheses.  
In summary, therefore, other things being equal activation accrues to units corresponding to hypotheses that: (1) explain more evidence, (2) provide simpler explanations, or (3) are analogous to other explanatory hypotheses.  These three criteria are also operative in his earlier PI system, where breadth is called “consilience.”  During execution the system proceeds through a series of iterations adjusting the activation levels, in order to maximize the coherence of the entire system of propositions.  Thagard calls the network “holistic” in the sense that the activation of every unit can potentially have an effect on every other unit linked to it by a path, however lengthy.  Usually not more than one hundred cycles are needed to achieve stable optimization.  The maximized coherence value is calculated as the sum, over all links, of each weight value multiplied by the activation values of the two propositions that the link connects. 
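The iterative settling and the coherence sum can be sketched in Python as follows.  This is an illustration under stated assumptions and not ECHO's LISP code: the update rule is a standard connectionist rule with decay, the parameter values and the initial activation of 0.01 are assumed, the small network of propositions H1, H2, E1 and E2 is invented, and in this sketch the weights stay fixed while only the activations change.

```python
DECAY = 0.05              # assumed decay rate per cycle
MAX_A, MIN_A = 1.0, -1.0  # assumed activation bounds


def settle(weights, special="SPECIAL", cycles=100):
    """Update all activations in parallel for a fixed number of cycles.

    weights -- symmetric table of the form {unit: {neighbor: weight}}
    """
    act = {u: 0.01 for u in weights}   # assumed small initial activation
    act[special] = 1.0                 # the evidence source is clamped at 1
    for _ in range(cycles):
        new = {}
        for u, links in weights.items():
            if u == special:
                new[u] = 1.0
                continue
            net = sum(w * act[v] for v, w in links.items())
            if net > 0:
                a = act[u] * (1 - DECAY) + net * (MAX_A - act[u])
            else:
                a = act[u] * (1 - DECAY) + net * (act[u] - MIN_A)
            new[u] = max(MIN_A, min(MAX_A, a))
        act = new                      # parallel update: swap in all at once
    return act


def harmony(weights, act):
    """Sum of weight * activation * activation over every linked pair."""
    total = sum(w * act[u] * act[v]
                for u, links in weights.items()
                for v, w in links.items())
    return total / 2                   # each pair is stored in both directions


# H1 explains E1 and E2, H2 explains only E1, and H1 contradicts H2.
weights = {
    "SPECIAL": {"E1": 0.05, "E2": 0.05},
    "E1": {"SPECIAL": 0.05, "H1": 0.04, "H2": 0.04},
    "E2": {"SPECIAL": 0.05, "H1": 0.04},
    "H1": {"E1": 0.04, "E2": 0.04, "H2": -0.06},
    "H2": {"E1": 0.04, "H1": -0.06},
}
act = settle(weights)
coherence = harmony(weights, act)
```

After settling, the broader hypothesis H1 ends with a higher activation than its rival H2, which illustrates the preference for explanatory breadth.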
             Thagard has applied system ECHO to several revolutionary episodes in the history of science.  These include: (1) Lavoisier’s oxygen theory of combustion, (2) Darwin’s theory of the evolution of species, (3) Copernicus’ heliocentric astronomical theory of the planets, (4) Newton’s theory of gravitation, and (5) Hess’ geological theory of plate tectonics.  In reviewing his historical simulations Thagard reports that the criterion in ECHO having the largest contribution to explanatory coherence in scientific revolutions is explanatory breadth – the preference for the theory that explains more evidence than its competitors – as opposed to the other criteria of simplicity and analogy.
ECHO seems best suited either to evaluate nonmathematically expressed alternative theories, or to evaluate mathematically expressed alternative theories in only certain circumstances.  Scientists like to quantify phenomena, so that they can compare the prediction errors in their theories net of the estimated measurement error.  They estimate measurement error by repetition of the measurement procedures, and they reduce it by improvement in their experimental designs.  It is in cases of empirical indeterminacy that considerations such as breadth, simplicity, and analogous similarity may operate in the scientists’ preferences among mathematically expressed theories.  Those are cases of nonfalsified theories having prediction errors that are large relative to measurement error, yet small relative to the deviations between the alternative theories’ prediction errors, such that the measurement error makes the theories empirically indistinguishable.

Thagard on Explanation and the Aim of Science

             Thagard’s views on the three levels of explanation were mentioned above, but he has also made some other statements that warrant mention.  In Conceptual Revolutions he distinguishes six different approaches to the topic of scientific explanation in the philosophy of science literature, the first five of which he finds are also discussed in the artificial-intelligence literature.  The six types are: (1) deductive, (2) statistical, (3) schematic - which uses organized patterns, (4) analogical, (5) causal – which he opposes to specious correlation, and (6) linguistic/pragmatic.  For the last he finds no correlative in the artificial-intelligence literature.  Thagard says that he views these approaches as different aspects of explanation, and that what is needed is a theory of explanation that integrates all these aspects. He says that in artificial intelligence such integration is called a cognitive architecture, by which is meant a general specification of the fundamental operations of thinking, and he references Simon’s General Problem Solver agenda.  He adds that some of these approaches may operate as subprocesses in the complex process of explanation.
             The topic of the aim of science has special relevance to Thagard’s philosophy, since he defines computational philosophy of science as normative cognitive psychology.  Thagard’s discussions of his theory of inference to the best explanation implemented in his system PI set forth in Computational Philosophy of Science and his later statement as the theory of optimized explanatory coherence implemented in his system ECHO set forth in Conceptual Revolutions, reveal much of his view on the aim of science.  His statement of the aim of science might be expressed as follows: to develop hypotheses with maximum explanatory coherence including coherence with statements reporting available empirical findings. He notes that no rule relating concepts in a conceptual system will be true in isolation, but he maintains that the rules taken together as a whole in a conceptual system constituting an optimally coherent theory provide a set of true descriptions.  In Computational Philosophy of Science Thagard states that his theory of explanatory coherence is compatible with both realist and nonrealist philosophies.  But he maintains that science aims not only to explain and predict phenomena, but furthermore to describe the world as it really is, and he explicitly advocates the philosophical thesis of scientific realism, which he defines as the thesis that science in general leads to truth.  Thagard’s concept of “scientific realism” seems acceptable as far as it goes, but it does not go far enough.  The meaning of “scientific realism” in the contemporary Pragmatist philosophy of science is based upon the subordination of ontological claims to empirical criteria in science, a subordination that is due to the recognition of ontological relativity.

Herbert Simon and Logic Theorist

          Herbert Simon (1916-2001), born in Milwaukee, Wisconsin, entered the University of Chicago in 1933 where he received a BA degree in 1936 and a Ph.D. in political science in 1942.  He was awarded the Nobel Memorial prize for economics in 1978.  He spent his career as a faculty member at Carnegie-Mellon University in Pittsburgh, most of it in the Graduate School of Industrial Administration, and later as a faculty member in both the Psychology and Computer Science Departments and also as a member of the University's board of trustees.  His autobiography, Models of My Life, was published in 1991.
In his autobiography he reports that the most important years of his life were 1955 and 1956, when his interest turned from administration and economics to the psychology of human problem solving, and specifically to considering the symbolic processes that people use in thinking.  He and his long-time collaborator, Allen Newell, had concluded that computers could be applied generally to imitating intelligence symbolically, instead of just numerically, an insight that Simon says is a crucial step required for genuine artificial intelligence to emerge.  In 1956 his first artificial-intelligence system named LOGIC THEORIST used his “heuristic search” methods to develop deductive logic proofs of the theorems in Whitehead and Russell's Principia Mathematica, the seminal text for the Russellian symbolic logic.  However, the fact that this system found proofs in formal logic is purely incidental; Simon rejects the view held by some artificial-intelligence advocates, that formal logic is the appropriate language for artificial-intelligence systems and that problem solving is merely a process of proving theorems.  The significance of LOGIC THEORIST is its use of the authors’ “heuristic search” methods and of symbol manipulation.  Simon defines artificial intelligence as symbolic processing, and he defines cognitive psychology as understanding human thinking by modeling ordinary problem solving with artificial-intelligence systems.

 

 
