INTRODUCTION TO PHILOSOPHY OF SCIENCE
Book I
4.14 Scientific Criticism
Criticism pertains to the criterion for the acceptance or rejection of theories. The only criterion for scientific criticism that is acknowledged by the contemporary realistic neopragmatist is the empirical criterion.
The philosophical literature on scientific criticism has little to say about the specifics of experimental design, as might be found in various college-level science laboratory manuals. Most often philosophical discussion of criticism pertains to the criteria for acceptance or rejection of theories and more recently to the effective decidability of empirical testing that has been called into question due to the wholistic semantical thesis.
In earlier times when the natural sciences were called “natural philosophy” and social sciences were called “moral philosophy”, nonempirical considerations operated as criteria for the criticism and acceptance of descriptive narratives. Even today some philosophers and scientists have used their semantical and ontological preconceptions as criteria for the criticism of theories including preconceptions about causality or specific causal factors. Such semantical and ontological preconceptions have misled them to reject new empirically superior theories.
What historically has separated the empirical sciences from their origins in natural and moral philosophy is the empirical criterion. This criterion is responsible for the advancement of science and for its enabling practicality in application. Whenever in the history of science there has been a conflict between the empirical criterion and any nonempirical criteria for the evaluation of new theories, it is ultimately the empirical criterion that decides theory selection. Contemporary realistic neopragmatists accept relativized semantics, scientific realism, and ontological relativity, and they therefore reject all prior semantical or ontological criteria for scientific criticism including the romantics’ mentalistic ontology requiring social-psychological or any other kind of reductionism.
4.15 Logic of Empirical Testing
Different sciences often have different surface structures, which may involve complex mathematics. But the syntactical transformation of the surface structure of a theory into the nontruth-functional hypothetical-conditional logical form is the philosopher’s heuristic enabling a rational reconstruction that produces the deep structure of the theory and explicitly displays the contingency of the empirical test and its logic.
The deep structure of the language of an empirical test exhibits:
(1) an effective decision procedure that can be expressed as a modus tollens logical deduction from a set of one or several universally quantified theory statements expressed in a nontruth-functional hypothetical-conditional form
(2) together with a particularly quantified antecedent description of the test protocols and the initial test conditions as defined in the test design
(3) that jointly conclude to a consequent particularly quantified description of a produced (predicted) test-outcome event
(4) that is compared with the observed test-outcome description.
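The four-step schema above can be rendered as a small illustrative sketch. In this Python sketch the functional form, the numeric values and all names are hypothetical stand-ins introduced only for illustration; they are not part of the philosophical analysis.

```python
# Illustrative sketch of the deep structure of an empirical test.
# The universally quantified theory "For every A if A, then C" is
# represented as a function from realized initial conditions to a
# predicted outcome; the dependency chosen here is a hypothetical stand-in.

def theory(initial_conditions):
    """(1) Universally quantified theory: maps conditions 'A' to prediction 'C'."""
    return 2.0 * initial_conditions  # assumed stand-in functional dependency

# (2) Particularly quantified antecedent: one realized test execution.
realized_A = 10.0

# (3) The jointly concluded consequent: the predicted test-outcome event.
predicted_C = theory(realized_A)

# (4) Comparison with the observed test-outcome description "O",
# within an agreed range of vagueness and measurement error.
observed_O = 20.3
tolerance = 0.5

falsified = abs(predicted_C - observed_O) > tolerance
print(falsified)
```

Here the modus tollens verdict for a single execution is just the comparison in the last step; universal quantification corresponds to the claim that every correctly executed repetition would pass the same comparison.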
In order to express explicitly the dependency of the produced effect upon the realized initial conditions in an empirical test, the universally quantified theory statements can be syntactically transformed into a nontruth-functional hypothetical-conditional deep structure, i.e., as a statement with the logical form “For every A if A, then C.”
This nontruth-functional hypothetical-conditional schema “For every A if A, then C” represents a system of one or several universally quantified and typically interdependent theory statements or equations that describe a dependency of the occurrence of events described by “C” upon the occurrence of events described by “A”. In some cases the dependency is expressed as a bounded stochastic density function for the values of predicted probabilities. For advocates who believe in the theory, the nontruth-functional hypothetical-conditional schema is the theory-language context that contributes meaning parts to the complex semantics of the theory’s constituent descriptive terms, including notably the terms common to the theory and test design. But if the test is to be independent of the theory, the theory’s semantical contribution cannot be operative in the test, since the test outcome is not true by definition; it is empirically contingent, and the test-design terms must remain vague with respect to the theory.
The antecedent “A” also includes the set of universally quantified statements of test design that describe the initial conditions that must be realized for execution of an empirical test of the theory including the protocol statements describing the measurement and setup procedures needed for their realization. These statements constituting “A” are always presumed to be true or the test design is rejected as invalid, as is any test made with it. The test-design statements are semantical rules that contribute meaning parts to the complex semantics of the terms common to theory and test design, and do so independently of the theory’s semantical contributions. The universal logical quantification indicates that any execution of the test is but one of an indefinitely large number of possible test executions, whether or not the test is repeatable at will.
When the test is executed, the logical quantification of “A” is changed from universal to particular quantification to describe the realized initial conditions in the individual test execution. And the particular quantification of “A” makes the nontruth-functional hypothetical-conditional statement also particularly quantified, to make a prediction or to describe a produced effect. When the universally quantified test-design and test-outcome statements have their logical quantification changed to particular quantification, the belief status and thus the definitional rôle of the universally quantified test-design statements confer upon their particularly quantified versions the status of “fact” for all who have decided to accept the test design. Nietzsche (1844-1900) said that there are no facts; there are only interpretations. Hickey says that due to relativized semantics with its empirical underdetermination and due to ontological relativity with its consequent referential inscrutability, all facts are interpretations of reality. Failure to recognize the interpreted character of facts is to indulge in what Wilfrid Sellars (1912-1989) in his Science, Perception and Reality (1963) called “the myth of the given”, a phrase earlier used by Dewey.
The theory statement need only say “For every A if A, then C”. The nontruth-functional hypothetical-conditional statement expressing a theory need not say “For every A and for every C if A, then C” (or “For every A if and only if A, then C”, i.e., “For every A, iff A, then C”) unless the nontruth-functional hypothetical-conditional statement is convertible, i.e., a biconditional statement also saying “For every C if C, then A”. The uniconditional “For every A if A, then C” is definitive of functional relations in mathematically expressed theories. In other words the nontruth-functional hypothetical-conditional statement of theory need only express a sufficient condition for the correct prediction made in “C” upon realization of the test conditions described in “A”, and not a necessary condition. This may occur when scientific pluralism (see below, Section 4.20) occasions multiple theories proposing alternative causal factors for the same outcome predicted correctly in “C”, or when there are equivalent measurement procedures or instruments described in “A” that produce alternative measurements, each having values falling within the range of the others’ measurement errors.
The theory statements in the nontruth-functional hypothetical-conditional deep structure are also given particular quantification for the test execution. In a mathematically expressed theory the test execution consists in measurement actions and assignment of the resulting measurement values to the variables in “A”. In a mathematically expressed single-equation theory, “A” includes the independent variables in the equation of the theory and in the test procedure. In a multi-equation system, whether recursively structured or simultaneous, all the exogenous variables are assigned values by measurement and are included in “A”. In longitudinal models with dated variables “A” must also include the lagged values of endogenous variables that are the initial conditions for a test and that initiate the recursion through successive iterations to generate predictions.
The consequent “C” represents the set of universally quantified statements of the theory that predict the outcome of every correct execution of a test design. The nontruth-functional hypothetical-conditional’s logical quantification is changed from universal to particular quantification to describe the predicted outcome for the individual test execution. In a mathematically expressed single-equation theory the dependent variable of the theory’s equation is in “C”. When no value is assigned to any variable, the equation is universally quantified. When the predicted value of a dependent variable is calculated from the measurement values of the independent variables, the equation has been particularly quantified. In a multi-equation theory, whether recursively structured or a simultaneous-equation system, the solution values for all the endogenous variables are included in “C”. In longitudinal models with dated variables “C” includes the current-dated values of endogenous variables for each iteration of the model, which are calculated by solving the model through successive iterations.
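As a hedged illustration of the longitudinal case, the following Python sketch uses an assumed single-equation model with hypothetical coefficients and measurements. It shows how a lagged endogenous value belonging to “A” initiates the recursion, and how the successive solution values belong to “C”.

```python
# Illustrative sketch (hypothetical coefficients and data): a dated
# single-equation longitudinal model y(t) = a*y(t-1) + b*x(t), solved
# through successive iterations. The lagged endogenous value y(0) and
# the measured exogenous series x belong to the antecedent "A"; the
# generated current-dated values y(1), y(2), ... belong to "C".

a, b = 0.5, 1.0                 # assumed structural coefficients
y_lagged = 2.0                  # measured initial condition (lagged endogenous)
x = [1.0, 1.5, 2.0]             # measured exogenous variables, one per period

predictions = []                # current-dated endogenous values for "C"
for x_t in x:
    y_t = a * y_lagged + b * x_t
    predictions.append(y_t)
    y_lagged = y_t              # this period's solution feeds the next iteration

print(predictions)
```

Each pass through the loop is one iteration of the model, and the previous period's solution value becomes the lagged initial condition for the next, exactly the recursion described above.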
Let another particularly quantified statement denoted “O” describe the observed test outcome of an individual test execution. The report of the test outcome “O” shares vocabulary with the prediction statements in “C”. But the semantics of the terms in “O” is determined exclusively by the universally quantified test-design statements rather than by the statements of the theory, and thus for the test its semantics is independent of the theory’s semantical contribution and vague about the theory’s content and claims. In an individual test execution “O” represents observations and/or measurements made and measurement values assigned apart from the prediction in “C”, and it too has particular logical quantification to describe the observed outcome resulting from the individual execution of the test. There are then three possible outcome scenarios:
Scenario I: If “A” is false in an individual test execution, then regardless of the truth of “C” the test execution is simply invalid due to a scientist’s failure to comply with the agreed protocols in the test design, and the empirical adequacy of the theory remains unaffected and unknown. The empirical test is conclusive only if it is executed in accordance with its test design. Contrary to the logical positivists, the truth table for truth-functional logic is therefore not applicable to testing in empirical science, because in science a false antecedent “A” does not make the nontruth-functional hypothetical-conditional statement true by the logic of the test.
Scenario II: If “A” is true and the consequent “C” is false, as when the theory conclusively makes erroneous predictions, then the theory is falsified, because the nontruth-functional hypothetical-conditional “For every A if A, then C” is false by logic. Falsification occurs when the prediction statements in “C” and the observation reports in “O” are not accepted as describing the same thing within the range of vagueness and/or measurement error that are manifestations of empirical underdetermination. The falsifying logic of the test is the modus tollens argument form, according to which the nontruth-functional hypothetical-conditional deep structure expressing the theory is falsified when one affirms the antecedent clause and denies the consequent clause. This is the falsificationist philosophy of scientific criticism initially advanced by Peirce, the founder of classical pragmatism, and later advocated by Popper, who was a post-positivist but not a pragmatist. For more on Popper readers are referred to BOOK V at the free web site www.philsci.com or in the e-book Twentieth-Century Philosophy of Science: A History, which is available at Internet booksellers through hyperlinks in the web site.
The response to a conclusive falsification may or may not be attempts to develop a new theory. Responsible scientists will not deny a falsifying outcome of a test, so long as they accept its test design and test execution. Characterization of falsifying anomalous cases is informative, because it may contribute to articulation of a new problem that a new and more empirically adequate theory must solve. But some scientists may, as Kuhn said, simply believe that the anomalous outcome is an unsolved problem for the tested theory without attempting to develop a new theory. Such a response is simply a disengagement from attempts to solve the problem that the falsified theory had addressed. Contrary to Kuhn this procrastinating response to anomaly need not imply that the falsified theory has been given institutional status, unless the science itself is institutionally retarded.
For more on Kuhn readers are referred to BOOK VI at the free web site www.philsci.com or in the e-book Twentieth-Century Philosophy of Science: A History available at Internet booksellers through hyperlinks in the web site.
Scenario III: If “A” and “C” are both true, then the nontruth-functional hypothetical-conditional deep structure expressing the tested theory is validly accepted as asserting a causal dependency between the phenomena described by the antecedent and consequent clauses, even if the nontruth-functional hypothetical-conditional statement was merely an assumption. But the acceptance is not a logically necessary conclusion, because to say that it is logically necessary is to commit the fallacy of affirming the consequent. The acceptance is of an empirical and thus falsifiable statement. Yet the nontruth-functional hypothetical-conditional statement does not merely assert a Humean psychological constant conjunction. Causality is an ontological category describing a real dependency, and the causal claim is asserted on the basis of ontological relativity due to the empirical adequacy demonstrated by the nonfalsifying test outcome. Because the nontruth-functional hypothetical-conditional statement is empirical, causality claims are always subject to future testing, falsification, and then revision. This is also true when the nontruth-functional hypothetical-conditional represents a mathematical function.
But if the test design is afterwards modified such that it changes the characterization of the subject of the theory, then a previous nonfalsifying test outcome should be reconsidered and the theory should be retested for the new definition of the subject. If the retesting produces a falsifying outcome, then the new information in the modification of the test design has made the terms common to the two test designs equivocal and has contributed parts to alternative meanings. But if the new test outcome is not falsification, then the new information is merely new parts added to the meaning of the univocal terms common to the old and new test-design descriptions. Such would be the case for example for a new and additional way to measure temperature for extreme values that cannot be measured by the old measurement procedure, but which yields the same temperature values within the range of measurement errors, where the alternative procedures produce overlapping measurement results.
On the contemporary realistic neopragmatist philosophy a theory that has been tested is no longer theory, once the test outcome is known and the test execution is accepted as correct. If the theory has been falsified, it is merely rejected language unless the falsified theory is still useful for the lesser truth it contains. But if it has been tested with a nonfalsifying test outcome, then it is empirically warranted and thus deemed a scientific law until it is later tested again and falsified. The law is still hypothetical because it is empirical, but it is less hypothetical than it had previously been as a theory proposed for testing. The law may thereafter be used either in an explanation or in a test design for testing some other theory.
For example the elaborate engineering documentation for the Large Hadron Collider at CERN, the Conseil Européen pour la Recherche Nucléaire, is based on previously tested science. After installation of the collider is complete and it is known to function successfully, the science in that engineering is not what is tested when the particle accelerator is operated for the microphysical experiments, but rather the employed science is presumed true and contributes to the test design semantics for experiments performed with the accelerator.
4.16 Test Logic Illustrated
For theories using a mathematical grammar for their surface structures, the mathematical grammar in the object language is typically the most efficient and convenient way to express the theory and to test it. But philosophers of science may transform the mathematical forms of expression representing the surface structures into the deep-structure heuristic consisting of a nontruth-functional hypothetical-conditional schema that exhibits explicitly the empirical contingency expressed by the theory and its logic.
Consider the simple heuristic case of Gay-Lussac’s law for a fixed amount of gas in an enclosed container as a theory proposed for testing, a case in which the surface structure is the mathematical equation, which can be transformed into the deep structure expressed as a nontruth-functional hypothetical-conditional sentence. The container’s volume is constant throughout the experimental test, and therefore is not represented by a variable. The mathematical equation that is the surface structure of the theory is T'/T = P'/P, which can be transformed to (T'/T)*P = P', where the variable P means gas pressure, the variable T means the gas temperature, and the variables T' and P' are incremented values for T and P in a controlled experimental test, where T' = T ± ΔT, and P' is the predicted outcome that is produced by execution of the test design. The statement of the theory may be schematized in the nontruth-functional hypothetical-conditional form “For every A if A, then C”, where “A” includes (T'/T)*P, and “C” states the calculated prediction value of P', when temperature is incremented by ΔT from T to T'. The theory is universally quantified, and thus claims to be true for every execution of the experimental test. And for proponents of the theory, who are believers in the theory, the semantics of T, P, T' and P' are mutually contributing to the semantics of each other, a fact easily exhibited in this case, because the equation is monotonic, such that each variable can be expressed as a mathematical function of all the others by simple algebraic transformations.
“A” also includes the universally quantified test-design statements. These statements describe the experimental set up, the procedures for executing the test and the initial conditions to be realized for execution of a test. They include description of the equipment used including the container, the heat source, the instrumentation used to measure the magnitudes of temperature and pressure, and the units of measurement for the magnitudes involved, namely the pressure units in atmospheres and the temperature units in degrees Kelvin. And they describe the procedure for executing the repeatable experiment. This test-design language is also universally quantified and thus also contributes meaning components to the semantics of the variables P, T and T' in “A” for all interested scientists who accept the test design.
The procedure for performing the experiment must be executed as described in the test-design language, in order for the test to be valid. The procedure will include firstly measuring and recording the initial values of T and P. For example let T = 300°K and P = 1.0 atmospheres. Let the incremented measurement value be recorded as ΔT = 15°K, so that the measurement value for T' is made to be 315°K. The description of the execution of the procedure and the recorded magnitudes are expressed in particularly quantified test-design language for this particular test execution. The value of P' is then calculated.
The test outcome consists of measuring and recording the resulting observed incremented value for pressure. Let this outcome be represented by particularly quantified statement O using the same vocabulary as in the test design. But only the universally quantified test-design statements define the semantics of O, so that the test is independent of the theory. In this simple experiment one can simply denote the measured value for the resulting observed pressure by the variable O. The test execution would also likely be repeated to enable estimation of the range of measurement error in T, T', P and O, and the measurement error propagated into P' by calculation. A mean average of the measurement values from repeated executions would be calculated for each of these variables. Deviations from the mean are estimates of the amounts of measurement error, and statistical standard deviations can summarize the dispersion of measurement errors about the means.
The mean average of the test-outcome measurements for O is compared to the mean average of the predicted measurements for P' to determine the test outcome. If the values of P' and O are equivalent within their estimated ranges of measurement error, i.e., are sufficiently close to 1.050 atmospheres as to be within the measurement errors, then the theory is deemed not to have been falsified. After repetitions with more extreme incremented values with no falsifying outcome, the theory will likely be deemed sufficiently warranted empirically to be called a law, as it is called today.
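A minimal numeric sketch of this test can be written in Python, using the values given above (T = 300°K, P = 1.0 atmospheres, ΔT = 15°K). The repeated outcome readings assigned to O below are hypothetical illustrative values, not experimental data.

```python
from statistics import mean, stdev

# Worked numeric sketch of the Gay-Lussac test described in the text.
# Prediction "C": P' = (T'/T)*P with the text's initial conditions.
T, P, T_prime = 300.0, 1.0, 315.0
P_prime = (T_prime / T) * P            # predicted pressure: 1.05 atmospheres

# Hypothetical repeated test-outcome measurements "O" (illustrative only).
observed_O = [1.049, 1.052, 1.050, 1.051]
O_mean = mean(observed_O)              # mean average of outcome measurements
O_spread = stdev(observed_O)           # dispersion estimating measurement error

# Nonfalsification: prediction and mean observation agree within error,
# here taken (as an assumption) to be two standard deviations.
falsified = abs(P_prime - O_mean) > 2 * O_spread
print(round(P_prime, 3), round(O_mean, 4), falsified)
```

With these illustrative readings the predicted 1.05 atmospheres falls within the dispersion of the observed values, so the comparison yields a nonfalsifying outcome, as the text describes.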
4.17 Semantics of Empirical Testing
The ordinary semantics of empirical testing is as follows: In all scientific experiments including microphysical experiments, the relevant set of universal statements is dichotomously divided into a subset of universal statements that is presumed for testing and the remainder subset of universal statements that is proposed for testing. The division is pragmatic. The former subset is called test-design statements and the latter subset is called theory statements. The test-design statements are presumed true for the test. Consider a descriptive term that is a subject term in any one of the universal statements, and that is common to both the test-design statements and the theory statements in the divided list. The dual analytic-synthetic nature of all of the universal statements makes that common subject term have part of its semantics supplied by the concepts that are predicated of it in the test-design statements. This part of the common subject term’s semantics remains unchanged through the test, so long as the division between theory and test-design statements remains unchanged. The proponents and advocates of the theory presumably believe that the theory statements are true with enough conviction to warrant empirical testing, but their belief does not carry the same high degree of conviction that they have in the test-design statements.
Before the execution of a test of the theory, all interested scientists agree that the universally quantified test-design statements, and also the particularly quantified language that will describe the test outcome with semantics defined in the universally quantified test-design statements, are true independently of the theory. If the test outcome shows an inconsistency between the characterization supplied by the test-outcome statements and the characterization made by the theory’s prediction statements, the interested scientists agree that it is the theory that is to be viewed as falsified and not the universally quantified test-design statements. This independence of test-design and test-outcome statements is required for the test to be contingent, and it precludes the test-design statements from either implying or denying the theory to be tested or any alternative theory that addresses the same problem. Therefore for the cognizant scientific profession the semantical parts defined by the test-design statements before test execution make their terms effectively vague relative to the theory, because test-design statements are silent with respect to any of the theory’s claims. The originating proposer and supporting advocates of the theory may have such high confidence in their theory that for them the theory may supply part of the semantics for its constituent terms even before testing, but they have nonetheless agreed that in the event of a falsifying test outcome the test-design language trumps the theory. The essential contingency in an empirical test requires that functionally the theory not define any part of the semantics of its constituent terms that are common to the test design. In other words the test-design statements assume the vague semantical status that Heisenberg called the physicist’s “everyday” concepts.
After the test is executed in accordance with its test design, the particularly quantified test-outcome statements and the theory’s particularly quantified prediction statements are either consistent or inconsistent with one another (after discounting empirical underdetermination not attributable to failure to execute the test in accordance with the agreed test design). In other words they either characterize the same observed or measured instances or they do not. If the test outcome is an inconsistency between the test-outcome description and the theory’s prediction, then the theory is falsified. And since the theory is therefore no longer believed to be true, it cannot contribute to the semantics of any of its constituent descriptive terms even for the proposer and advocates of the theory. But if the test outcome is not a falsifying inconsistency between the theory’s prediction and the test-outcome description, then they identify the same instances, and for each term common to the theory and test design the semantics contributed by both the universally quantified test-design statements and the theory statements are component parts of the univocal meaning complex of each shared descriptive term. The additional characterization supplied by the semantics of the tested and nonfalsified theory statements resolves the vagueness that the meaning of the common descriptive terms had before the test, especially for those who did not share the conviction of the theory’s proposers and advocates.
In some sciences such as physics a theory’s domain may include the test-design domain for the theory. As stated above, before the test execution of such a theory and before the test outcome is known, the test-design language must be vague about the tested theory’s domain, in order for the test to be independent of the theory’s description. But if after the test the outcome is known to be nonfalsification of the tested theory, then the nonfalsified theory has become a law, and the domain of the test-design language may be describable with the language of the new law, possibly by logical derivation of the test-design laws from the tested and nonfalsified theory. This application of the tested and nonfalsified theory to its test domain changes the semantics of the test-design statements by still further resolving the vagueness in the test-design language.
In 1925 when rejecting positivism Einstein told Heisenberg that he must assume that the test design domain can be described by the theory. Einstein argued that it is in principle impossible to base any theory on observable magnitudes alone, because it is the theory that decides what the physicist can observe. Einstein argued that when the physicist claims to have observed something new, he is actually saying that while he is about to formulate a new theory that does not agree with the old one, he nevertheless must assume that the new theory functions in a sufficiently adequate way that he can rely upon it and can speak of observations. The claim to have introduced nothing but observable magnitudes is actually to have made an assumption about a property of the theory that the physicist is trying to formulate.
Einstein’s conversation with Heisenberg in 1925 about observation influenced Heisenberg’s views on quantum mechanics. Before the test outcome is known it is sufficient to use the vaguer or less precise vocabulary that Heisenberg calls the “everyday” words used by physicists, in order to describe the experimental set up, which is a macrophysical phenomenon. The meanings of these “everyday” concepts are vague, because they are silent about the fundamental constitution of matter. After the test outcome is known, the tested and nonfalsified quantum theory is recognized as empirically adequate, and the vagueness in these everyday concepts is resolved by the equations constituting the quantum theory. Quantum mechanics became a semantical rule contributing meaning parts to the complex meanings of the univocal terms used to describe the experimental set up and observation. This effectively makes the meanings quantum concepts, whether or not quantum effects are empirically detectable or operative in the description of the macroscopic features of the experimental set up. It is sufficient merely that the scientist realize that the nonfalsifying test outcome has made quantum mechanics and not classical mechanics the empirically warranted microphysical theory for the quantum semantic values to be included in the univocal meaning complexes associated with the observation description. Thus Newtonian concepts were never included, because the macrophysical description never affirmed a Newtonian microphysical theory.
4.18 Test Design Revision
On the realistic neopragmatist philosophy all universally quantified statements are hypothetical, but theory statements are relatively more hypothetical than test-design statements, because the interested scientists agree that in the event of a falsifying test outcome, revision of the theory will likely be more productive than revision of the test design.
Consequently empirical tests are conclusive decision procedures only for scientists who agree on which language is proposed theory and which language is presumed test design, and who also accept both the test design and the test-execution outcomes produced with the accepted test design.
Therefore contrary to positivists and romantics the decidability of empirical testing is not absolute. Popper had recognized that the statements reporting the observed test outcome, which he called “basic statements”, require agreement by the cognizant scientists, and that these basic statements are subject to reconsideration.
A dissenting scientist who does not accept a falsifying test outcome of a theory has either rejected the report of the observed test outcome or reconsidered the test design. If he has rejected the outcome of the individual test execution, he has merely questioned whether or not the test was executed in compliance with its agreed test-design protocols. Independent repetition of the test with conscientious fidelity to the design may answer such a challenge to the test’s validity one way or the other.
But if in response to a falsifying test outcome the dissenting scientist has reconsidered the test design itself, he has thereby changed the semantics involved in the test in a fundamental way. Such reconsideration amounts to rejecting the design as if it were falsified, and letting the theory define the subject of the test and the problem under investigation – a rôle reversal in the pragmatics of test-design language and theory language that makes the original test design and the falsifying test execution irrelevant.
In his “Truth, Rationality, and the Growth of Scientific Knowledge” (1961), reprinted in Conjectures and Refutations (1963), Popper rejects such a dissenting response to a test, calling it a “content-decreasing stratagem”. He admonishes that the fundamental maxim of every critical discussion is that one should “stick to the problem”. But as James B. Conant (1893-1978) recognized to his dismay in his On Understanding Science: An Historical Approach (1947), the history of science has shown that such prejudicial responses to scientific evidence have nevertheless been productive and strategic to the advancement of basic science in historically important episodes. The prejudicially dissenting scientist may decide that the design for the falsifying test supplied an inadequate description of the problem that the tested theory is intended to solve, especially if he developed the theory himself and did not develop the test design. The semantical change produced for such a recalcitrant believer in the theory affects the meanings of the terms common to the theory and test-design statements. The parts of the meaning complex that had been contributed by the rejected test-design statements are excluded from the semantics of one or several of the descriptive terms common to the theory and test-design statements. Such a semantical outcome can indeed be said to be “content-decreasing”, as Popper said.
But a scientist’s prejudiced or “tenacious” (per Feyerabend) rejection of an apparently falsifying test outcome may have a contributing function in the development of science. It may function as what Feyerabend called a “detecting device”, a practice he called “counterinduction”, which is a strategy that he illustrated in his examination of Galileo’s arguments for the Copernican cosmology. Galileo used the apparently falsified heliocentric theory as a “detecting device” by letting his prejudicial belief in the heliocentric theory control the semantics of the apparently falsifying observational description. As Feyerabend showed, this enabled Galileo to reinterpret observations previously described with the equally prejudiced alternative semantics built into the Aristotelian geocentric cosmology.
Counterinduction was also the strategy used by Heisenberg when he reinterpreted the observational description of the electron track in the Wilson cloud chamber, guided by Einstein’s aphorism that the theory decides what the physicist can observe. Heisenberg reports that as a result he then developed his indeterminacy relations using his matrix-mechanics quantum concepts.
Another historic example of using an apparently falsified theory as a detecting device involves the discovery of the planet Neptune. In 1821, when Uranus happened to pass Neptune in its orbit – an alignment that had not occurred since 1649 and was not to occur again until 1993 – Alexis Bouvard (1767-1843) published tables predicting the positions of the planet Uranus using Newton’s celestial mechanics. But subsequent observations of Uranus showed significant deviations from the calculated positions.
A first possible response would have been to dismiss the deviations as measurement errors and preserve belief in Newton’s celestial mechanics. But the astronomical measurements were at that time repeatable, and the deviations were large enough that they were not dismissed as observational errors. The deviations were recognized to have presented a new problem.
A second possible response would have been to give Newton’s celestial mechanics the hypothetical status of a theory, to view Newton’s law of gravitation as falsified by the anomalous observations of Uranus, and then to attempt to revise Newtonian celestial mechanics. But by then confidence in Newtonian celestial mechanics was very high, and no alternative to Newton’s physics had yet been proposed. Therefore there was great reluctance to reject Newtonian physics.
A third possible response, which was historically taken, was to preserve belief in the Newtonian celestial mechanics, to modify the test-design language by proposing a new auxiliary hypothesis of a gravitationally disturbing planet, and then to reinterpret the observations by supplementing the description of the deviations with the auxiliary hypothesis. (Disturbing phenomena can “contaminate” even supposedly controlled laboratory experiments.) The auxiliary hypothesis changed the semantics of the test-design description with respect to what was observed; it added new semantic values and structure to the semantics of the test design.
In 1845 both John Couch Adams (1819-1892) in England and Urbain Le Verrier (1811-1877) in France, independently using the apparently falsified Newtonian physics as what Feyerabend called a “detecting device”, calculated the positions of the postulated disturbing planet in order to guide future telescopic searches for it. On 23 September 1846, using Le Verrier’s calculations, Johann Galle (1812-1910) observed the postulated planet with the telescope of the Royal Observatory in Berlin.
Theory is language proposed for testing, and test design is language presumed for testing. But here the pragmatics of the discourses was reversed. In this third response the Newtonian gravitation law was not deemed a tested and falsified theory, but rather was presumed to be true and used for a new test design. The modified test-design language was given the relatively more hypothetical status of theory by the auxiliary hypothesis of the postulated planet, thus newly characterizing the observed deviations in the positions of Uranus. The nonfalsifying test outcome of this new hypothesis was Galle’s observational detection of the postulated planet, which was named Neptune. This discovery exemplifies the theory-elaboration discovery technique, with the modified version of the original test design functioning as a new theory.
But counterinduction is after all just a strategy, and it is more an exceptional practice than a routine one. Le Verrier’s counterinduction strategy failed to explain a deviant motion of the planet Mercury at the point where its orbit comes closest to the sun, a deviation known as the anomalous precession of its perihelion. In 1859 Le Verrier postulated a gravitationally disturbing planet that he named Vulcan and predicted its orbital positions. However, unlike Le Verrier, Einstein gave Newton’s celestial mechanics the more hypothetical status of theory language, and he viewed Newton’s law of gravitation as having been falsified by the anomalous perihelion precession. He initially attempted a revision of Newtonian celestial mechanics by generalizing on his special theory of relativity. This first attempt is known as his Entwurf version, which he developed in 1913 in collaboration with his mathematician friend Marcel Grossmann. But working in collaboration with his friend Michele Besso he found that the Entwurf version had clearly failed to account accurately for Mercury’s orbital deviations; it yielded only 18 seconds of arc per century instead of the observed 43 seconds.
In 1915 he finally abandoned the Entwurf version, and under prodding from the mathematician David Hilbert (1862-1943) he turned to mathematics exclusively to produce his general theory of relativity. He announced his correct prediction of the deviations in Mercury’s orbit to the Prussian Academy of Sciences on 18 November 1915, and he received a congratulatory letter from Hilbert on “conquering” the perihelion motion of Mercury. After years of delay due to World War I, his general theory was further vindicated by Arthur Eddington’s (1888-1944) historic eclipse test of 1919. As for Vulcan: some astronomers reported that they had observed a transit of a planet across the sun’s disk, but these claims were found to be spurious when larger telescopes were used, and Le Verrier’s postulated planet has never been observed. MIT professor Thomas Levenson (1958) relates the history of the futile search in his The Hunt for Vulcan (2015).
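The 43 seconds-of-arc-per-century figure cited above can be checked against the standard general-relativistic formula for perihelion advance per orbit, Δφ = 6πGM/(a(1−e²)c²). The sketch below is only an illustrative calculation, not part of the historical account; the orbital constants for Mercury are standard published values.

```python
# Illustrative check of Mercury's anomalous perihelion advance using the
# general-relativistic formula: delta_phi = 6*pi*G*M / (a*(1 - e^2)*c^2).
import math

GM_SUN = 1.32712440018e20      # m^3/s^2, gravitational parameter of the sun
C = 299_792_458.0              # m/s, speed of light
A = 5.7909e10                  # m, Mercury's semi-major axis
E = 0.205630                   # Mercury's orbital eccentricity
PERIOD_DAYS = 87.9691          # days, Mercury's orbital period
CENTURY_DAYS = 36_525.0        # days in one Julian century

# Perihelion advance per orbit, in radians.
dphi = 6.0 * math.pi * GM_SUN / (A * (1.0 - E**2) * C**2)

# Convert radians per orbit to arcseconds per century.
arcsec_per_orbit = math.degrees(dphi) * 3600.0
orbits_per_century = CENTURY_DAYS / PERIOD_DAYS
advance = arcsec_per_orbit * orbits_per_century

print(f"{advance:.1f} arcseconds per century")   # close to the observed 43
```

This residual advance is precisely the part of Mercury’s precession that Newtonian perturbation theory, even with Le Verrier’s hypothesized Vulcan, could not supply, and that Einstein’s general theory predicted.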
Le Verrier’s response to Uranus’ deviant orbital observations was the opposite of Einstein’s response to the deviant orbital observations of Mercury. Le Verrier reversed the rôles of theory and test-design language by preserving his belief in Newton’s physics and using it to revise the test-design language with his postulate of a disturbing planet. Einstein viewed Newton’s celestial mechanics to be hypothetical, because he believed that the Newtonian theory statements were more likely to be productively revised than the test-design statements, and he took the anomalous orbital observations of Mercury to falsify Newton’s physics, thus indicating that theory revision was needed. Empirical tests are conclusive decision procedures only for scientists who agree on which language is proposed theory and which is presumed test design, and who furthermore accept both the test design and the test-execution outcomes produced with the accepted test design.
For more about Feyerabend on counterinduction readers are referred to BOOK VI at the free web site www.philsci.com or in the e-book Twentieth-Century Philosophy of Science: A History, which is available at Internet booksellers through hyperlinks in the web site.
There are also more routine cases of test-design revision that do not occasion counterinduction. In such cases there is no rôle reversal in the pragmatics of theory and test design, but there may be an equivocating revision in the test-design semantics, depending on the test outcome, due to a new observational technique or instrumentality. Such a technique may have originated in what Feyerabend called “auxiliary sciences”, e.g., the development of a superior microscope or telescope. If retesting a previously nonfalsified theory with the new test design does not produce a falsifying outcome, then the result is merely a refinement that has reduced the empirical underdetermination manifest as vagueness in the semantics of the test-design language (see below, Section 4.19). But if the newly accepted test design occasions a falsification, then it has produced a semantical equivocation between the statements of the old and new test designs, and has thereby technically redefined the subject of the tested theory.
4.19 Empirical Underdetermination
Conceptual vagueness and measurement error are manifestations of empirical underdetermination, which may occasion scientific pluralism.
Two manifestations of empirical underdetermination are conceptual vagueness and measurement error. All concepts have vagueness that can be reduced indefinitely but can never be eliminated completely. This is even true of concepts of quantized objects. Mathematically expressed theories use measurement data that always contain measurement inaccuracy that can be reduced indefinitely but never eliminated completely. The empirical underdetermination of language may make an empirical test design incapable of producing a decisive theory-testing outcome.
Scientists prefer measurements and mathematically expressed theories, because they can measure the amount of prediction error in the theory, when the theory is tested. But separating measurement error from a theory’s prediction error can be problematic. Repeated careful execution of the measurement procedure, if the test is repeatable, enables statistical estimation of the range of measurement error. But in research using historical time-series data such as in econometrics, repetition is not typically possible.
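The statistical estimation of measurement error mentioned above can be sketched concretely. The following is a minimal illustration, with hypothetical repeated readings of a single quantity; it is not drawn from any particular experiment discussed in the text.

```python
# A minimal sketch of how repeated execution of a measurement procedure
# permits statistical estimation of the range of measurement error.
# The readings below are hypothetical repeated measurements of one quantity.
from statistics import mean, stdev

readings = [9.81, 9.79, 9.83, 9.80, 9.82, 9.78, 9.81, 9.80]

m = mean(readings)
s = stdev(readings)                       # sample standard deviation
standard_error = s / len(readings) ** 0.5  # uncertainty of the mean itself

print(f"estimated value:  {m:.3f}")
print(f"spread (1 sigma): {s:.3f}")
print(f"standard error:   {standard_error:.4f}")
# A theory's prediction error can be distinguished from measurement error
# only if it is large relative to this estimated range.
```

When repetition is impossible, as with historical time-series data, no such estimate of the error range is available from the procedure itself, which is one reason econometric testing is less decisive than laboratory testing.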
4.20 Scientific Pluralism
Scientific pluralism is recognition of the co-existence of multiple empirically adequate alternative explanations due to undecidability resulting from the empirical underdetermination in a test-design.
All language is always empirically underdetermined by reality. Empirical underdetermination explains how two or more semantically alternative but empirically adequate explanations can share the same test design: several theories with alternative explanatory factors each yield accurate predictions whose differences are small enough to fall within the range of the estimated measurement error in the test design. In such cases empirical underdetermination in the current test design imposes undecidability on the choice among the alternative explanations.
Econometricians are accustomed to alternative empirically adequate econometric models. This occurs because measurement errors in aggregate social statistics are often large in comparison to those incurred in laboratory sciences. In such cases each social-science model has different equation specifications, i.e., different causal variables in the equations of the model, and makes different predictions for some of the same prediction variables that are accurate within the relatively large range of estimated measurement error. And discovery systems with empirical test procedures routinely proliferate empirically adequate alternative explanations as output. They produce what Einstein called “an embarrassment of riches”. Logically this multiplicity of alternative explanations means that there may be alternative empirically warranted nontruth-functional hypothetical-conditional deep structures in the form “For all A if A, then C” having alternative causal antecedents “A” and making different but empirically adequate predictions that are the empirically indistinguishable consequents “C”.
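The situation described above can be illustrated with a contrived numerical sketch. Two “models” of the same outcome, each specifying a different explanatory variable, can both predict within the estimated measurement error, so the test design cannot decide between them. The variables and data here are entirely hypothetical.

```python
# Hypothetical illustration of empirically indistinguishable alternative
# models: two correlated candidate causal variables each support a model
# whose prediction error lies within the measurement error of the data.
import random

random.seed(0)

n = 50
# Two highly correlated candidate causal variables, e.g. two aggregate
# economic indicators that move together.
x1 = [i / 10.0 for i in range(n)]
x2 = [v + random.gauss(0.0, 0.05) for v in x1]   # x2 tracks x1 closely

# Observed outcome: driven by the shared movement, plus measurement error.
sigma_measurement = 0.5
y = [2.0 * v + random.gauss(0.0, sigma_measurement) for v in x1]

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def rmse(x, y, a, b):
    return (sum((yi - (a + b * xi)) ** 2
                for xi, yi in zip(x, y)) / len(y)) ** 0.5

a1, b1 = fit_line(x1, y)
a2, b2 = fit_line(x2, y)
err1 = rmse(x1, y, a1, b1)
err2 = rmse(x2, y, a2, b2)

# Both models predict about as accurately as the measurement error allows.
print(f"model 1 (uses x1): RMSE = {err1:.3f}")
print(f"model 2 (uses x2): RMSE = {err2:.3f}")
print(f"estimated measurement error: {sigma_measurement}")
```

Since neither model’s residual error exceeds what measurement error alone would produce, the empirical test cannot select between the two specifications; this is the undecidability that occasions scientific pluralism.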
Empirical underdetermination is also manifested as conceptual vagueness. For example, to develop his three laws of planetary motion Johannes Kepler (1571-1630), a heliocentrist, used the measurement observations of Mars that had been collected by Tycho Brahe (1546-1601), a type of geocentrist. Brahe had an awkward geocentric-heliocentric cosmology, in which the fixed earth is the center of the universe, the stars and the sun revolve around the earth, and the other planets revolve around the sun. Kepler used Brahe’s astronomical measurement data, and there was empirical underdetermination in these data, as in all measurement data.
Kepler was a convinced Copernican placing the sun at the center of the universe. His belief in the Copernican heliocentric cosmology made the semantic parts contributed by that heliocentric cosmology become for him component parts of the semantics of the language used for celestial observation, thus displacing Brahe’s more complicated combined geocentric-heliocentric cosmology’s semantical contribution. The manner in which Brahe and Kepler could have different observations is discussed by Hanson in his chapter “Observation” in his Patterns of Discovery. Hanson states that even if both the geocentric and heliocentric astronomers saw the same dawn, they nonetheless saw differently. Thus Brahe sees that the sun is beginning its journey from horizon to horizon, while Kepler sees that the earth’s horizon is dipping away from our fixed local star. Einstein said that the theory decides what the physicist can observe; Hanson similarly said that observation is “theory laden”.
Alternative empirically adequate explanations due to empirical underdetermination are all more or less true. An answer as to which explanation is truer must await the development of additional observational information or more accurate measurements that reduce the empirical underdetermination in the test-design concepts. But there is never any ideal test design with “complete” information, i.e., without vagueness or measurement error. Recognition of possible undecidability among alternative empirically adequate scientific explanations due to empirical underdetermination occasions what realistic neopragmatists call “scientific pluralism”.
4.21 Scientific Truth
Truth and falsehood are spectrum properties of statements, such that the greater the truth, the lesser the error.
Tested and nonfalsified statements are more empirically adequate, have more realistic ontologies, and are truer than falsified statements.
Falsified statements have recognized error, and may simply be rejected, unless they are found still to be useful for their lesser realism and truth.
The degree of truth in untested statements is unknown until tested.
What is truth? Truth is a spectrum property of descriptive language with its perspectivist relativized semantics and ontology. It is not merely a subjective expression of approval.
As Jarrett Leplin (1944) maintains in his A Novel Defense of Scientific Realism (1997), truth and falsehood are properties of statements that admit of more or less. They are not simply dichotomous, as they are represented in two-valued formal logic. Belief and truth are not identical. Belief is acceptance of a statement as predominantly true. Therefore one may wrongly believe that a predominantly false statement is predominantly true, or wrongly believe that a predominantly true statement is predominantly false. Belief controls the semantics of the descriptive terms in a universally quantified statement, while truth is the relation of a statement’s semantics together with the ontology it describes to mind-independent nonlinguistic reality.
Test-design language is presumed true with definitional force for its semantics, in order to characterize the subject and procedures of a test. Theory language in an empirical test may be believed true by the developer and advocates of the theory, but the theory is not true simply by virtue of their belief. Belief in an untested theory is speculation about a future test outcome. A nonfalsifying test outcome will warrant belief that the tested theory is as true as the theory’s demonstrated empirical adequacy. Empirically falsified theories have recognized error, are predominantly false, and may be rejected unless they are found still to be useful for their lesser realism and lesser truth. Tested and nonfalsified statements are more empirically adequate, have ontologies that are more realistic, and thus are truer than empirically falsified statements.
Popper said that Eddington’s historic eclipse test of Einstein’s theory of gravitation in 1919 “falsified” Newton’s theory and thus “corroborated” Einstein’s theory. Yet the U.S. National Aeronautics and Space Administration (NASA) today still uses Newton’s laws to navigate interplanetary rocket flights such as the Voyager and New Horizons missions. Thus Newton’s “falsified” theory is not completely false or totally unrealistic, or it could never have been used before or after Einstein. Popper said that science does not attain truth. But contemporary realistic neopragmatists believe that such an absolutist idea of truth beyond the reach of basic science is misconceived. Advancement in empirical adequacy is advancement in realism and in truth. Feyerabend said, “Anything goes”. Regarding ontology Hickey says, “Everything goes”, because while not all discourses are equally valid, there is no semantically interpreted syntax utterly devoid of ontological significance and thus no discourse utterly devoid of truth. Therefore Hickey adds that the more empirically adequate explanation goes farther – is truer and more realistic – than its less empirically adequate falsified alternatives.
In the latter half of the twentieth century there was a melodramatic melee among academic philosophers of science called the “Science Wars”. The phrase “Science Wars” appeared in the journal Social Text, published by Duke University Press, in 1996. The issue contained a bogus article by New York University physicist Alan Sokal. In a New York Times article (18 May 1996) Sokal disclosed that his purpose was to flatter the editors’ ideological preconceptions, which were social constructionist. Sokal’s paper was intended to be a debunking exposé of postmodernism. But since the article was written as a parody instead of a serious scholarly article, it was basically an embarrassment for the editors. The “Science Wars” conflict involved sociology of science due to the influence of Thomas Kuhn’s Structure of Scientific Revolutions. On the one side of the conflict were the postmodernists who advocated semantical relativism and constructivism. On the other side were traditionalist philosophers who defended scientific realism and objectivism. The postmodernists questioned the decidability of scientific criticism, while the traditionalists defended it in the name of reason in the practice of science.
The “Science Wars” pseudo conflict is resolved by the introduction of the ideas of relativized componential semantics and ontological relativity, which are both realistic and constructivist, and are also decidable by empirical criticism. Relativized semantics is perspectivist, and it relativizes ontology, which is revealed reality. Empirical underdetermination limits the decidability of criticism and occasionally admits scientific pluralism within empirically set limits. But perspectivist relativized semantics and constructivist discovery do not abrogate decidability in scientific criticism or preclude scientific progress; they do not deliver science to social capriciousness or to any inherent irrationality.
As discovery and empirical criticism increase empirical adequacy in science, they thereby increase realism and truth.