Introduction
Watchmakers, Doctors, and Scientists: The Quine/Duhem Problem
Pierre Duhem and Willard Quine raise a problem for what is traditionally known as the «hypothetico-deductive model» of scientific hypothesis testing. On this traditional model, a scientific hypothesis is tested by deducing an observable consequence of the hypothesis, and then empirically observing whether this consequence actually is the case. That is,
1) H → e
2) Not e
3) Therefore, Not H
On this model of scientific testing, a logical consequence e is derived from the hypothesis H, and then e is checked against observation. If e turns out not to be the case, then on this model the hypothesis is shown to be false. The relevant rule of inference here is modus tollens. Here's a rough example of the kind of reasoning involved. Assume that I am a scientist and my hypothesis H is that drinking coffee causes cancer. A logical consequence of my hypothesis is that, when a group of people with similar health histories and habits is divided into coffee drinkers and non-coffee drinkers, there will be a significantly higher occurrence of cancer in the coffee drinkers than in the non-coffee drinkers. The e in this case is the claim that there will be a significantly greater occurrence of cancer in the coffee drinkers than in the non-coffee drinkers. Suppose then I do the experiment and discover no significant difference. On the traditional hypothetico-deductive model of scientific testing, this evidence proves conclusively that the hypothesis that coffee causes cancer is false. If, though, a significant difference is shown, this does not conclusively prove the truth of the hypothesis. It only shows that the hypothesis has passed one test. This is because, strictly speaking, some factor other than the drinking of coffee could be bringing about the greater occurrence of cancer.
Pierre Duhem made the following critique of this model of scientific testing:
Duhem's problem is with the first part of the model, (1). His claim is that no hypothesis can be separated from an indefinite set of auxiliary hypotheses. In our coffee example, an auxiliary premise might be that the test was done on a group of people with similar health histories, or that there was no mistake in the counting of the instances, etc. Taking into consideration this indefinite set of auxiliary premises, we then have:
1) {H, (A1, A2, A3, ... , An)} → e
2) Not e
3) Not {H, (A1, A2, A3, ... , An)}
If Duhem is right, then all that the conflicting result shows is that at least one member of the set of main and auxiliary hypotheses is mistaken. What the result no longer shows is Not H. In our example, the lack of a significant difference is no longer conclusive grounds for rejecting the hypothesis that coffee causes cancer.
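The logical point can be checked mechanically. The following is a minimal sketch, assuming for simplicity a single auxiliary A standing in for the whole set A1, ..., An; the helper `implies` and the variable names are illustrative and not part of Duhem's own presentation. It enumerates every truth assignment consistent with the two premises and asks what follows.

```python
from itertools import product

def implies(p, q):
    """Material conditional: p -> q."""
    return (not p) or q

# Enumerate every truth assignment to H (main hypothesis),
# A (one auxiliary, standing in for A1..An), and e (the prediction).
# Keep only the assignments in which both premises hold:
#   1) (H and A) -> e
#   2) Not e
surviving = [
    (H, A, e)
    for H, A, e in product([True, False], repeat=3)
    if implies(H and A, e) and not e
]

# Conclusion 3: the conjunction of H and A is false in every surviving row.
conjunction_refuted = all(not (H and A) for H, A, e in surviving)

# But H on its own is not refuted: it is still true in some surviving row.
h_refuted = all(not H for H, A, e in surviving)

print(surviving)
print("Not (H and A) follows:", conjunction_refuted)
print("Not H follows:", h_refuted)
```

The row in which H is true, A is false, and e is false survives both premises, which is exactly the case in which the auxiliary, and not the main hypothesis, is to blame.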
Duhem sometimes explained his problem by comparing the scientist to a doctor, and contrasting the scientist with a watchmaker. A watchmaker, when faced with a watch that does not work, can look at each part of the watch in isolation, going from piece to piece, until the defect is detected. The doctor, on the other hand, cannot examine each of the parts of an ailing patient's body in isolation. Instead she must detect the seat of the illness only by inspecting the effects produced on the whole body. Similarly, the scientist cannot separate out each of an indefinite set of auxiliary premises to test each in isolation. From this, though, a paradox arises, one concerning how it is reasonable to «lay blame» on a main scientific hypothesis or one of its auxiliaries.
The Paradox
On its standard definition, a paradox is a deductive argument with seemingly true premises, employing apparently correct reasoning, with an obviously false or contradictory conclusion. Consider, for example, a version of the famous skeptical paradox:
1) I can know that I live in Brooklyn, only if I can know that I am not a brain in a vat.
2) I cannot know that I am not a brain in a vat.
3) Therefore, I cannot know that I live in Brooklyn.
This is an argument with seemingly true premises, employing apparently correct reasoning, but with what looks like an obviously false conclusion. Premise one merely states that it is a precondition for my knowing that I live in Brooklyn that I know a more fundamental truth, namely that I am not a brain in a vat. If I were a brain in a vat, then I would not necessarily be in Brooklyn. Premise two is the claim that I cannot know that I am not a brain in a vat. No evidence I could find would count as completely convincing evidence for the hypothesis that I am not a brain in a vat, because it is possible that the evil scientist who has been keeping me in this vat has produced in me the experience of getting this evidence. Notice that this is not the claim that my being a brain in a vat is likely, but rather a claim about the remote possibility that this is so. In addition, the reasoning involved in the skeptical paradox is straightforward. The argument has the form:
1) P, only if Q
2) Not Q
3) Therefore, Not P
However, the conclusion is implausible. The paradox could be, and often is, phrased as precluding all knowledge, even of the most obvious truths. Since the premises are seemingly true, the reasoning is straightforward, and the conclusion seems obviously false, the skeptical paradox meets each of the requirements for being a philosophical paradox.
Now consider the following simplified form of an argument I will flesh out below:
1) No hypothesis can be tested in isolation from an indefinite set of auxiliary hypotheses.
2) In order to show that a hypothesis is mistaken, it is necessary to isolate that hypothesis from its set of auxiliary hypotheses.
3) Therefore, no hypothesis can be shown to be mistaken.
I will call the above argument the «simple Quine/Duhem paradox.» The first premise of the argument is the claim that whenever a hypothesis is tested, an indefinite set of auxiliary hypotheses must go along with it. For another example, imagine an experiment designed to test the hypothesis that the earth is a cube by observing the shadow it casts during an eclipse. The hypothesis is that the earth is a cube, and an entailment of this is that the earth will cast a square or diamond-shaped shadow. But such a shadow is an entailment only if certain other preconditions are met. For example, the experimenter assumes that the light will not be such that it turns the shadows of cubes into circles; that the instruments used to identify the shadow are functioning properly; that we are not all brains in vats; and so on. The first premise implies that the cube hypothesis, in order to be tested, must be accompanied by these and a potentially infinite set of other hypotheses.
The second premise is a statement of the moral of the Quine/Duhem problem:
1) {H, (A1, A2, A3, ... , An)} → e
2) Not e
3) Therefore, Not {H, (A1, A2, A3, ... , An)}
The above argument is not a paradox in itself, but rather an illustration of the kind of deductive reasoning available for «laying blame» on the hypothesis. As the argument states, the negative experimental result (2) only shows that there is something wrong with the set of hypotheses (H, A1, A2,...,An) and not necessarily with H itself. It is this type of argument that licenses premise two, the statement that there must be some way of isolating H if one is going to be able to show it mistaken. Premise two implies that for the cube hypothesis to be shown mistaken, the hypothesis must be separated from its auxiliary premises. Assume that the cube-shaped earth experiment is performed and the shadow is circular. In this case, the circular shadow only shows that one of the set of hypotheses that includes H and an indefinite set of auxiliary hypotheses is mistaken. What it does not show is that H is mistaken. The conclusion of the simple form of the paradox is that no hypothesis can be shown to be mistaken. In the case of the cube hypothesis, this hypothesis cannot be shown to be mistaken either.
Using the cube hypothesis, we have the following version of the paradox:
1) The hypothesis that the earth is a cube cannot be tested without being conjoined to an indefinitely large set of auxiliary hypotheses.
2) If the hypothesis that the earth is a cube cannot be tested without being conjoined to an indefinitely large set of auxiliary hypotheses, then the hypothesis that the earth is a cube cannot be shown to be mistaken.
3) The hypothesis that the earth is a cube cannot be shown to be mistaken.
The conclusion is intuitively implausible given the obvious falsity of the hypothesis. Moreover, in the simple form of the paradox, the conclusion is that no hypothesis can be shown to be mistaken. This is even more implausible.
The last thing to consider in respect to my showing that this is a genuine paradox is the reasoning employed. In both the cube version and the simple form of the paradox the reasoning is straightforward. In the cube version we have a standard use of modus ponens. In the simple version, we have an argument of the form: No S is P; All Q are P; Therefore, No S is Q. Nothing is out of line here. Thus, the Quine/Duhem Paradox meets all the requirements of a standard philosophical paradox: it has apparently true premises; its conclusion is highly implausible; and the reasoning involved is straightforward.
The Quine/Duhem problem, phrased as a paradox, is really a special case of the skeptical paradox discussed earlier. The skeptical paradox claims that the incredibly strong precondition for my knowing p cannot be attained, and hence I cannot know p. For the Q/D Paradox, the set of auxiliary premises has the same function as the precondition in the skeptical paradox: because they cannot be ruled out, they keep us from knowing the status of the hypothesis. The conclusion of the skeptical paradox is that we cannot know some obvious hypothesis (we can't prove H), while for the Q/D paradox, the conclusion is that no hypothesis can be shown to be mistaken (we can't prove Not H).
The many solutions to the Quine/Duhem problem can be seen as solutions to the Quine/Duhem paradox. For brevity's sake, I am going to restrict my discussion to the most famous and/or the most plausible solutions. One famous response to the Quine/Duhem problem is given by the popular Kuhnian approach. On this account the moral of the problem raised by Quine and Duhem is that it is only whole theories, systems, or what Kuhn terms «paradigms» that are shown to be mistaken. The Kuhnian account, in other words, claims that the conclusion of the simple Quine/Duhem paradox is true, and that no hypothesis can be conclusively shown to be the one at fault. Only whole theories broadly construed, or paradigms, are open for rejection. And a major factor in the choice of one paradigm over another is the opinion of the scientific community itself. In The Structure of Scientific Revolutions, Kuhn maintains that:
Only whole paradigms get accepted or rejected based upon the assent of the scientific community. We mistakenly think that the conclusion «No hypothesis can be shown to be mistaken» is false, when in fact it is true. Only whole theories can be rejected in the way we normally think that hypotheses are rejected.
There are reasons for avoiding this kind of approach. The first is that the response is too drastic. Consider the cube-shaped earth hypothesis again. Does it really make sense to say that such an obviously mistaken hypothesis cannot be shown false? Common sense and the obvious falsity of the hypothesis suggest that a solution to the paradox should be looked for elsewhere. Another reason for looking elsewhere is that a successful account of this problem is going to give some kind of analysis of the concepts involved. The «whole systems» theorist, by taking the stance that it is only whole theories that are capable of being rejected, does not address the issue of what happens to the particular hypothesis, except perhaps by saying that, as a result of the paradigm shift, it would be accepted or rejected. So the approach switches the issue from the evaluation of hypotheses to the evaluation of theories. More generally, a criticism often leveled against the Kuhnian approach is that it reduces theory-choice to a kind of «mob psychology,» where what counts as successful is determined by sociological and not logical or evidential factors.
Bayesianism
Another place to look for an account of evidential appraisal is statistics. In fact, a group of philosophers of statistics claim to have a solution to the Quine/Duhem problem. These philosophers, known as «Bayesians,» get their name from the statistician Thomas Bayes. According to the Bayesians, an answer to the Quine/Duhem problem can be given if the hypothetico-deductive model of scientific testing is replaced with another model. On the Bayesian model, evidence e confirms a hypothesis H to the extent that a scientist's degree of belief in H is higher given evidence e than what it was or would be without this evidence. Probability is a measure of the subjective degree of belief ranging from «0,» complete disbelief, to «1,» complete certainty. The scientist's degree of belief in the hypothesis without the evidence is called the «prior probability» of the hypothesis, and the scientist's degree of belief in the hypothesis after the evidence is called the «posterior probability.» So if the posterior probability of H is greater than the prior probability of H, the extent to which H is confirmed is the difference between the posterior and prior probabilities. To figure out the posterior probability of H, Bayesians use a version of Bayes' Theorem:
$$P(H \mid e) = \frac{P(e \mid H)\,P(H)}{P(e \mid H)\,P(H) + P(e \mid \text{not-}H)\,P(\text{not-}H)}$$
This is read, «The probability of H, given e, is equal to the probability of e given H times the probability of H, over the sum of the probability of e given H times the probability of H and the probability of e given not-H times the probability of not-H.» Certain factors need to be known before the «laying of blame» can take place: (i) the prior probabilities of H and not-H; (ii) the likelihood, which is P(e | H); and (iii) what's called the Bayesian «catchall factor,» which is P(e | not-H). Once you have these, you just plug them into Bayes' theorem to get the posterior probability.
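As a minimal sketch of this «plugging in,» the function below takes the three ingredients just listed and returns the posterior probability. The function name `posterior` and the numbers in the usage line are hypothetical, chosen only to make the arithmetic easy to follow; they do not come from any of the examples in this paper.

```python
def posterior(prior_h, likelihood, catchall):
    """Bayes' theorem for a hypothesis H and its negation.

    prior_h    -- P(H), the prior probability of H
    likelihood -- P(e | H)
    catchall   -- P(e | not-H), the Bayesian catchall factor
    """
    prior_not_h = 1.0 - prior_h
    # Denominator of Bayes' theorem: the total probability of e.
    evidence = likelihood * prior_h + catchall * prior_not_h
    if evidence == 0:
        raise ValueError("P(e) is zero, so the posterior is undefined")
    return likelihood * prior_h / evidence

# Hypothetical inputs: an even prior, and evidence four times more
# likely if H is true than if it is false.
prior = 0.5
post = posterior(prior_h=prior, likelihood=0.8, catchall=0.2)

# On the Bayesian measure described above, the degree of confirmation
# is the difference between posterior and prior.
confirmation = post - prior
print(post, confirmation)
```

With these made-up inputs the posterior comes out at about .8, so the evidence would confirm H to a degree of about .3.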
Here's an example. Consider a situation in which an experiment is done and the result seems to contradict the hypothesis in question. One response to this negative result would be to reject the hypothesis. Another response would be to look at the auxiliary premises. Think back to the coffee example. Suppose for simplicity's sake that there is only one auxiliary premise, that the subjects in the study have similar health histories. In this case, the main hypothesis H is that coffee causes cancer, and the auxiliary hypothesis A is that the test of the hypothesis involves people who have similar health histories. In this simplified example, hypothesis H and auxiliary A together entail e, a significant difference in cancer occurrence, but not-e is observed. The Bayesian account shows when A is more likely to be blamed than H, or vice versa. Assume that there is a great deal of evidence for H, but that there is hardly more evidence for the truth of A than there is for A's falsity. In our example, perhaps the health histories of the subjects weren't checked very thoroughly, so it is possible that the coffee drinkers also smoke, while the non-drinkers don't. This is a situation in which it is more likely that A is where the problem lies, rather than H. Bayesians solve this type of problem by plugging values into Bayes' theorem. (i) First, they assign a lower prior probability to A than to H. For example, A, being only slightly more probable than not-A, would have a prior probability of around .6. H, on the other hand, would have a very high prior probability, say .9. (ii) With regard to the likelihoods, the Bayesians assign a far greater likelihood to the negative result when not-A is true than when A is true and H is false. For example, P(not-e | A and not-H) = x, P(not-e | not-A and H) = 50x, and P(not-e | not-A and not-H) = 50x. I'll spare you the calculations here; they are worked out in the Appendix.
The result is that after plugging these numbers into Bayes' theorem, the probability of H is only slightly decreased, going from .9 to .897, whereas the probability of A plummets from .6 to .003. So, given their prior probabilities, and the likelihood of getting the not-e result when either A or H is not true, it follows that the scientist has good reason to reject A, while still preserving H.
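The figures just quoted can be reproduced with a short computation. This is a sketch under the assumptions stated in the example: priors of .9 for H and .6 for A, independence of H and A, a zero probability of not-e when both are true, and likelihoods of x and 50x. The function name `posteriors` and the particular value of x are illustrative; x cancels out of the result.

```python
from itertools import product

P_H, P_A = 0.9, 0.6  # prior probabilities from the example

def posteriors(x):
    """Return (P(H | not-e), P(A | not-e)) for a given small likelihood x."""

    def prior(h, a):
        # H and A are assumed independent, so the joint prior is a product.
        return (P_H if h else 1 - P_H) * (P_A if a else 1 - P_A)

    def p_not_e(h, a):
        if h and a:
            return 0.0      # H and A together entail e
        if a:
            return x        # A true, H false
        return 50 * x       # A false, whether or not H is true

    # Joint probability of each (H, A) combination together with not-e.
    joint = {
        (h, a): prior(h, a) * p_not_e(h, a)
        for h, a in product([True, False], repeat=2)
    }
    total = sum(joint.values())  # P(not-e)
    post_h = sum(v for (h, a), v in joint.items() if h) / total
    post_a = sum(v for (h, a), v in joint.items() if a) / total
    return post_h, post_a

post_h, post_a = posteriors(x=0.001)
print(round(post_h, 3), round(post_a, 3))
```

Rerunning `posteriors` with a different small x gives the same two posteriors, since x appears in both numerator and denominator.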
In a nutshell, the Bayesians, by analyzing the subjective degrees of belief of the scientists and plugging these probabilities into Bayes' theorem, attempt to give an account of when the rejection of the main hypothesis is warranted, and when, instead, an auxiliary premise is what must be rejected. The Bayesians solve the Quine/Duhem paradox by rejecting the second premise, the one which claims that in order to show that a hypothesis is mistaken, it is necessary to isolate that hypothesis from its set of auxiliary hypotheses. As long as we know the prior degrees of belief in the auxiliary premises, we can determine whether they should be rejected.
Although this is the standard view of scientific reasoning about Duhemian problems in the philosophy of statistics, the Bayesian account has certain problems. The most important objection concerns the Bayesians' reliance on the prior degree of belief of the scientist in his or her hypothesis before the hypothesis is tested. Even assuming that scientists have such beliefs, and that these beliefs can be quantified into degrees, it is undesirable that the prior beliefs be taken as central to reasoning in science. Such subjective beliefs are highly variable, changing not only from person to person, but also in the same person from moment to moment. For example, if a scientist's belief varies even slightly during a day, the justification for the acceptance or rejection of a hypothesis will be altered. Something seems not right with this subjectivist account. The problem, I suspect, lies in the conflation of a scientist's confidence in his or her hypothesis with the evidence that it is true. As Deborah Mayo asks in the title of an article critiquing the Bayesian approach, «What's Belief Got to Do with It?» Mayo claims that:
Here Mayo is critiquing the «white glove» treatment given by the Bayesians as to how to solve such problems. Determining whether an anomalous result requires the main hypothesis or some auxiliary hypothesis to be rejected takes more than the subjective degrees of belief of the scientist prior to the experiment. What is required is evidence that the auxiliary is the faulty assumption, and an account of why this evidence isn't mistakenly taken as evidence that the auxiliary is to blame.
Another Approach: Error Statistics
So far we have discussed two approaches to the Quine/Duhem problem, the Kuhnian and the Bayesian accounts. Both approaches have serious problems. Like the Bayesian account, the solution I'll argue for rejects the hypothetico-deductive model of testing and instead gives a probabilistic account of scientific reasoning. This approach takes as its starting point something Deborah Mayo, in Error and the Growth of Experimental Knowledge, calls «error statistics,» which includes everyday concepts in statistics like significance tests and confidence intervals.
On the error statistical account of hypothesis testing:
Data e produced by procedure ET provides good evidence for hypothesis H to the extent that test ET severely passes H with e.
And:
H's passing test ET (with result e) is a severe test of H just to the extent that there is a very low probability that test procedure ET would yield such a passing result, if hypothesis H is false.
Instead of focusing on the probability of the hypothesis itself, the error statistician focuses on the reliability of the test of the hypothesis, particularly the probability that the test would pass H if H were false. If there is a low probability that a test would pass a hypothesis when that hypothesis is false, then passing that test is a good indication of the truth of the hypothesis.
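To make the quantity at issue concrete, here is a toy sketch, not Mayo's own formalism. It assumes a hypothetical test procedure in which H «passes» if at least k of n independent checks come out in its favor, and assumes that if H is false each check still favors H by chance with probability q. The function name and all numbers are illustrative assumptions.

```python
from math import comb

def prob_pass_if_false(n, k, q):
    """P(at least k of n independent checks favor H | H is false),
    where each check favors a false H by chance with probability q."""
    return sum(comb(n, i) * q**i * (1 - q)**(n - i) for i in range(k, n + 1))

# Two hypothetical procedures built from the same ten checks.
lenient = prob_pass_if_false(n=10, k=5, q=0.5)  # pass with 5 or more
strict = prob_pass_if_false(n=10, k=9, q=0.5)   # pass only with 9 or more

# The lower the probability of passing when H is false,
# the more severe the test.
print(lenient, strict)
```

On these made-up numbers a false H would pass the lenient procedure more often than not, so a pass is weak evidence; it would pass the strict procedure only about once in a hundred runs, so a pass there is a far better indication of the truth of H.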
To criticize attempts to explain away anomalous results, the error statistician employs «blocker strategies.» These strategies criticize such attempts on the grounds that (a) such explanations fail to pass severe tests; or, worse, (b) their denials pass severe tests. In addition, an anomalous test result may be legitimately blamed on an auxiliary hypothesis A by showing that Not A passes a severe test.
Consider, again, the example of the hypothesis about coffee and cancer, and the anomalous result that there was no significant difference between the coffee drinkers and non-coffee drinkers. In order to successfully blame the anomalous result on the auxiliary hypothesis that the subjects in the experiment have the same health histories and habits, the scientist would have to severely test the denial of the auxiliary premise. In this case, a severe test must pass the hypothesis that the two groups don't have similar health histories or habits. If there were a strong probability that the two groups were radically different in health habits, then the auxiliary premise should be blamed and not the hypothesis.
Getting Back to the Quine/Duhem Paradox
Mayo seems to claim that the error statistical approach solves the Quine/Duhem problem. This claim, I fear, is not completely correct. The moral of the Quine/Duhem problem is that there is no complete guarantee of when the main hypothesis is to be rejected or when an auxiliary is at fault. And even on Mayo's model of hypothesis testing there is no such guarantee. As is the case with all statistical accounts, there remains the (admittedly remote) possibility that a false hypothesis passes a severe test. This is not likely, but all that is needed to run into trouble with the Quine/Duhem problem is a non-zero probability that a false hypothesis could pass a severe test. This should not be a surprise, given that the Quine/Duhem Paradox is a special case of the skeptical paradox.
Although it was not intended this way, the error statistical account can be turned into a more restricted solution to the paradox. On this solution, the paradox arises because of a faulty conception of evidence. The assumed feature of evidence which leads to paradox is its conclusiveness, that is, that the evidence provides complete proof of the falsity of an empirical hypothesis. No conclusive grounds can be given to reject any empirical hypothesis. What can be given, though, is overwhelmingly good reason to do so. In fact, in the case of some evidence, it would be irrational not to accept this evidence as cause for the rejection of a hypothesis. Notice that this is a restricted solution because the claim is that there can be no new notion of evidential appraisal that provides certain grounds for rejecting an empirical hypothesis. The claim is that an alternative notion of evidential appraisal should be given, one that gives up the idea of complete proof and instead provides indefinitely high probabilities about the truth or falsity of a claim.
Here is a summation of my response to the Quine/Duhem paradox. Unlike the Bayesians and (I believe) error statisticians, I claim that there can be no thoroughgoing solution to the paradox, because to give one would involve showing, with Cartesian certainty, where to lay blame when an experimental result conflicts with a scientific hypothesis. This, I submit, cannot be done. What can be given is an alternative version of what counts as good evidence. And it is here where the error statistical approach is most useful. Moreover, such an account is all that is really needed as an account of where it is best to lay blame.
Application to the Skeptical Paradox
Earlier, I claimed that the Quine/Duhem Paradox is a special case of the more famous skeptical paradox. If this is so then the solution to one paradox will provide a solution to the other. Applying the solution to the skeptical paradox involves giving up the Cartesian ideal of certainty, at least with respect to empirical truths. Knowledge of the truth of the proposition «I live in Brooklyn,» if that is to mean certain knowledge, cannot be attained. What can be attained is overwhelming proof to that effect. A more informal application of error statistics is relevant here. On this account, we can derive overwhelming evidence of the truth or falsity of our beliefs by submitting these beliefs to severe tests as well.
For example, consider my belief that the blue patch on my hand yesterday was caused by touching a rail in the subway that had wet paint on it. In this case, there is a fairly reliable test of whether this in fact is so. I can go to the subway station and check to see if the rail is freshly painted, and if the color matches the color on my hand. The test is reliable because there is a low probability of there being fresh paint of exactly the color on my hand if I acquired the paint stain elsewhere. Yes, it is in principle possible that I got the stain another way; perhaps there's a mailbox on my corner that's just been painted as well. But the probability of this is quite small. Notice the claim is not that there is conclusive proof of where I received the stain, but rather overwhelming proof. And this, generally, is all that is needed of an ordinary notion of proof. So, like the solution to the Quine/Duhem paradox, the solution to the skeptical paradox is a restricted one. There is no successful account of knowledge that assumes that we can have certain knowledge of empirical truths. However, there can be overwhelming evidence of such truths. And this is all that is needed of an account of empirical knowledge.
Conclusion
In sum, the Quine/Duhem Paradox and, by extension, the skeptical paradox have a restricted solution, one that gives up the Cartesian ideal of complete certainty and conclusive proof. Our ordinary notions of proof and knowledge are problematic. However, alternative notions of evidential appraisal and knowledge can be given. These more restricted notions do all that is needed of our ordinary notions.
Appendix: Explanation of Calculations for the Bayesian Solution
It is given that A and H entail e, but not-e is observed. So the probability of observing not-e while A and H are both true is 0. That is, P(not-e | A and H) = 0.
We assigned H a very high prior probability, P(H) = .9, whereas we assigned A a prior probability which makes it only slightly more likely than not, P(A) = .6. We also assumed that H and A are statistically independent. That is, the probability of H does not change the probability of A, or vice versa.
With regard to the assumed likelihoods, the probability of not-e being observed, given that A is true and H is false, is assumed to be a very small number, x (for example, .001). That is, P(not-e | A and not-H) = x. On the other hand, not-e is assumed to be 50 times more likely when not-A is the case, whether or not H is true. So: P(not-e | not-A and not-H) = 50x and P(not-e | not-A and H) = 50x.
We then plug these numbers into a simplified form of Bayes' Theorem:
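$$P(H \mid \text{not-}e) = \frac{P(\text{not-}e \mid H)\,P(H)}{P(\text{not-}e)}$$

$$P(\text{not-}e) = P(\text{not-}e \mid H)\,P(H) + P(\text{not-}e \mid \text{not-}H)\,P(\text{not-}H)$$

$$P(\text{not-}e \mid H) = P(\text{not-}e \mid A \text{ and } H)\,P(A) + P(\text{not-}e \mid \text{not-}A \text{ and } H)\,P(\text{not-}A) = 0 + 50x\,(.4) = 20x$$

$$P(\text{not-}e \mid \text{not-}H) = P(\text{not-}e \mid A \text{ and not-}H)\,P(A) + P(\text{not-}e \mid \text{not-}A \text{ and not-}H)\,P(\text{not-}A) = x\,(.6) + 50x\,(.4) = 20.6x$$

$$P(\text{not-}e) = 20x\,(.9) + 20.6x\,(.1) = 18x + 2.06x = 20.06x$$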
For the posterior probability of H,
$$P(H \mid \text{not-}e) = \frac{20x\,(.9)}{20.06x} \approx .897$$
For the posterior probability of A,
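$$P(A \mid \text{not-}e) = \frac{P(\text{not-}e \mid A)\,P(A)}{P(\text{not-}e)}$$

$$P(\text{not-}e \mid A) = P(\text{not-}e \mid A \text{ and } H)\,P(H) + P(\text{not-}e \mid A \text{ and not-}H)\,P(\text{not-}H) = 0 + x\,(.1) = .1x$$

$$P(A \mid \text{not-}e) = \frac{.1x\,(.6)}{20.06x} = \frac{.06x}{20.06x} \approx .003$$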
Whereas the probability of H is hardly changed, the probability of A plummets.
Works Cited