On The Evidence From Experiments: Part IV

This is a picture from the Wikipedia entry on Treatment. Ain’t it pretty?
Read Part I, II, III.

We are nearly at the point where we can say something about the candidate proposition of interest: “Treatment cures cancer of the albondigas.” It is obviously contingent, therefore it cannot be necessarily true. It may be contingently true or false, or have a non-extreme probability. This depends on which information we condition the proposition on.

First recognize the central limitation of the proposition. If it does cure, whose cancer is it curing? Well, the cancer of the Bobs. What about non-Bobs? We don’t know. So what was the purpose of the experiment? Just that: to learn whether the treatment cured Bobs. About not-Bobs we know nothing, unless we condition on evidence we did not see or on assumptions which are not proved.

This may strike you as odd since usually, or at least as it appears, experiments are meant to provide information about extensions to the experiment. We don’t do real drug trials to say things only about the patients in the trial, but about new possible patients. We don’t test material coming off a production line to say things only about the tested substances (which are commonly destroyed), but about the remainder of the output. And so on.

Repeat: speaking of extensions is not what we’re doing at the moment, though we will come to it. This is emphasized because one must keep these activities separate. For now, consider only the evidence of the experiment itself.

Let’s recall what this could be: (1) Every Bob is cured, (2) no Bob is cured, (3) every treatment Bob is cured and no placebo Bobs are, (4) no treatment Bob is cured and every placebo Bob is. After the experiment, we will have seen one of these outcomes, and when we do we invoke the conscious decision to add the outcome to our list of conditioning evidence. Remember that it is we who decide which evidence/premises to consider in any contingent proposition. Change the evidence/premises, change the probability of the proposition. This principle is key.

Outcome 1. What is the probability of “Treatment cures cancer of the albondigas” given the observation “Every Bob is cured” and given what we know about the experimental setup (all Bobs are identical, the measurement time, etc.)? Part of this evidence is tacit: knowledge of English words and grammar, which can sometimes be ambiguous, but which is crucial in understanding how evidence relates to our proposition of interest.

Now “cures cancer” can mean that the treatment was the active, efficient cause, that somehow (it’s not important here how) the drug interacted with the body in some biochemical manner which eradicated the cancer. Or “cures” could mean the mere presence of the treatment was needed, but that the treatment itself did nothing active. To rework an example from the comments, suppose Bob went into a room and switched on a light. Bob is the cause of the switch being thrown, which in turn caused the light to illuminate. This is different than if Bob threw the switch only because he saw Alice enter. Alice is not an active cause, but her presence was needed (we wouldn’t punish Alice for the switch throwing if, say, such a thing were illegal).

Was the presence of the treatment needed to cure the Bobs? Because all the Bobs are identical and all—both placebo and treatment Bobs—were cured, then if “cure” means “presence needed”, and since we assumed the presence of the placebo, we deduce our proposition is contingently false, or has probability 0. We don’t need the treatment: the placebo is enough. If we gave the placebo to the treatment Bobs we know, via our assumptions, they would have been cured, too. How we don’t know; but that they would be cured, we do.
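The deduction can be written out as a small elimination of cases. This is only a sketch under the stated assumptions (identical Bobs, “cure” read as “presence needed”); the function name and the two-candidate setup are illustrative labels of mine, not anything from the experiment itself.

```python
def presence_needed_probability(placebo_bobs_cured: bool):
    """Probability that the treatment's presence is needed for a cure,
    given only whether the (identical) placebo Bobs were cured.

    Returns 0.0 when the premise rules the proposition out, and None when
    this single premise leaves the question open (no extreme value deduced).
    """
    # Candidate states of the world: True means "no cure without the treatment".
    candidates = (True, False)

    # "Presence needed" plus "a Bob was cured without the treatment" is a
    # contradiction, so that candidate is eliminated by the observation.
    surviving = [
        needed for needed in candidates
        if not (needed and placebo_bobs_cured)
    ]

    if surviving == [False]:
        return 0.0  # contingently false given these premises
    return None     # this premise alone does not settle it


# Outcome 1: every Bob is cured, so the placebo Bobs are cured.
print(presence_needed_probability(placebo_bobs_cured=True))
```

The `None` branch deliberately says nothing about the other outcomes; it only records that this one premise does not force an extreme probability.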

What happens if we remove the evidence of presence of the placebo; that is, if the experiment only consisted of the treatment “arm”? Then all we can say is that the proposition is contingent, that it does not have an extreme probability, that its probability lies between 0 and 1. The observational evidence boils down to “The treatment either cured or it did not”; or put another way, “The Bobs got better on their own or the treatment ‘worked’.” Both sentences are tautologies, thus both are necessary truths, and hence provide no information to the proposition “Treatment cures cancer of the albondigas.” (Adding necessary truths to premises never changes the probability of a proposition.)
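The parenthetical can be checked in one line. As a sketch, write Q for the proposition, E for the premises already in hand, and T for any necessary truth (these letters are labels introduced here for illustration). Because T holds in every case, Q and “Q and T” are the same event, and Pr(T | E) = 1:

```latex
\Pr(Q \mid E\,T)
  = \frac{\Pr(Q\,T \mid E)}{\Pr(T \mid E)}
  = \frac{\Pr(Q \mid E)}{1}
  = \Pr(Q \mid E)
```

So conditioning on a tautology such as “The treatment either cured or it did not” leaves the probability of Q exactly where it was.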

History assures there is more than one way to skin a cat. Perhaps there is more than one way to cure cancer of the albondigas. The placebo might have cured one way and the treatment another. That is, it could be the presence of the placebo was needed to cure the cancer, or the placebo was active in its cure, but in a different way than the treatment. (We only assume the placebo is not itself active, of course.) Perhaps the cancer would have been cured by eliminating biochemical A or B and placebos knocked out A, and the treatment B (or whatever).

Now if we assume the treatment interacted with the Bobs’ bodies differently than the placebos, for which there is no proof, then we open the possibility the treatment actively cured. But we must also assume that not only did the treatment interact differently, but that the interaction was such that it allowed the treatment to “do its thing.” And if we assume that, then we are arguing in a circle (with regard to our proposition).

We could go on with these imaginings indefinitely, but really we are left with nothing to say about the proposition, where cure means “active”, except that it is contingent, that it does not have an extreme probability, which is not saying much.


  1. Briggs


    I was middle school kid when Bigfoot was really Big. There were movies, TV shows, stuff in the paper. We lived in Northern Michigan and my parents used to make me go down a long hallway to the bedroom and practice my clarinet. The bedroom had big windows that opened to the woods where, of course, the hairy gentleman lived. I sat right in the doorway with my music stand blocking the passage. Turned out to be a successful avoidance strategy.

    Remind me to also tell you of the Ocqueoc Monster (he hated red).

  2. Francsois

    Briggs, you might not believe it, but many of the things you mentioned above are well known to the clinical trialist of today. They know that the RCT is far from perfect, but that currently it is likely to provide the best “guess” as to whether a treatment works or not. Again, in the real world decisions have to be made, even on imperfect “evidence”.



  3. Briggs


    Oh? They know the RCT is the “best ‘guess’”? And how do they know that, pray? Don’t skimp on details in your answer.

    I might be wrong, but you appear anxious for a “formula”, a kind of “When the data is like this, do that” thing. Something where you don’t have to think too deeply. Well, you know where that kind of thing can lead. Stick around. Your patience, if you put in the effort, will be rewarded. Not everything there is to know about a new (to you) philosophy can be put in a few hundred words.

  4. Rich

    So far I’m reading this as, “When you rush to a conclusion these are the bits you miss. And they count”.

  5. Francois


    You say: “…you appear anxious for a “formula”, a kind of “When the data is like this, do that…”. Yes, that is exactly what I am looking for. Scientists and mathematicians have given us many useful gifts. An example would be the computer, given to us by Turing and von Neumann and their buddies. Decades ago you had to be very clever indeed to use computers, but now many people can program them. Statisticians should make statistics similarly easy to use for guys like me, so we can get the answers we need to treat our patients. That so many researchers bugger it up is partly the fault of statisticians failing to provide us with the right tools to do the job properly. Remember, statisticians cannot do clinical research by themselves; one needs the right clinical questions asked, the experiment design guided by sensible clinical considerations, and the answers framed by the clinical context. I get tired of statisticians being smug about others buggering up statistical analysis. Of course the busy cardiologist is not an expert statistician; he spent half his adult life studying cardiology (or whatever)!
    About your challenge to explain why an RCT is the best guess, I can only regurgitate what I learnt, which seemed to make sense to me when I learnt it. But I would not want to copy and paste from Wikipedia, so I will spare you the detail. Briggs, do you believe that a well conducted RCT can give (or is likely to give) a good answer to an important healthcare question? A straight answer please. Thanks for the posting, does stimulate my thinking (and I am now confused).



  6. Scotian

    I believe Briggs that we have two different approaches to explanation. At the risk of incurring your wrath I will describe them as follows. I tend to the short, concise, and fully packed. If you don’t understand it at first, read it again. Of course this reflects my limited writing skills and there is the very real chance of never understanding what I am trying to say. I believe that your approach tends to the lengthy. The matter is explained in detail so that no stone is left unturned but clarity is purchased at the price of dilution. If this trend were extended, at some point every contingency would be covered but the content per sentence would be close to zero and rereading would be out of the question. We (I) need a summery for policy makers. :}

  7. Briggs



    Describe for me, in 750 words, the philosophy of quantum mechanics. Make sure you leave no question unanswered, and do not allow the chance that somebody may misinterpret what you say.


    Good thing you resisted the temptation to quote. It would have been wrong. Go to the Classic Posts page and look up “random” and search for those articles which show “randomized” trials are not needed (a well known argument, to us insiders, anyway). It’s control which is important. And control, ultimate control, is what we have with the Bobs.

    As we have seen, even with ultimate control, much ambiguity remains. Why? What can we know? Well, stick around.

  8. Briggs


    I once sat in on a proof that took three weeks (well, we only met for two hours each week). Course on “long memory”. “Students” were three grad students (me and two other guys) and three to four professors. One of the profs, midway through the proof, was reduced to crying, “You’re killing us!”

    Not everything is easy.

  9. Nullius in Verba

    “We (I) need a summary for policy makers. :}”

    Shorter Briggs: you can always find unreasonable nits to pick in any theory; here I do so for hypothesis testing.

    “Describe for me, in 750 words, the philosophy of quantum mechanics.”

    Everything that can happen, does happen.

    Six words. 🙂

    ” “randomized” trials are not needed (a well known argument, to us insiders, anyway). It’s control which is important.”

    It’s independence which is important. The purpose in both randomization and control is to guarantee that any extraneous factors are statistically independent of the factor being studied.

  10. Briggs


    Yet everything that can happen, doesn’t happen.

    And “independence” is not important (if the word is taken in its technical sense). As the press would say, developing…

  11. Nullius in Verba

    Oh yes it does! Even if you can’t see it.

    “Important” isn’t the same thing as “essential”, of course. But I shall, as the Press would say, await developments… 🙂

  12. Scotian

    Actually NIV & Briggs a better description of QM is:

    Everything that is not allowed is forbidden.

    Also thanks for correcting my spelling. I like to put one in every post in order to avoid the jealousy of the Gods.

  13. Nullius in Verba

    It’s also an allusion to T.H. White’s ‘The Once and Future King’: “Everything not forbidden is compulsory”, supposedly written above the entrance to an ants’ nest, as the motto of a rigid, rule-bound, totalitarian society.

    But that’s about what ought to happen, not what does happen.

  14. JH


    … statisticians being smug about others buggering up statistical analysis.

    How did you get such an impression? I do understand your frustration.
    If you are in an academic environment, take a statistics class, or see a statistics professor or someone in the statistical consulting center! Some of us like to explain statistics to others and to learn from experts in different areas. What’s more rewarding than being able to learn from experts from other areas? Who gets the chance to do that? No philosophy. Though we do want to know what it means to say whether a treatment can cure a cancer for non-philosophical reasons. It’s a different question than whether a treatment can work. Well-designed clinical trials can be good at assessing the efficacy of a treatment, i.e., whether it can work. There are differences among the concepts of efficacy, effectiveness and safety of drugs. (You probably already know this.)

  15. Nullius in Verba


    Problem is, they learned how to do it in a statistics class, where somebody explained what to do but not why. Then when they do it just as they’ve been told, they get statisticians popping up and saying: ‘Technically, that’s not valid in these circumstances because [abstruse and incomprehensible reasons]. You should leave the thinking to the statisticians.’

    I did it myself the other day. We were talking about the IPCC’s confidence bounds based on a trend+AR(1) model, and how the confidence interval would contain zero if you chose another model, when some guy turned up to complain, saying he’d just calculated the confidence intervals and they didn’t contain zero.

    I had to explain to him that he’d just used straight OLS, which implicitly assumes Gaussian zero-mean independent, homoscedastic additive errors, blah, blah, blah…, and the ‘errors’ in climate data weren’t. But I didn’t blame him, I blamed his teachers.

    So a lot of practitioners have become aware that there are limits to the methods they’ve been taught, and are anxious to know more about them so as to avoid career-wrecking embarrassment, or worse. But on this occasion I think Briggs is just enjoying a rant about some rather difficult philosophical issues (the problem of induction, the definition of causality) that don’t really have any serious import for the practitioner – or at least, I haven’t seen anything yet that does – and Francois is irritated at the implied criticism of his profession. Yes, there are ways RCT can go wrong, it’s not infallible – but so what? It’s good enough for our purposes. We’re not pretending to be the Pope.

  16. Briggs


    I can’t be pope. I don’t even own a mitre.

    But those curious things about causality, induction and all that (I hope to show) have direct and serious bearing on what’s good enough and what isn’t. In any case, Briggs isn’t the only one who cares about these things. The pope does, too!

    RCTs just aren’t right. That “randomness” bit is silly, as has often been validly argued. What matters is control. That’s why experiments “work”, when they do. Control, control, control. But even then, even with perfect control, there is ambiguity. So that’s our base. We can build on that all kinds of things, some of which turn out to be the same as in practice now, others radically different.

    And, of course, as regular readers know, even and often in practice RCTs are horrible in the sense of what evidence people think they see in them. In real life what’s used is far, far from “good enough.”

  17. Nullius in Verba

    ” That “randomness” bit is silly, as has often been validly argued.”

    I had a look and I didn’t think so. You started from the premise that the reason for using randomization was to get a roughly equal distribution of characteristics between the groups, and then showed that that didn’t work, and therefore the justification was bogus. Which is all very fine, except that I don’t consider that to be the reason for using randomization.

    The problem an RCT is trying to get over is that correlation doesn’t imply causation, because you can either have the causal arrow pointing the other way (outcome causes treatment) or arrows pointing to both from a common cause, and the correlation looks the same. The point of RCT is to make this impossible by making sure you *know* what caused the treatment, and that it was causally independent of the outcome or anything else. A control does this by making sure the *experimenter* is the cause. And because even experimenters can be influenced by other factors, the experimenter uses a random number source *known to be causally independent* of any potential confounders – like dice. Note that it is not simply *randomness* that is required, or you could use the order in which the people come in the door or their race and sex as your randomness generator. It relies on our belief that nothing about the experimental subjects could possibly influence the throw of the dice, so any correlation *cannot* be explained by the outcome or the confounders affecting the dice.

    Then you can use straightforward probability theory to calculate what you should see and how much variation you should expect under each hypothesis, which takes into account all those imbalances you mentioned. Such calculations are far easier if you can assume perfect independence, and only possible if you can bound the dependence, which is tricky if there are potentially unknown unknowns.

    For this argument to work, it doesn’t matter if the experimenter doesn’t have control over the input, so long as whatever *does* have control is known, and is known to be causally independent. Nor is it simply a matter of having a ‘control group’ of the untreated to compare against, or you could simply ask the subjects whether they had already taken the treatment, and divide them accordingly.

    I don’t really see this as a disagreement, since you said more or less the same thing in your footnote to the randomization post – preventing cheating is a good reason to keep randomization. It’s just that I see that as the intended purpose, rather than an incidental benefit.

  18. Briggs


    Nah, that doesn’t work. Control, not “randomization”, is how you know what caused what, as our examples with the Bobs show. “Random” just means “unknown.” You can’t find certainty by adding uncertainty.

    I do say to keep “randomization” in some instances, but not because of any mystical “randomness” gifted to the experiment, but because (some) people are cheats and liars and can’t be trusted. What I mean by “randomization” is to keep certain aspects of control out of the hands of experimenters. It would save them from fooling themselves.

    But that means control can be put in the hands of others not interested in the results. That doesn’t mean that that group would shake voodoo sticks (generate “random” numbers) but would ensure the groups are properly controlled, to the best of their ability.

    And even when control is perfect etc., etc.

  19. JH


    I don’t know if Francois is irritated at the implied criticism of his profession. However, I do want Francois to know that not all statisticians behave the way he described. He seems to understand Briggs’ points, and he also wants to learn about statistical tools and something more that would allow him to sort through data. Which is a good thing!


    “Random” just means “unknown,” however, “randomization” doesn’t mean “unknown.”

  20. Briggs


    Amen, sister. “Randomization” means “creating the unknown.”

  21. JH

    Yet, I see randomization as to keep certain aspects of control in the hands of experimenters.

  22. JH

    Creating unknown? Explain!

  23. JH

    Specifically, how do you create unknown using randomization?

  24. JH

    And how can “control” let you know what causes what? You wrote “to ensure the groups are properly controlled.” There is a so-called “control group” in a scientific experiment. So what does “control” mean exactly?

  25. Francsois


    I remember your quote the other day by David Stove where he laments the apparent fact that his writings seem to have no impact. You had a similar complaint. Now remember, most people are a bit slow or wilfully ignorant (or both, including myself). They are not going to read Stove’s attack on Popper, and they are not going to read Briggs’ criticism of RCTs. So writing obscure books or posting stuff that is hard to follow on your blog will not change the world. So how would one change things if you insist on writing as a way to get changes going? By writing a book that shows people what they should be doing, in a practically useful way. A great example of such a book is the well known book on logistic regression by Hosmer and Lemeshow http://www.amazon.com/Applied-Logistic-Regression-Probability-Statistics/dp/0470582472 I refer to the book to illustrate my point; I am sure Briggs would disagree with all that is written in it.

    JH, thanks for your points raised. There are a lot of self-righteous statisticians around, and quite a few that are not. If you believe that we should strive to live a good and useful life, as Marcus Aurelius and his friends suggest we should, then you should strive to fix things if they are broken. As I see it Briggs has his job cut out for himself; he has an obligation to fix the stuff he moans about.



  26. Briggs


    “They are not going to read…and they are not going to read…”

    Well, you have me there.

    (But you did download that free book I mentioned when you made a similar comment a couple of weeks ago, right?)

    New book is in the works.

  27. Francsois

    I bought your book as a hardcopy, I liked it. Wish it had answers to homework questions. Looking forward to your new book, will buy it as soon as it is available.



  28. Random Gator

    I feel as if you should have continued with a part V and perhaps even a part VI.

    Please discuss case 3 where all treatment Bobs are cured and no placebo Bobs are cured. Does our treatment cure cancer of the albondigas given this scenario (i.e., is the proposition that our treatment cures cancer true with these premises)?

    Please also discuss the following scenario: We don’t have a supply of identical Bobs so we use ordinary non-identical people all with cancer of the albondigas and randomize them to two groups: treatment and placebo. Then suppose a certain percentage of treatment group are cured and a lower percentage of placebo group are cured. Is there anything useful we can conclude from this outcome? Why is it not useful to supply statistical tests to decide if the difference in response rate is “statistically significant”?
