From a colleague (who wishes to remain anonymous) comes this poser:

Suppose that we have a box containing three objects: one red ball and two identical blue cubes. One object is randomly selected from the box, but we do not get to see or touch it. The object is put into a (low-quality) color-measuring device, which tells us that there is a 50% chance that the object is red and a 50% chance that the object is blue. What is the chance that the object is a ball?

I’ll give you some hints.

1. The words “randomly selected” are equivalent to just “selected”. This change allows you to deduce the chance of pulling out any object. I.e., since there is no information about *how* the objects are selected, particularly in how the objects’ characteristics (color, shape, size) might influence this selection, we are not justified in including any of this in our premises and thus we can say with what probability any object is selected.

2. The main question is “What is the chance that the object is a ball?” but since the ball is red and the cubes blue, and the measurement device has no premises about the shape of the object, we can rephrase the main question to, “What is the chance that the object is red?” Can that be answered?

Busy day here at Briggsville, so this is as much help as you get.

**Update** Forgot to say: I’ll post my solution tomorrow. Easy answer, difficult derivation.

**Solution**

All probability is conditional (as are all statements of logic). We want

(1) Pr(R | Machine says probability of R = 1/2, E)

where E are the details in the paragraph which were given to us, R means red, B means blue, and “Machine says probability of R = 1/2” is the information that we have. This is a standard measurement error problem. Shorten the latter to MSPRH, since it’s easier to write. Now (1) is

(1′) Pr(R | MSPRH, E) = Pr(R, MSPRH | E) / Pr(MSPRH | E)

and that equals

(2) Pr(MSPRH| R, E) x Pr(R | E) / Pr(MSPRH | E)

and we expand the denominator

(3) Pr(MSPRH| R, E) x Pr(R | E) / [Pr(MSPRH| R, E) x Pr(R | E) + Pr(MSPRH| B, E) x Pr(B | E)]

Because of course Pr(MSPRH | E) = Pr(MSPRH , R| E) + Pr(MSPRH , B| E), etc.

Now

(4) Pr(MSPR = 1/2 | R, E) = 1 – Pr(MSPR n.e. 1/2 | R, E)

where “n.e.” means not equal. And this may be expanded

(4′) Pr(MSPR = 1/2 | R, E) = 1 – Pr(MSPR = p_{1} | R, E) – Pr(MSPR = p_{2} | R, E) – … – Pr(MSPR = p_{n} | R, E)

where every term on the r.h.s. except “MSPR = 1/2” appears. The p_{i}, i = 1, 2,…,n are just those values which the machine can display. These are, of course, discrete and finite (imagine the dial or print out). But we can even soften that requirement below.

We are not given any information about the machine’s capabilities, *except* that it can at least spit out “the probability of R = 1/2”. We inferred from this that it could have said, “the probability of R = p_{i}.” And from this, since we have no other information, and because of the symmetry of individual constants, we can say that

(5) Pr(MSPR = p_{i} | R, E) = Pr(MSPR = p_{j} | R, E)

So this gives us

(6) Pr(MSPR = 1/2 | R, E) = 1 – (n-1)/n = 1/n

since there are n different values the machine may display. Plug these into (3) and we get

(7) [ (1/n) x (1/3) ]/ [(1/n) x (1/3) + (1/n) x (2/3)]

since Pr(R | E) = 1/3 and Pr(B | E) = 2/3, via E. If n is finite, then (7) = 1/3; or if we take (7) to the limit in n, it is still equal to 1/3. This also assumes the same argument holds for B in (5); that is:

(8) Pr(MSPR = p_{i} | B, E) = Pr(MSPR = p_{j} | B, E).
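As a quick check of (7), here is a minimal sketch using exact fractions. The function name `posterior_red` and the sample values of n are illustrative assumptions; the priors 1/3 and 2/3 come from E, and the equal likelihood 1/n for both colours is the assumption made in (5) and (8).

```python
from fractions import Fraction


def posterior_red(n: int) -> Fraction:
    """Evaluate (7), assuming each of the machine's n readings is equally likely."""
    prior_red = Fraction(1, 3)   # one red ball among three objects
    prior_blue = Fraction(2, 3)  # two blue cubes among three objects
    likelihood = Fraction(1, n)  # Pr(MSPR = 1/2 | colour, E), same for R and B
    numerator = likelihood * prior_red
    denominator = likelihood * prior_red + likelihood * prior_blue
    return numerator / denominator


# The 1/n factor cancels, so the result does not depend on n.
for n in (2, 10, 1000):
    print(n, posterior_red(n))
```

Because the likelihood is the same for R and B, it cancels from numerator and denominator, which is why the answer stays at the prior.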

The other way to go at this is to think that the machine *always* spits out the *same* probability when confronted with an R or B object. That is

(9) Pr(MSPR = 1/2 | R, E) == p_{R} = (1, 0)

where “==” means equivalent, and

(10) Pr(MSPR = 1/2 | B, E) == p_{B} = (0, 1)

and p_{R} and p_{B} are *fixed* at those values, not variable, then (7) becomes

(7′) [ p_{R} x (1/3) ]/ [p_{R} x (1/3) + p_{B} x (2/3)]

which may be re-written as

(11) 1/ [1 + 2 x p_{B}/p_{R} ]

Now, we *deduce* that p_{R} = p_{B} = 0 cannot happen simultaneously. After all, the machine did say “MSPR = 1/2” and the object has to be R or B. And that means Pr(R | MSPR = 1/2, E) = 0, 1, or 1/3, depending on whether p_{R} = 0 & p_{B} = 1; p_{R} = 1 & p_{B} = 0; or p_{R} = 1 & p_{B} = 1, respectively.
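The three admissible cases can be tabulated with a short sketch that evaluates (7′) directly (rather than (11), which divides by p_{R}). The function name `posterior_red_fixed` is a hypothetical helper, and p_{R} and p_{B} are restricted to 0 or 1 as in (9) and (10).

```python
from fractions import Fraction


def posterior_red_fixed(p_r: int, p_b: int) -> Fraction:
    """Evaluate (7') for fixed machine behaviours p_r and p_b (each 0 or 1)."""
    numerator = Fraction(p_r, 3)
    denominator = Fraction(p_r, 3) + Fraction(2 * p_b, 3)
    if denominator == 0:
        # p_R = p_B = 0 would mean the observed reading was impossible.
        raise ValueError("p_R = p_B = 0 contradicts the observed reading")
    return numerator / denominator


for p_r, p_b in [(0, 1), (1, 0), (1, 1)]:
    print(p_r, p_b, posterior_red_fixed(p_r, p_b))
```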

The only information we have on p_{R} and p_{B} is that they may be 0 or 1, so the probability they take those values (given this information and via the statistical syllogism) is 1/2 each. Thus Pr(R | MSPR = 1/2, E) = 0 x (1/4) + 1 x (1/4) + (1/3) x (1/4) = 1/4 + 1/12 = 4/12 = 1/3; where the (1/4) comes from Pr(p_{R} = i, p_{B} = j | E) = Pr(p_{R} = i | p_{B} = j, E) x Pr(p_{B} = j | E), where i,j = 0,1. If i = j = 0 then Pr(p_{R} = i | p_{B} = j, E) = 0. Otherwise, Pr(p_{R} = i | p_{B} = j, E) = Pr(p_{R} = i | E) since we have no information that knowing p_{B} = j tells us anything about p_{R} = i.

So the answer is 1/3, no matter how you slice it. To do better, we need more information about the vicissitudes of the machine.

1/3. The measuring device output contains no information.

This looks like the inverse sleeping beauty problem. What if the colour-ometer says that the probability of red is 1/3 (blue is 2/3)? Is this different information than 1/2 (1/2)? What if it says with certainty that it is red? All information has to be factored in.

@Rich: It seems to me the only possible wrinkle to your logic would be if the color-measuring device and the selection are somehow coupled, and so the problem statement is that the measuring device has an even chance of registering “red” or “blue” *given the contents of the box*. But it doesn’t say that, so, as you have observed, what’s the point of the measuring device?

Perhaps I’m being dim. Are we looking at one run of this system, in which the machine has told us “50% chance that this object is red”? Or is this a machine that is right 50% of the time, and an object has yet to be selected?

If the first case then: if the colour machine says that it’s 50% likely that it’s red, surely it makes no difference how the object in it was selected, or from what pool? 50% chance the object is red (and red=ball). (If the colour machine were 100% accurate and said “red”, then the object is a ball, who cares how it was selected?)

If the second case, and we’re interested in the overall probability that a randomly selected object placed in this machine is a ball, then the possibilities are easy since the machine cancels itself out.

red ball (1/3) machine right (1/2) = 1/6

red ball (1/3) machine wrong (1/2) = 1/6

blue cube (2/3) machine right (1/2) = 2/6

blue cube (2/3) machine wrong (1/2) = 2/6

I make that two sixths red ball and four sixths blue cube. Object is a red ball 1/3 of the time.
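The four-row tally above can be reproduced with a small sketch. The dictionary names `priors` and `machine` are illustrative, and the 1/2 "right"/"wrong" split is this comment's reading of the machine, not something the problem statement fixes. Note that `Fraction` prints 2/6 in lowest terms as 1/3.

```python
from fractions import Fraction
from itertools import product

# Chance of drawing each kind of object from the box.
priors = {"red ball": Fraction(1, 3), "blue cube": Fraction(2, 3)}
# Assumed behaviour: the machine is right half the time, whatever the object.
machine = {"right": Fraction(1, 2), "wrong": Fraction(1, 2)}

totals = {obj: Fraction(0) for obj in priors}
for obj, outcome in product(priors, machine):
    joint = priors[obj] * machine[outcome]
    totals[obj] += joint
    print(obj, "machine", outcome, joint)

# Summing over the machine's outcomes recovers the priors.
print(totals)
```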

The way the problem is worded means we cannot include information from the color sensor. Maybe there were red cubes but they have all been drawn. Maybe there have always been more blue cubes than red balls, and so the color sensor is highly tuned to red. Who knows? None of us, that’s for sure.

Here is my best guess:

P(Red Ball) = 1/3

P(Red Ball | Sensor) = ?

The answer is: We cannot answer the question.

We can only answer the question about the probability of a red ball being drawn before it is placed into the sensor.

I’m still groggy from the weekend’s festivities (and St. Paddy’s is next week!) but:

Let T=test (the machine), R=red, B=blue

and P(T|R)=chance the test will correctly ID red

and P(T|R)=chance the test will correctly ID blue

solve: P(R|T) = P(T|R)P(R)/P(T)

P(T|R)P(R) = 0.5 / 3 = 1/6

P(T) = P(T|R)P(R) + P(T|B)P(B) = (0.5/3) + (0.5 * 2 / 3) = 1/6 + 2/3 = 5/6

P(R|T) = P(T|R)P(R)/P(T) = ( 1/6 ) / ( 5/6 ) = 1/5 = 0.2

Need to check this. Doesn’t look right.

OK, correcting one math mistake I found (I’m ignoring the typos):

Let T=test (the machine), R=red, B=blue

and P(T|R)=chance the test will correctly ID red

and P(T|R)=chance the test will correctly ID blue

solve: P(R|T) = P(T|R)P(R)/P(T)

P(T|R)P(R) = 0.5 / 3 = 1/6

P(T) = P(T|R)P(R) + P(T|B)P(B) = (0.5/3) + (0.5 * 2 / 3) = 1/6 + 1/3 = 3/6

P(R|T) = P(T|R)P(R)/P(T) = ( 1/6 ) / ( 3/6 ) = 1/3

It doesn’t seem like we know enough about this machine to answer the question. Suppose that the machine always correctly determines that red objects are red. In other words, suppose that whenever you put a red object in it, the machine says “the object is red.” Suppose that when you put in a blue object, the machine always says “there is a 50% chance that the object is red and a 50% chance that the object is blue.” Then the probability that the chosen object in the puzzle is a ball is zero.
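A minimal sketch of this counterexample, next to the equal-likelihood case, is below. The function name `posterior_ball` and its two parameters (the chance of the "50/50" reading given a red or a blue object) are hypothetical, introduced only for illustration.

```python
from fractions import Fraction


def posterior_ball(like_red: Fraction, like_blue: Fraction) -> Fraction:
    """Bayes' rule for Pr(ball | machine shows '50/50').

    like_red / like_blue: assumed chance of the '50/50' reading
    when the object is red / blue.
    """
    prior_red, prior_blue = Fraction(1, 3), Fraction(2, 3)
    evidence = like_red * prior_red + like_blue * prior_blue
    return like_red * prior_red / evidence


# Machine never shows '50/50' for red objects, always shows it for blue ones.
print(posterior_ball(Fraction(0), Fraction(1)))

# Machine shows '50/50' equally often for both colours.
print(posterior_ball(Fraction(1, 2), Fraction(1, 2)))
```

The first call gives 0 and the second gives 1/3, so the answer depends entirely on what is assumed about the machine.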

I like SteveBrooklineMA’s answer best.

But I have to assume that the machine is unbiased, that is, over the long run, when it tells us that the probability of the ball being red is 50%, that the ball is in fact red half the time.

If the machine is unbiased, then the probability that the ball is red is 50% and since the red object is always a ball and the blue one always a cube, the probability that the object is a ball is also 50%.

Sure, if we assume that “(5) Pr(MSPR = pi | R, E) = Pr(MSPR = pj | R, E)” then the whole bit about the machine in the problem is pretty pointless.

So, the argument goes that if the machine shows 50% for one red ball, it shows 50% for every red ball. It has to then show 50% for every blue cube as well. Otherwise a reading of 50% red would be a sure sign that it is in fact red.

This seems to be the implicit assumption above as the machine tells us nothing about the color of the ball and the likelihood of it being red is the same as the likelihood of pulling a red from the urn in the first place.

I have a few problems with the logic in Briggs’ solution.

5) is a leap that cannot be made

6) Makes the assumption that P(MSPR=1/2)=P(MSPR=pi)

Between 5 and 6 Briggs has made the assumption that the machine is a random number generator, and has no skill whatsoever in indicating the color of the objects inside. At which point the rest of the math becomes unnecessary.

Too much set theory. I think that you are assuming facts not in evidence. We do not know how the shapes of the objects affect selection. Therefore we do not know the initial probability of selecting red (PR) or blue (PB). We also do not know the fraction of the time that the colour-ometer says red for a red ball (x) or red for a blue cube (y). We can assume that it says blue for a red ball with the fraction (1-x) and blue for the blue cube with the fraction (1-y) but this adds no new information. The information that we have is:

PR*x + PB*y = 1/2 and PR + PB = 1

x and y can be almost anything within the range 0 to 1. Even if we assume that x = y = 1/2 there is still only one equation with two unknowns. Therefore there is not enough information to solve for PR, the sought after parameter.
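To illustrate the underdetermination, this sketch checks several (PR, x, y) triples against the constraint PR*x + PB*y = 1/2 with PB = 1 - PR. The function name `satisfies` and the candidate values are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def satisfies(pr: F, x: F, y: F) -> bool:
    """Check PR*x + PB*y == 1/2, with PB = 1 - PR."""
    pb = 1 - pr
    return pr * x + pb * y == F(1, 2)


# Each tuple is (PR, x, y); these are illustrative picks, not derived values.
candidates = [
    (F(1, 3), F(1, 2), F(1, 2)),
    (F(2, 3), F(1, 2), F(1, 2)),
    (F(1, 3), F(1), F(1, 4)),
    (F(1, 2), F(1), F(0)),
]
for pr, x, y in candidates:
    print(pr, x, y, satisfies(pr, x, y))
```

Every triple passes, including different values of PR, which is the point being made: the constraint alone does not pin PR down.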

I guess I will have to chalk this up to one of those things whose subtlety will forever be out of my intellectual grasp. My simple mind says:

You choose an object that’s either red or blue (no other color). You put it in a machine that says it’s red, with a probability of 0.5 or it’s blue, with a probability of 0.5 (the unspoken assumption is that the output of the machine and the process of selecting the object are independent). That’s the same as saying “it’s either red, or it’s not red, with equally divided probability”. Big help that machine is, then, given that we have to distinguish between 2 colors — it adds no information whatsoever. Your prior is P(red) = 1/3. The machine does nothing to change your belief (is that the Bayesian-correct manner of expression?). Therefore, P(red| machine output) = P(red | no machine output) = 1/3.

I *think* that pretty much sums up Dr. Briggs’ analysis anyway, although I did get a little confused about the time MSPRH seemed to morph into MSPR=1/2, and wondered if something significant had gone on that should have been apparent to me, but wasn’t.

*Sigh* I guess that’s what it takes to be a successful professional statistician: you have to be smart enough to see why this isn’t obvious. Dumb schlubs like me will forever flail away as hapless amateurs.

Well, a machine that gives you a probability measure for an attribute is an unusual machine. For example, one that told you just the color but was expected to be correct only 50% of the time would give you some additional information which could be weighted with a prior.

Make that > 50% of the time and it’s correct.
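Here is a rough sketch of that weighting for a machine that simply reports a colour. The function name `posterior_red_given_says_red` is hypothetical, and it assumes the machine is right with the same chance `accuracy` for red and for blue objects, which the puzzle does not state.

```python
from fractions import Fraction


def posterior_red_given_says_red(accuracy: Fraction) -> Fraction:
    """Pr(red | machine says 'red') for a machine that is right with chance `accuracy`."""
    prior_red = Fraction(1, 3)
    prior_blue = Fraction(2, 3)
    says_red_and_red = accuracy * prior_red          # correct call on a red object
    says_red_and_blue = (1 - accuracy) * prior_blue  # wrong call on a blue object
    return says_red_and_red / (says_red_and_red + says_red_and_blue)


for accuracy in (Fraction(1, 2), Fraction(9, 10), Fraction(1)):
    print(accuracy, posterior_red_given_says_red(accuracy))
```

At an accuracy of exactly 1/2 the result is the prior, 1/3; above 1/2 the reading moves the probability toward red.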