Decision Calculator

This is just a rough prototype meant to be easy to play with inside a post. READ the help and guidebook! Suggestions for new canned examples welcome—the hard part is deriving historical performance data.

Rules

  1. Read the Decision Calculator guidebook below!
  2. Fill in the Performance Table, or click on one of the predefined examples.
  3. Fill in the Cost Comparison Table, or click on one of the predefined examples. You do not need to calculate the total: that’s done automatically.
  4. Click Calculate (or Reset between examples).
  5. Accuracy comparison rates are given between the Expert System and the Naive Guess.
  6. Cost results are found in the Expected Cost Comparison Table.
  7. Finally, a solution saying which option you should choose is given. Skill should be > 0!
  8. Important: Use this software at your own risk. No warranties of any kind are given or implied. Always consult a competent medical professional.
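The arithmetic behind the rules above can be sketched in a few lines of Python. This is only a rough illustration, not the calculator's actual code: the function name `decision_summary`, the made-up counts, and the cost scores are all hypothetical. Two assumptions are worth flagging loudly: the "optimal naive guess" is taken to be whichever constant answer (always "present" or always "absent") has the lower expected cost, and the skill score is taken to be the relative cost saving of the test over that naive guess.

```python
def decision_summary(tp, fp, fn, tn, cost_fp, cost_fn):
    """Compare a test (the 'expert system') with the best naive guess.

    tp, fp, fn, tn: counts from the historical performance table.
    cost_fp, cost_fn: total cost score of one false positive / false negative.
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("the performance table is empty")

    # Expert system: it errs only in the two off-diagonal cells.
    expert = {
        "accuracy": (tp + tn) / n,
        "fp_cost": cost_fp * fp / n,
        "fn_cost": cost_fn * fn / n,
    }
    expert["total"] = expert["fp_cost"] + expert["fn_cost"]

    # Naive guesses ignore the test and always give the same answer.
    always_absent = {
        "guess": "always say absent",
        "accuracy": (fp + tn) / n,
        "fp_cost": 0.0,
        "fn_cost": cost_fn * (tp + fn) / n,
    }
    always_present = {
        "guess": "always say present",
        "accuracy": (tp + fn) / n,
        "fp_cost": cost_fp * (fp + tn) / n,
        "fn_cost": 0.0,
    }
    for guess in (always_absent, always_present):
        guess["total"] = guess["fp_cost"] + guess["fn_cost"]

    # ASSUMPTION: the optimal naive guess is the cheaper one on average.
    naive = min(always_absent, always_present, key=lambda g: g["total"])

    # ASSUMPTION: skill is the relative cost saving over the naive guess.
    skill = None if naive["total"] == 0 else 1 - expert["total"] / naive["total"]
    return {"expert": expert, "naive": naive, "skill": skill}


# Made-up counts for 1,000 screened people; a false negative is scored 20x worse.
result = decision_summary(tp=7, fp=70, fn=1, tn=922, cost_fp=1, cost_fn=20)
print(result["naive"]["guess"])
print(round(result["expert"]["accuracy"], 3), round(result["naive"]["accuracy"], 3))
print(round(result["skill"], 4))
```

With these invented numbers the test is less accurate than the naive guess yet still has positive skill, because the errors it avoids are the expensive ones. Dropping `cost_fn` to 5 makes the skill negative, which under these assumptions would mean the naive guess is the better choice.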

Preset examples:

See below.

 

1. Fill in the Historical Performance Table.

Present Absent
Test +
Test –

 

2. Fill in the Cost Comparison Table.

False Positive Costs Score False Negative Costs Score
Total: Total:

 

3. Click calculate (or Reset between examples).


 

4. The Optimal Naive Guess is to:

 

5. Accuracy (%) Comparison Table

Test Accuracy
Expert System
Naive Guess

 

6. Expected Costs Comparison Table

Test Expected False Positive Cost Expected False Negative Cost Total
Expert System
Naive Guess

 

7. The skill score is:


It should be greater than 0 for a skillful test!

 

8. The solution:

 

GUIDEBOOK

This article provides an introduction and a step-by-step guide to making good decisions in particular situations. These techniques are invaluable whether you are an individual or a business.

These results hold for all manner of examples—from deciding whether to have a PSA test or mammography, to get a vaccine, to finding a good stock broker or movie reviewer, to situations that require intense statistical modeling, to financial forecasts, to lie detector usefulness. Any situation that has a dichotomous outcome can use these techniques.

Many people opt for precautionary medical tests—frequently because a television commercial or magazine article scares them into it. What people don’t realize is that these tests have hidden costs. These costs are there because tests are never 100% accurate. So how can you tell when you should take a test?

When is it worth it?

Under what circumstances is it best for you to receive a medical test? When you “Just want to be safe”? When you feel, “Why not? What’s the harm?”

In fact, these are not good reasons to undergo a medical test. You should only take a test if you know that it’s going to give you useful information. You want to know the test performs well and that it makes few mistakes, mistakes which could end up costing you emotionally, financially, and even physically.

Let’s illustrate this by taking the example of a healthy woman deciding whether or not to have a mammogram to screen for breast cancer. She read that all women over 40 should have this test “Just to be sure.” She has heard lots of horror stories about breast cancer. Testing almost seems like a duty. She doesn’t have any symptoms of breast cancer and is in good health. What should she do?

What can happen when she takes this (or any) medical test? One of four things:

  1. The test could correctly indicate that no cancer is present. This is good. The patient is assured.
  2. The test could correctly indicate that a true cancer is present. This is good in the sense that treatment options can be investigated immediately.
  3. The test could falsely indicate no cancer is present when it truly is. This error is called a false negative. This is bad because it could lead to false hope and could cause the patient to ignore symptoms because, “The test said I was fine.”
  4. The test could falsely indicate that cancer is present when it truly is not. This error is called a false positive. This is bad because it is distressing and could lead to unnecessary and even harmful treatment. The test itself uses radiation, so it even increases the risk of true cancer through unnecessary exposure to x-rays.

This table shows all the possibilities in a test for the presence or absence of a thing (like breast cancer, prostate cancer, a lie, AIDS, and so on). For mammograms, “Present” means that cancer is actually there, and “Absent” means that no cancer is there. For a PSA test, “Present” means a prostate cancer is actually there, and “Absent” means that it is not. For a Movie Reviewer, “Present” means you liked a movie, and “Absent” means you did not.

Test Table
          Present               Absent
Test +    Good: True Positive   Bad: False Positive
Test –    Bad: False Negative   Good: True Negative

“Test +” means that the test said the thing (cancer) is present. “Test –” means that the test indicated the absence of the thing. For the Movie Reviewer example, “Test +” means the reviewer recommended a film.

There are two cells in this table that are labeled “Good,” meaning the test has performed correctly. The other two cells are labeled “Bad,” meaning the test has erred. Study this table to be sure you understand how to read it because it will be used throughout this article.
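To make the four cells concrete, here is a minimal Python sketch that sorts a history of (test result, truth) pairs into them. The function name `tally` and the ten records are invented for illustration; think of them as a movie reviewer's past recommendations set against whether you actually liked each film.

```python
from collections import Counter


def tally(records):
    """Count the four outcomes from (test_positive, truly_present) pairs."""
    cells = Counter()
    for test_positive, present in records:
        if test_positive and present:
            cells["true positive"] += 1    # Good: test said yes, thing was there
        elif test_positive and not present:
            cells["false positive"] += 1   # Bad: test said yes, thing was absent
        elif not test_positive and present:
            cells["false negative"] += 1   # Bad: test said no, thing was there
        else:
            cells["true negative"] += 1    # Good: test said no, thing was absent
    return cells


# Hypothetical history: (reviewer recommended it, you liked it)
records = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, True)] * 2
    + [(False, False)] * 4
)
cells = tally(records)
accuracy = (cells["true positive"] + cells["true negative"]) / len(records)
print(dict(cells))
print(accuracy)
```

The two “Good” cells together give the test's accuracy; the two “Bad” cells are the errors whose costs have to be weighed.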

 

Error everywhere

The main point is this: all tests and all measurements have some error. There is no such thing as a perfect test or perfect measurement! Mistakes always happen. This is an immutable law of the universe. Some tests are better than others, and tables like this are needed to rate how well a particular test performs.

 

Can having a mammogram kill you? How to make decisions under uncertainty.

The answer to the headline is, unfortunately, yes. The Sunday, 10 February 2008 New York Post reported this sad case of a woman at Mercy Medical Center in New York City. The young woman went to the hospital and had a mammogram, which came back positive, indicating the presence of breast cancer (she also had follow-up tests). Since other members of her family had experienced this awful disease, the young woman opted to have a double mastectomy and to have implants inserted afterward. All of which happened. She died a day after the surgery.

That’s not the worst part. It turns out she didn’t have cancer after all. Her test results had been mixed up with some other poor woman’s. If she had never had the mammogram in the first place, she would not have made a radical decision based on incorrect test results, and she would not have died. So, yes, having a mammogram can lead to your death. It is no good arguing that this is a rare event (adverse outcomes are not so rare, anyway) because all I was asking was whether a mammogram can kill you. One case is enough to prove that it can.

But aren’t medical tests, and mammograms in particular, supposed to be error free? What about prostate exams? Or screenings for other cancers? How do you make a decision whether to have these tests? How do you account for the possible error and potential harm resulting from this error?

I hope to answer all these questions in the following article, and to show you how deciding whether to take a medical exam is really no different than deciding which stock broker to pick. Some of what follows is difficult, and there is even some math. My friends, do not be dissuaded from reading. I have tried to make it as easy to follow as possible. These are important, serious decisions you will someday have to make: you should not treat them lightly.

Decision Calculator

You can download a (non-updated) pdf version of this paper here.

This article will provide you with an introduction and a step-by-step guide to making good decisions in particular situations. These techniques are invaluable whether you are an individual or a business.

The results that you’ll read about hold for all manner of examples—from lie detector usefulness, to finding a good stock broker or movie reviewer, to intense statistical modeling, to financial forecasts. But a particularly large area is medical testing, and it is these kinds of tests that I’ll use as examples.

Many people opt for precautionary medical tests—frequently because a television commercial or magazine article scares them into it. What people don’t realize is that these tests have hidden costs. These costs are there because tests are never 100% accurate. So how can you tell when you should take a test?

When is it worth it?

Under what circumstances is it best for you to receive a medical test? When you “Just want to be safe”? When you feel, “Why not? What’s the harm?”

In fact, none of these are good reasons to undergo a medical test. You should only take a test if you know that it’s going to give accurate results. You want to know that it performs well, that is, that it makes few mistakes, mistakes which could end up costing you emotionally, financially, and even physically.

Let’s illustrate this by taking the example of a healthy woman deciding whether or not to have a mammogram to screen for breast cancer. She read in a magazine that all women over 40 should have this test “Just to be sure.” She has heard lots of stories about breast cancer lately. Testing almost seems like a duty. She doesn’t have any symptoms of breast cancer and is in good health. What should she do?

What can happen when she takes this (or any) medical test? One of four things:

You cannot measure a mean

I often say (it is even the main theme of this blog) that people are too certain. This is especially true when people report results from classical statistics, or use classical methods…

Stats 101: Chapter 4

Chapter 4 is ready to go.

This is where it starts to get weird. The first part of the chapter introduces the standard notation of “random” variables, and then works through a binomial example, which is simple enough.

Then come the so-called normals. However, they are anything but. For probably most people, it will be the first time that they hear about the strange creatures called continuous numbers. It will be more surprising to learn that not all mathematicians like these things or agree with their necessity, particularly in problems like quantifying probability for real observable things.

I use the word “real” in its everyday, English sense of something that is tangible or that exists. This is because mathematicians have co-opted the word “real” to mean “continuous”, which in an infinite number of cases means “not real” or “not tangible” or even “not observable or computable.” Why use these kinds of numbers? Strange as it might seem, using continuous numbers makes the math work out easier!

Again, what is below is a teaser for the book. The equations and pictures don’t come across well, and neither do the footnotes. For the complete treatment, download the actual Chapter.

Distributions

1. Variables

Recall that random means unknown. Suppose x represents the number of times the Central Michigan University football team wins next year. Nobody knows what this number will be, though we can, of course, guess. Further suppose that the chance that CMU wins any individual game is 2 out of 3, and that (somewhat unrealistically), a win or loss in any one game is irrelevant to the chance that they win or lose any other game. We also know that there will be 12 games. Lastly, suppose that this is all we know. Label this evidence E. That is, we will ignore all information about who the future teams are, what the coach has leaked to the press, how often the band has practiced their pep songs, what students will fail their statistics course and will thus be booted from the team, and so on. What, then, can we say about x?

We know that x can equal 0, or 1, or any number up to 12. It’s unlikely that CMU will lose or win every game, but they’ll probably win, say, somewhere around 2/3, or 6-10, of them. Again, the exact value of x is random, that is, unknown.

Now, if last chapter you weren’t distracted by texting messages about how great this book is, this situation might feel a little familiar. If we instead let x (instead of k; remember, these letters are placeholders, so whichever one we use does not matter) represent the number of classmates you drive home, where the chance that you take any of them is 10%, we know we can figure out the answer using the binomial formula. Our evidence then was E_B. And so it is here, too, when x represents the number of games won. We’ve already seen the binomial formula written in two ways, but yet another (and final) way to write it is this:

x | n, p, E_B ~ Binomial(n, p).

This (mathematical) sentence reads “Our uncertainty in x, the number of games the football team will win next year, is best represented by the Binomial formula, where we know n, p, and our information is E_B.” The “~” symbol has a technical definition: “is distributed as.” So another way to read this sentence is “Our uncertainty in x is distributed as Binomial where we know n, etc.” The “is distributed as” is longhand for “quantified.” Some people leave out the “Our uncertainty in”, which is OK if you remember it is there, but is bad news otherwise. This is because people have a habit of imbuing x itself with some mystical properties, as if “x” itself had a “random” life. Never forget, however, that it is just a placeholder for the statement X = “The team will win x games”, and that this statement may be true or false, and it’s up to us to quantify the probability of it being true.
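For reference, the standard textbook form of the binomial formula that this notation abbreviates is given below. It is the usual definition, not a copy of the book's own typesetting, and the index k is only a placeholder for a particular number of wins:

```latex
\Pr(x = k \mid n, p, E_B) = \binom{n}{k}\, p^{k} (1 - p)^{n - k},
\qquad k = 0, 1, \dots, n.
```

For the football example, n = 12 and p = 2/3.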

In classic terms, x is called a “random variable”. To us, who do not need the vague mysticism associated with the word random, x is just an unknown number, though there is little harm in calling it a “variable,” because it can vary over a range of numbers. However, all classical, and even much Bayesian, statistical theory uses the term “random variable”, so we must learn to work with it.

Above, we guessed that the team would win about 6-10 games. Where do these numbers come from? Obviously, they are based on the knowledge that the chance of winning any game was 2/3 and there’d be twelve games. But let’s ask more specific questions. What is the probability of winning no games, or X = “The team will win x = 0 games”; that is, what is Pr(x = 0 | n, p, E_B)? That’s easy: from our binomial formula, this is (see the book) ≈ 2 in a million. We don’t need to calculate n choose 0 because we know it’s 1; likewise, we don’t need to worry about 0.670^0 because we know that’s 1, too. What is the chance the team wins all its games? Just Pr(x = 12 | n, p, E_B). From the binomial, this is (see the book) ≈ 0.008 (check this). Not very good!
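The two probabilities above are easy to check. The sketch below uses Python rather than whatever software the book itself relies on, and the helper name `binomial_pmf` is made up for this illustration; it simply evaluates the standard binomial formula with n = 12 and p = 2/3.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent tries."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 12, 2 / 3                      # twelve games, 2-in-3 chance per game
p_no_wins = binomial_pmf(0, n, p)     # (1/3)^12
p_all_wins = binomial_pmf(12, n, p)   # (2/3)^12

print(f"Pr(x = 0)  = {p_no_wins:.2e}")   # roughly 2 in a million
print(f"Pr(x = 12) = {p_all_wins:.4f}")  # a bit under 0.008
```

Note that `comb(12, 0)` and `p**0` both equal 1, which is why the zero-wins case reduces to (1/3)^12.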

Recall we know that x can take any value from zero to twelve. The most natural question is: what number of games is CMU most likely to win? Well, that’s the value of x that makes (see the book) the largest, i.e. the most probable. This is easy for a computer to do (you’ll learn how next Chapter). It turns out to be 8 games, which has about a one in four chance of happening. We could go on and calculate the rest of the probabilities, for each possible x, just as easily.
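As a quick check of the claim above, the following Python sketch (again an illustration in Python, not the book's own code, with made-up variable names) computes the probability of every possible number of wins and picks out the most probable one.

```python
from math import comb

n, p = 12, 2 / 3  # twelve games, 2-in-3 chance of winning each

# Probability of exactly x wins, for every x from 0 to 12.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# The most likely number of wins is the x with the largest probability.
most_likely = max(range(n + 1), key=lambda x: pmf[x])

# Probability that the earlier guess of 6 to 10 wins turns out right.
p_six_to_ten = sum(pmf[6:11])

print(most_likely, round(pmf[most_likely], 3))
print(round(p_six_to_ten, 3))
```

The largest probability lands on 8 wins, at a little under one in four, and the rough guess of 6-10 wins covers most of the probability.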

“What is the most likely number of games the team will win?” is the most natural question for us, but in pre-computer classical statistics there turns out to be a different natural question, and this has something to do with creatures called expected values. That term turns out to be a terrible misnomer, because we often do not, and cannot, expect any of the values that the “expected value” calculations give us. The reason expected values are of interest has to do with some mathematics that are not of especial interest here; however, we will have to take a look at them because it is expected of one to do so.