Tuesday, July 15, 2008

Beyond a Reasonable Doubt

In Psychology of Intelligence Analysis, Richards Heuer advocates that we quantify expressions of uncertainty: "To avoid ambiguity, insert an odds ratio or probability range in parentheses after expressions of uncertainty in key judgments."

His suggestion reminds me of my pet peeve about the unquantified notion of reasonable doubt in the American justice system. I've always wanted (but never had the opportunity) to ask a judge what probability of innocence constitutes a reasonable doubt.

Unfortunately, as Heuer himself notes elsewhere in his book, we human beings are really bad at estimating probabilities. I suspect (with a confidence of 90 to 95%) that quantifying our uncertainties as probability ranges will only convey a false sense of precision.

So, what can we do to better communicate uncertainty? Here are a couple of thoughts:
  • We can calibrate estimates based on past performance (a sketch of the idea follows this list). It's unclear what will happen once people realize that their estimates are being translated, but, at worst, it feels like good fodder for research in judgment and decision making.

  • We can ask people to express relative probability judgments. While these are also susceptible to bias, at least they don't demand as much precision. And we can always vary the framing of questions to try to factor out the cognitive biases they induce.
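
To make the calibration idea concrete, here is a minimal sketch (in Python, with made-up data, not a reference to any real system): given a history of a forecaster's stated probabilities and the actual outcomes, we can bin the stated probabilities and translate each new estimate into the empirical frequency observed in its bin.

```python
from collections import defaultdict

def calibrate(history, n_bins=10):
    """Build a calibration map from past (stated_probability, outcome) pairs.

    Bins the stated probabilities and records the empirical frequency of
    the event in each bin. Returns a function that translates a new stated
    probability into the calibrated estimate for its bin.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for stated, outcome in history:
        b = min(int(stated * n_bins), n_bins - 1)
        totals[b] += 1
        hits[b] += outcome

    def calibrated(stated):
        b = min(int(stated * n_bins), n_bins - 1)
        if totals[b] == 0:
            return stated  # no history in this bin; pass the estimate through
        return hits[b] / totals[b]

    return calibrated

# Hypothetical forecaster who says "90%" for events that happen 70% of the time.
history = [(0.9, 1)] * 7 + [(0.9, 0)] * 3
translate = calibrate(history)
print(translate(0.9))  # 0.7
```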

Also, when we talk about uncertainty, it is important to distinguish between aleatory and epistemic uncertainty.

When I flip a coin, I am certain it has a 50% chance of landing heads, because I know the probability distribution of the event space. This is aleatory uncertainty, and forms the basis of probability and statistics.
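
As a toy illustration (the code and numbers here are purely illustrative), a simulation makes the point: since the distribution is known up front, the only uncertainty is in the outcomes themselves, and the empirical frequency converges to the 50% we already knew.

```python
import random

random.seed(42)

# Aleatory uncertainty: the distribution is known (a fair coin),
# so uncertainty lives only in the outcomes, not in the model.
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"{n:>9} flips: {heads / n:.4f} heads")
```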

But when I reason about less contrived uncertain events, such as estimating the likelihood that my bank will collapse this year, the challenge is my ignorance of the probability distribution. This is epistemic uncertainty, and it's a lot messier.
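
One common way to model epistemic uncertainty, sketched here for illustration rather than as a recommendation, is Bayesian: treat the unknown probability itself as uncertain and maintain a distribution over it that narrows as evidence arrives. With a Beta prior over an unknown event rate, updating reduces to counting (the observations below are invented):

```python
# Epistemic uncertainty: the event's probability p is itself unknown.
# A Beta(a, b) distribution over p starts wide and narrows with evidence.
a, b = 1.0, 1.0  # uniform prior: total ignorance about p

observations = [0, 0, 1, 0, 0]  # hypothetical outcomes of comparable events
for outcome in observations:
    a += outcome
    b += 1 - outcome

mean = a / (a + b)
print(f"posterior mean estimate of p: {mean:.3f}")  # ~0.286, still very uncertain
```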

If you'd like to learn more about aleatory and epistemic uncertainty, I recommend Nassim Nicholas Taleb's Fooled by Randomness (which is a better read than his better-known The Black Swan).

In summary, we have to accept the bad news that the real world is messy. As a mathematician and computer scientist, I've learned to pursue theoretical rigor as an ideal. Like me, you may find it disconcerting that not all real-world uncertainty can be treated in terms of probability spaces. Tell it to the judge!

2 comments:

stefanoq said...

Daniel,

I was happy to see you discuss this topic. I wonder if you would relate these principles of uncertainty to your work in the area of search? Isn't a relevance "score" really a probability that a given document is relevant in the context of a query? Are there benchmarks against which one might evaluate a relevance score generated from a predictive engine and some sort of actual or real relevance based on usage like your coin toss example?

Daniel Tunkelang said...

Actually, the interesting thing about relevance scoring is that most approaches do not return a probability of relevance, but only promise to rank documents in an order corresponding to their probability of relevance. The difference is subtle but significant, since it means you can't use the scores to establish a meaningful cutoff between relevant and irrelevant documents.
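
To illustrate the distinction with a contrived example (not any particular engine's scoring function): any monotone transformation of the scores preserves the ranking, so the absolute values, and with them any fixed cutoff, carry no probabilistic meaning.

```python
import math

# Hypothetical retrieval scores for five documents (higher = ranked higher).
scores = [12.7, 8.3, 8.1, 2.4, 0.5]

# A monotone transformation preserves the ranking exactly...
rescaled = [math.log1p(s) for s in scores]
assert sorted(range(5), key=lambda i: -scores[i]) == \
       sorted(range(5), key=lambda i: -rescaled[i])

# ...but a fixed cutoff now selects a different set of "relevant" documents.
print([s > 5.0 for s in scores])    # [True, True, True, False, False]
print([s > 5.0 for s in rescaled])  # [False, False, False, False, False]
```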

In response to the broader question of relating our understanding of uncertainty to search, I was thinking more broadly about using information to make decisions. Search is simply a tool--what we're often doing, consciously or unconsciously, is testing hypotheses or generating conjectures. We're generally doing so with incomplete and noisy data, which means our analysis has to tolerate some degree of uncertainty. Hence, our ability to perform this analysis effectively relies on our ability to articulate and reason about uncertainty.
