Bayesian prior: Difference between revisions

 
(5 intermediate revisions by the same user not shown)
:—Mike Tyson
:—Mike Tyson
}}
}}
{{dpn|beɪzˈiːən ˈpraɪə|n|}}A way to incorporate existing knowledge or beliefs about a parameter into statistical analysis. For example, if you believe that
{{dpn|beɪzˈiːən ˈpraɪə|n|}}A way to incorporate existing knowledge or beliefs about a parameter into statistical analysis.
:(a) all playwrights can be objectively ranked according to independent, observable criteria;
:(b) the quality of those playwrights in a given sample will be normally distributed;
and you think the best way of assessing the quality of dramas is by statistical analysis, then
:(i) you have already made several category errors, should not be talking about art, and if you are, no-one should be listening; but
:(ii) if, nonetheless, you are, and they are, and you are trying to estimate the statistical likelihood of a specific Elizabethan playwright being the best in history, then your knowledge that there were vastly fewer playwrights active in the Elizabethan period than have existed in all of history until now — which is a Bayesian prior distribution — might help you conclude that the odds of that Elizabethan playwright really being the best are vanishingly low.  


At the same time, everyone else will conclude that you have no idea about aesthetics and a fairly shaky grasp even of Bayesian statistics.
For example, if you believe that:
====The Monty Hall problem ====
{{L3}}All playwrights can be objectively ranked according to independent, observable criteria; <li>
The neatest illustration of how Bayesian priors are meant to work is the “Monty Hall” problem, named for the host of the gameshow ''Let’s Make a Deal'' and famously articulated in a letter to ''Parade'' magazine as follows:
The quality of those playwrights in a given sample will be normally distributed; and <li>
The best way of assessing the quality of dramas is by statistical analysis </ol>
Then:
{{L3}}You have already made several category errors, should not be talking about art, and if you are, no-one should be listening; but <li>
If nonetheless you still are, and they still are, and you are trying to estimate the statistical likelihood of a specific Elizabethan playwright being the best in history, then your knowledge that there were vastly fewer playwrights active in the Elizabethan period than have existed in all of history until now — which is a Bayesian “prior distribution” — might help you conclude that the odds of that Elizabethan playwright really being the best are vanishingly low.</ol>
 
At the same time, everyone else will conclude that you have no idea about literature and a shaky grasp even of Bayesian statistics.
 
{{Drop|B|ayesian statistics have}}, in our dystopian techno-determinist age, a lot to answer for.
 
In their place they can unravel surprising odds in a game of chance that human brains intuitively misapprehend — this will help should you be asked to choose wisely between goats and cars — but outside the tight swim lanes of statistical experiment, they can be easily misapplied and may get badly lost in weighing up the risks of the market, the merits of Shakespeare, our debt to distant future generations, and the prospect of onrushing [[apocalypse]], courtesy of which, some theorists tell us, there won’t ''be'' many future generations to worry about anyway.
====Goats and sportscars====
{{Drop|T|he neatest illustration}} of how “Bayesian priors” work is the “Monty Hall” problem, named for the host of the gameshow ''Let’s Make a Deal'':
{{quote|
{{quote|
Suppose you're on a game show, and youʼre given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows whatʼs behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?}}
A game show contestant is asked to choose a prize from behind one of three doors. She is told one door conceals a sports car and the other two conceal goats. [''Why goats? — Ed'']
 
When the contestant has chosen, the host theatrically opens one of the doors she did not choose, to reveal a goat.
 
“Knowing what you know now, would you reconsider?”}}
 
If you have not seen it before, intuitively you may say, well, at the beginning each door carried an equal probability — 1/3 — and the remaining doors still do after the reveal — 1/2 — so while the player’s odds have ''improved'', either choice remains even. It doesn’t matter whether she sticks or twists, so she should be indifferent.
 
Bayesian probability theory shows this intuition to be ''wrong''.


If you have not encountered the problem before, the intuitive answer is to say, no: each door carried an equal probability, 1/3, of containing the car before the host opened a door, and each carries an equal probability, 1/2, afterward. To be sure, the odds are better now than they were, but one should still be indifferent as to whether to switch.
Staying put is to commit to the choice you made when the odds were worse, so its odds remain the same. You have no more information about your original choice: you already knew it may or may not contain the car. You do, however, know something new about one of the doors you ''didn’t'' choose. The odds as between the other two doors change, from 1/3 each to 0/3 for the open door, which definitely ''doesnʼt'' hold the car, and 2/3 for the closed one, which still might.


Bayes says no: since the host will never open the door you chose, nor the door concealing the car, this new information tells you something about the remaining choice. Your original choice keeps its original odds of 1/3; the odds as between the other two doors change, from 1/3 each to 0/3 for the open door, which definitely doesnʼt hold the car, and 2/3 for the closed one, which still might. So you should switch doors. You exchange a 1/3 chance of being right for a 1/3 risk of being wrong.
The probabilities for the remaining options are therefore 1/3, for your original choice, and 2/3 for the other remaining door.  
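The same figures can be reached with Bayes’ theorem. What follows is a sketch under the usual assumptions: the car is placed at random, you pick door 1, and the host always opens a goat door you did not pick, choosing at random when he has two to choose from. The symbols <code>C_i</code> (the car is behind door ''i'') and <code>H_3</code> (the host opens door 3) are notation introduced here, not taken from the ''Parade'' letter.

```latex
\begin{align*}
P(C_1) = P(C_2) = P(C_3) &= \tfrac{1}{3} \\
P(H_3 \mid C_1) = \tfrac{1}{2}, \quad
P(H_3 \mid C_2) = 1, \quad
P(H_3 \mid C_3) &= 0 \\
P(H_3) = \tfrac{1}{3}\left(\tfrac{1}{2} + 1 + 0\right) &= \tfrac{1}{2} \\
P(C_1 \mid H_3) = \frac{P(H_3 \mid C_1)\,P(C_1)}{P(H_3)}
  = \frac{\tfrac{1}{2} \cdot \tfrac{1}{3}}{\tfrac{1}{2}} &= \tfrac{1}{3} \\
P(C_2 \mid H_3) = \frac{P(H_3 \mid C_2)\,P(C_2)}{P(H_3)}
  = \frac{1 \cdot \tfrac{1}{3}}{\tfrac{1}{2}} &= \tfrac{2}{3}
\end{align*}
```

The host’s reluctance to open the door with the car behind it is what makes <code>P(H_3 | C_2)</code> twice as large as <code>P(H_3 | C_1)</code>, and that factor of two is the whole argument.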


This proposal outrages some people, at first. Apparently, even statisticians. But it is true. It becomes more intuitive if you adjust the thought experiment so there are one ''thousand'' doors, not three, and after your 1/1000 choice the host opens 998 of the other doors to reveal goats and leaves one shut. ''Now'' would you switch? Clearly, the other door now accounts for 999/1000 of the original options.
Oddly, a new person who now arrives and is presented with the choice without that prior information would calculate the probability at 50:50. The probabilities are a calculation based upon what you know. The newcomer’s calculation would be wrong because an important assumption behind it, that the car and the goat were randomly and evenly distributed between the two remaining doors, is wrong. A third door has been non-randomly eliminated.
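The gap between the contestant and the newcomer can be tallied exactly. The sketch below assumes the contestant starts on door 0 and that the host picks at random when both other doors hide goats; the function names <code>scenarios</code> and <code>win_probabilities</code> are made up for this illustration. It compares sticking, switching, and a newcomer who cannot tell the two closed doors apart and so tosses a coin.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)
HALF = Fraction(1, 2)


def scenarios():
    """Yield (probability, car_door, other_closed_door), contestant on door 0."""
    # Car behind the contestant's door: the host opens door 1 or 2 at random.
    yield THIRD * HALF, 0, 2  # host opened door 1, door 2 stays closed
    yield THIRD * HALF, 0, 1  # host opened door 2, door 1 stays closed
    # Car behind door 1 or 2: the host has no choice but to leave it closed.
    yield THIRD, 1, 1
    yield THIRD, 2, 2


def win_probabilities():
    """Return exact win probabilities for (stick, switch, newcomer)."""
    stick = sum(p for p, car, other in scenarios() if car == 0)
    switch = sum(p for p, car, other in scenarios() if car == other)
    # The newcomer sees two closed doors and picks either with probability 1/2.
    newcomer = sum(
        p * HALF
        for p, car, other in scenarios()
        for door in (0, other)
        if door == car
    )
    return stick, switch, newcomer


if __name__ == "__main__":
    print(win_probabilities())
```

Under these assumptions the coin-tosser does win half the time. What the contestant’s prior knowledge buys is the ability to beat the coin by switching.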


Or you could just experiment.  
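One way to run that experiment without hiring a goat is to simulate it. This is a minimal sketch, assuming the car and the first pick are both uniformly random and the host always opens a goat door the contestant did not pick. The names <code>play_round</code> and <code>win_rate</code>, the trial count and the seed are all arbitrary choices made for this illustration.

```python
import random

DOORS = (0, 1, 2)


def play_round(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    car = rng.choice(DOORS)
    pick = rng.choice(DOORS)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in DOORS if d != pick and d != car])
    if switch:
        # Move to the only door that is neither the first pick nor opened.
        pick = next(d for d in DOORS if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning for a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stick: ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

With enough trials the sticking strategy should settle near 1/3 and the switching strategy near 2/3.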
So you ''should'' switch doors. You exchange a 1/3 chance of being ''right'' for a 1/3 risk of being wrong. This proposal outrages some people, at first. Apparently, even statisticians. But it is true.
 
It is easier to see if instead there are ''one thousand'' doors, not three, and after your first pick the host opens 998 of the other doors.
 
Here you know you were almost certainly wrong first time, so if every possible wrong answer but one is revealed to you, it stands more obviously to reason that the other door, which accounts for 999/1000 of the original options, is the one holding the car.
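The thousand-door version can be counted rather than argued. The sketch below assumes the host opens every door but one of those you did not pick, and never the one hiding the car; <code>exact_odds</code> is a made-up name for this illustration. By symmetry it fixes the first pick at door 0 and walks through every equally likely position of the car.

```python
from fractions import Fraction


def exact_odds(doors):
    """Return exact (stick, switch) win probabilities for a game with `doors` doors."""
    if doors < 3:
        raise ValueError("the game needs at least three doors")
    pick = 0  # by symmetry, any first pick gives the same answer
    stick_wins = 0
    switch_wins = 0
    for car in range(doors):  # each car position is equally likely
        if car == pick:
            stick_wins += 1
        else:
            # The host must leave the car's door closed, so switching finds it.
            switch_wins += 1
    return Fraction(stick_wins, doors), Fraction(switch_wins, doors)


if __name__ == "__main__":
    for n in (3, 10, 1000):
        print(n, exact_odds(n))
```

For 1000 doors this returns 1/1000 for sticking and 999/1000 for switching, matching the figures above.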
 
Lesson: use what you already know about history, and your place in it, to update your choices. This ought not to be such a revelation. Count cards. Update your predictions and become a “super forecaster”.
 
====Bayesian probabilities are models ====
{{Drop|N|ow, all of}} this is well and good and unimpeachable if the [[nomological machine|conditions]] in which probabilities hold are present: a static, finite “sample space” — 3, 10 or 1000 doors — a finite and known number of discrete outcomes — goat or car — and a lack of intervening causes like moral (immoral?) agents who can capriciously affect the random outcomes.
 
It works well for carefully controlled games of chance involving flipped coins, thrown dice, randomly drawn playing cards and, of course, ''Let’s Make a Deal''. They are all simple systems, easily reduced to “[[nomological machine]]s”.
 
When you apply it to unbounded complex systems involving, well, people, it works less well.  


====Bayesian probabilities are probabilities ====
Now all of this is well and good and unimpeachable if the nomological conditions for probabilities hold. There needs to be a static, finite sample space — 1000 doors — and a finite and known number of discrete outcomes — goat or car. It also works for coins, dice, cards and games of chance. These are simple systems, easily reduced to nomological machines.
====The doomsday problem ====
Bayesian probabilities are a clever way of deducing, [[a priori]], that we are all screwed. If you find yourself at or near the beginning of something, such as Civilisation, a Bayesian model will tell you it will almost certainly end soon.
Bayesian probabilities, if misused, can lead statistics professors to the [[a priori]] deduction that we are all screwed.  
 
{{Quote|
{{D|A priori||adj|}}
Following logically from existing premises. Necessarily so. Not dependent on observation or falsifiable evidence.}}
 
Where it is not possible to gather the necessary evidence, philosophers have a weakness for ''[[a priori]]'' arguments. They are prevalent in metaphysical enquiries: Pascal’s wager, cogito, ergo sum, the argument from design. Any argument based purely on probabilities is ''a priori'': the general principle is extrapolated to predict a factual answer. A specific
If you find yourself at or near the beginning of something, such as Civilisation, a Bayesian model will tell you it will almost certainly end soon.


It works on elementary probability and can be illustrated simply.
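The text above does not include the illustration itself, so what follows is an assumption: a sketch of one common simplified form of the doomsday argument, not necessarily the one the author had in mind. It treats your birth rank as a uniformly random draw from everyone who will ever live. On that assumption there is, say, a 95% chance you are not among the first 5%, which caps the total at twenty times your rank. The function name <code>doomsday_upper_bound</code> and the example figures are hypothetical.

```python
def doomsday_upper_bound(birth_rank, confidence=0.95):
    """Upper bound on the total ever born, at the given confidence level.

    Assumes (contestably) that birth_rank is uniformly distributed
    between 1 and the unknown total.
    """
    if birth_rank <= 0:
        raise ValueError("birth_rank must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # With probability `confidence`, birth_rank / total exceeds 1 - confidence,
    # which rearranges to total < birth_rank / (1 - confidence).
    return birth_rank / (1 - confidence)


if __name__ == "__main__":
    # Hypothetical rank: the 100th member of some population.
    print(doomsday_upper_bound(100))
```

The arithmetic is simple enough. Whether you really are a random draw from all of history is the assumption doing all the work.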