LessWrong, October 2, 2024
Likelihood calculation with duobels

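This article introduces a more intuitive way to calculate probabilities. By encoding likelihood information in duobels instead of decibels, we can reason about probabilities more easily in daily life. The article also works through several examples of the method and explains how to add or subtract likelihood according to test results.

The duobel is an alternative to the decibel for encoding likelihood information, using log2 instead of log10. Each change of the duobel value by one halves or doubles the odds ratio. For example, 0 is 1:1, a positive value n has odds of 2ⁿ:1, and a negative value -n has odds of 1:2ⁿ.

A duobel value can be converted into an odds ratio x:y, and from there into a ratio x/(x+y). Changes in likelihood are described in bits of information: positive for an increase, negative for a decrease. For example, an increase by a factor of n equals x bits of information, where n:1 is the odds ratio of the duobel value x.

The article demonstrates the duobel method on several practical cases, such as breast cancer screening for women, multiple tests for breast cancer, and diagnosing a mechanical fault.

The advantage of this way of calculating probabilities is that it is easy to use in daily life, involves little calculation, and is highly practical.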

Published on October 1, 2024 4:21 PM GMT

The post An Intuitive Explanation of Bayes's Theorem explains that using Bayesian reasoning we can much better see how likely something is (by calculating conditional probabilities), and thus update our own beliefs as efficiently as possible.
However, the essay shows us how to calculate those probabilities using lots of math (duh!). Which is fine if you just want to grasp the concept, or need to evaluate a medical study or something.

But what about applying it in our daily lives? Nobody has time for divisions while evaluating evidence...
The article gives 2 hints to help. The first using odds ratios, the second using decibels. But odds ratios keep getting multiplied and thus quickly get unwieldy, while decibels would be great, if not for the fact that I have no clue what the log10 values of different ratios are, and I really dislike the thought of memorizing them.

I think I can offer a more intuitive way to approach this. The key insight is: We don't need exact probabilities. In daily life, we are not in the position of a doctor calculating precise likelihoods from study data. We are taking gut-level probabilities and want to continue reasoning with the results. So all we need are ballpark numbers anyway. We want to know whether something is 8% or 80% likely. The exact percentages are just noise, though; our intuition isn't that precise.

I propose to encode the likelihood information using duobels, rather than decibels. Log2 instead of log10. Then every change of that number by one halves or doubles one side of the odds ratio.
The table for that would look like this:

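| duobel | odds ratio | ratio | % |
|--------|------------|-------|----|
| -2     | 1:4        | 1/5   | 20 |
| -1     | 1:2        | 1/3   | 33 |
| 0      | 1:1        | 1/2   | 50 |
| +1     | 2:1        | 2/3   | 67 |
| +2     | 4:1        | 4/5   | 80 |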

(For in-between values, we can use n.6 (more precisely .585) to get a x1.5 instead of a x2 increase: 0.6 is 3:2, -0.6 is 2:3, 1.6 is 3:1, and so on.
Likewise, n.2 is x1.15, n.4 is x1.3, and n.8 is x1.75, but in most cases this is already too precise for daily needs.)
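
As a quick check of those in-between multipliers, here is a minimal Python sketch. It assumes nothing beyond the definition of a duobel as a log2 measure; the function name `multiplier` is made up for illustration.

```python
def multiplier(fraction: float) -> float:
    """Factor applied to one side of the odds by a fractional duobel step."""
    return 2.0 ** fraction

# Tenths of a duobel that the text mentions as useful in-between steps.
for tenths in (2, 4, 6, 8):
    print(f"n.{tenths} multiplies the odds by about {multiplier(tenths / 10):.2f}")
```

The printed factors land within a couple of hundredths of the rounded figures quoted above, which is well inside ballpark precision.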

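All we need to remember to calculate the ratios in our head is this:

- 0 is 1:1
- Positive numbers n are 2ⁿ:1, negative numbers -n are 1:2ⁿ
- An odds ratio x:y becomes the ratio x/(x+y)
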
That's it, we can translate any duobel number into an odds ratio or a ratio! And percentages for stuff like 2/3 we have already memorized, if we even need percentages in the first place.
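
For readers who prefer to see the conversion spelled out, the following Python sketch rebuilds the small table from whole-number duobel values. The helper names `duobel_to_odds` and `odds_to_ratio` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def duobel_to_odds(d: int) -> tuple[int, int]:
    """Whole-number duobel -> odds x:y (2^d:1 if d >= 0, else 1:2^-d)."""
    return (2 ** d, 1) if d >= 0 else (1, 2 ** -d)

def odds_to_ratio(x: int, y: int) -> Fraction:
    """Odds x:y -> ratio x/(x+y)."""
    return Fraction(x, x + y)

for d in range(-2, 3):
    x, y = duobel_to_odds(d)
    ratio = odds_to_ratio(x, y)
    print(f"{d:+d}  {x}:{y}  {ratio}  {float(ratio):.0%}")
```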


Now let's talk likelihood shifts, where we can take advantage of the logarithmic measure of likeliness we just established.
In How Much Evidence Does It Take?, Eliezer describes hypothesis updates in terms of bits of information, and I very much like that way of thinking about it. It is nice and intuitive. Every bit is one coin-flip, a halving or doubling of likelihood (one side of the odds ratio, to be more precise). That is something we can use to ballpark how likely we think something is.

For duobels we can just add those shifts to them, and we have the new likelihood! Just like the original essay explained with decibels, just way more intuitive: We can easily translate duobels to ratios in our heads, and the shifts are just the number of information bits.
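
The "just add the bits" step can be sketched in Python as follows. This is an illustration under the assumption that a duobel is the base-2 logarithm of the odds; the function names are invented here, not taken from either essay.

```python
import math

def prob_to_duobel(p: float) -> float:
    """Probability -> duobel (log2 of the odds p : 1-p)."""
    return math.log2(p / (1.0 - p))

def duobel_to_prob(d: float) -> float:
    """Duobel -> probability, via odds 2^d : 1."""
    odds = 2.0 ** d
    return odds / (1.0 + odds)

def bits_of_evidence(likelihood_ratio: float) -> float:
    """Evidence that is `likelihood_ratio` times more likely under the hypothesis."""
    return math.log2(likelihood_ratio)

# A 50% prior plus evidence that is 4 times more likely if the hypothesis is true.
prior = prob_to_duobel(0.5)
posterior = prior + bits_of_evidence(4.0)
print(f"{prior:+.1f} duobel shifted to {posterior:+.1f} duobel = {duobel_to_prob(posterior):.0%}")
```

In daily use the logarithms would of course be replaced by the remembered powers of 2; the code only shows that the addition is all there is to an update.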

Obviously we'll need a wider range of duobel values than just from -2 to +2, so here is a bigger table, from -10 (1:1000) to +10 (1000:1).

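| duobel | odds ratio | ratio | %    | duobel | odds ratio | ratio | %     |
|--------|------------|-------|------|--------|------------|-------|-------|
| 0      | 1:1        | 1/2   | 50   | 0      | 1:1        | 1/2   | 50    |
| -1     | 1:2        | 1/3   | 33   | +1     | 2:1        | 2/3   | 67    |
| -2     | 1:4        | 1/5   | 20   | +2     | 4:1        | 4/5   | 80    |
| -3     | 1:8        | 1/9   | 11   | +3     | 8:1        | 8/9   | 89    |
| -4     | 1:16       | 1/17  | 6    | +4     | 16:1       | 16/17 | 94    |
| -5     | 1:32       | ...   | 3    | +5     | 32:1       | ...   | 97    |
| -6     | 1:64       |       | 1.5  | +6     | 64:1       |       | 98.5  |
| -7     | 1:128      |       | 0.78 | +7     | 128:1      |       | 99.22 |
| -8     | 1:256      |       | 0.39 | +8     | 256:1      |       | 99.61 |
| -9     | 1:512      |       | 0.19 | +9     | 512:1      |       | 99.81 |
| -10    | 1:1024     |       | 0.10 | +10    | 1024:1     |       | 99.90 |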

Anything below -4 or above +4, we can just estimate the ratio as 1/2ⁿ or (2ⁿ-1)/2ⁿ (so 1/2ⁿ away from 1) respectively. In other words: Outside of that middle section, all we really need to know are the powers of 2.
Below -4, every shift by 10 points changes the percentage by a factor of 1000 (because 2¹⁰ is 1024); above +4 it changes the distance from 100% by that same factor (shifting by 6.6 is a factor of 96, fairly close to 100).
Now if only our number system was base 16 (or any other power of 2) instead of base 10! Then this would work cleanly for every whole number change instead of just for multiples of 10. It would also mean we could just know the percentages below -4 or above +4 without need for calculation...
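
To get a feel for how little the tail shortcut costs, this Python sketch compares the exact percentage with the powers-of-2 estimate for whole duobel values beyond the middle section. The names `exact_percent` and `approx_percent` are made up for the example.

```python
def exact_percent(d: int) -> float:
    """Exact percentage for odds 2^d : 1."""
    odds = 2.0 ** d
    return 100.0 * odds / (1.0 + odds)

def approx_percent(d: int) -> float:
    """Tail shortcut: 1/2^n of 100% below zero, that same distance from 100% above."""
    tail = 100.0 / 2 ** abs(d)
    return tail if d < 0 else 100.0 - tail

for d in (-10, -7, -5, 5, 7, 10):
    print(f"{d:+d}: exact {exact_percent(d):.2f}%  shortcut {approx_percent(d):.2f}%")
```

The gap is largest right at the edge of the middle section and shrinks quickly further out.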

Alright, we've got our tool, let's test it out, shall we? We'll just go through the examples given in An Intuitive Explanation of Bayes's Theorem.

Q: 1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get positive mammograms. 9.6% of women without breast cancer will also get positive mammograms. A woman in this age group had a positive mammogram in a routine screening. What is the probability that she actually has breast cancer? 
A: An initial likeliness of 1% is -6.6 duobel (1:96). The likeliness increase of a positive mammogram is 8.3 (80/9.6), so +3 bits. The result is -6.6+3 = -3.6 duobel (1:12, 1/13), 7.7% (rounding to 8% is better, we aren't that precise). The text gives an answer of 7.8%.

We'll skip the blue eggs with pearls example, since it's not about shifting our expectations in response to new data.

Q: We’ll suppose that the Tams-Braylor gives a true positive for 90% of patients with breast cancer, and gives a false positive for 5% of patients without cancer. Let’s say the prior prevalence of breast cancer is 1%. If a patient gets a positive result on her mammogram and her Tams-Braylor, what is the revised probability she has breast cancer? 
A: The likeliness increase of the Tams-Braylor test is 18 (90/5), a little more than 16, so about +4.2 bits (n.2 is roughly x1.15). Our result is thus -6.6+3+4.2=0.6 duobel (3:2, 3/5), 60%. The text gives an answer of 60% as well.

Q: Suppose that the prior prevalence of breast cancer in a demographic is 1%. Suppose that we, as doctors, have a repertoire of three independent tests for breast cancer. Our first test, test A, a mammography, has a likelihood ratio of 80%/9.6% = 8.33. The second test, test B, has a likelihood ratio of 18.0 (for example, from 90% versus 5%); and the third test, test C, has a likelihood ratio of 3.5 (which could be from 70% versus 20%, or from 35% versus 10%; it makes no difference). Suppose a patient gets a positive result on all three tests. What is the probability the patient has breast cancer? 
A: The likeliness increase of the third test, C, is 3.5, so about +2 bits. Added to the 0.6 duobel from the first two tests, our result is 0.6+2=2.6 duobel (6:1, 6/7), 86% (rounding to 85% is better, we aren't that precise). The text gives an answer of 84%.
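
The chain of three positive tests can also be run without rounding the shifts, to see how far the mental shortcut drifts. The Python sketch below uses the prior and likelihood ratios from the questions above; `posterior_trail` is a hypothetical helper name.

```python
import math

def duobel_to_prob(d: float) -> float:
    """Duobel -> probability, via odds 2^d : 1."""
    odds = 2.0 ** d
    return odds / (1.0 + odds)

def posterior_trail(prior: float, likelihood_ratios: list[float]) -> list[float]:
    """Probability after each piece of evidence, adding unrounded bits."""
    duobel = math.log2(prior / (1.0 - prior))
    trail = []
    for ratio in likelihood_ratios:
        duobel += math.log2(ratio)
        trail.append(duobel_to_prob(duobel))
    return trail

# Test A (mammography), test B (Tams-Braylor) and test C from the questions above.
ratios = [80 / 9.6, 90 / 5, 3.5]
for name, p in zip("ABC", posterior_trail(0.01, ratios)):
    print(f"after test {name}: {p:.1%}")
```

The unrounded results agree with the answers quoted from the original essay, and the head-calculated values stay within a couple of percentage points of them.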

Let's stop for a moment here and answer an obvious question: How do we get the likeliness decrease of a negative result? Say, if the third test C is 70% vs 20% and does not indicate breast cancer? 
While the likelihood increase for a positive result is 70%/20%=7/2=3.5, the likelihood decrease for a negative one is (100%-20%)/(100%-70%)=8/3=2.7. This is the likelihood of a true negative (the test correctly fails to report that you have it), divided by the likelihood of a false negative (how likely it is somebody has breast cancer but the test does not detect it). So the result is roughly 1.6 bits of information (3:1, 3/4, 75%). That's the amount to subtract from the previous duobel value.
If the first 2 tests said yes, and test C said no, we start from the 0.6 duobel reached after the two positive results, so the overall likelihood of breast cancer is 0.6-1.6=-1 duobel (1:2, 1/3), 33%.

While for positive results (test says hypothesis is likely), we ask ourselves "How many times more likely was that test to succeed if this hypothesis is true than if it was false?", and then add that many bits,
for negative results (test does not say hypothesis is likely) we ask ourselves "How many times more likely was the test to fail if this hypothesis is false than if it was true?", and then subtract that many bits.
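
Both questions can be folded into one small Python function. This is a sketch assuming a test is described by its hit rate (positive given the hypothesis is true) and its false alarm rate (positive given it is false); the name `evidence_bits` and its parameters are invented for illustration.

```python
import math

def evidence_bits(hit_rate: float, false_alarm_rate: float, positive: bool) -> float:
    """Signed duobel shift for one test result.

    Positive result: log2(hit rate / false alarm rate), a shift upward.
    Negative result: log2(miss rate / true negative rate), a shift downward.
    """
    if positive:
        ratio = hit_rate / false_alarm_rate
    else:
        ratio = (1.0 - hit_rate) / (1.0 - false_alarm_rate)
    return math.log2(ratio)

# Test C from the example: 70% hit rate, 20% false alarm rate.
print(f"positive result: {evidence_bits(0.70, 0.20, True):+.1f} bits")
print(f"negative result: {evidence_bits(0.70, 0.20, False):+.1f} bits")
```

Computed without rounding, the negative shift for test C is closer to 1.4 bits than to 1.6. The difference only matters if you want more precision than a ballpark figure.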

Q: You are a mechanic for gizmos. When a gizmo stops working, it is due to a blocked hose 30% of the time. If a gizmo’s hose is blocked, there is a 45% probability that prodding the gizmo will produce sparks. If a gizmo’s hose is unblocked, there is only a 5% chance that prodding the gizmo will produce sparks. A customer brings you a malfunctioning gizmo. You prod the gizmo and find that it produces sparks. What is the probability that a spark-producing gizmo has a blocked hose?
A: An initial likeliness of 30% is -1 duobel. The likeliness increase due to sparks is 9, so +3 bits. Our result is thus -1+3=2 duobel (4:1, 4/5), 80%. The text gives as answer "(45% × 30%)/(45% × 30% + 5% × 70%)", which resolves to 79.4%.
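
For comparison, here is the gizmo example computed both ways in Python: once with the full formula from the original essay and once with the rounded duobel shortcut. The function names are hypothetical.

```python
def bayes_exact(prior: float, p_if_true: float, p_if_false: float) -> float:
    """Posterior via the full formula P(E|H)P(H) / (P(E|H)P(H) + P(E|~H)P(~H))."""
    numerator = p_if_true * prior
    return numerator / (numerator + p_if_false * (1.0 - prior))

def duobel_shortcut(prior_duobel: int, bits: int) -> float:
    """Posterior from whole-number duobels: add the bits, then read off the odds."""
    odds = 2.0 ** (prior_duobel + bits)
    return odds / (1.0 + odds)

exact = bayes_exact(0.30, 0.45, 0.05)   # blocked hose 30%, sparks 45% vs 5%
rough = duobel_shortcut(-1, 3)          # 30% taken as -1 duobel, factor 9 as +3 bits
print(f"exact {exact:.1%}, shortcut {rough:.0%}")
```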

The great thing about this approach to calculating probabilities is that there really isn't much calculating involved at all (if you remember the odds of a duobel value)! That makes it perfectly doable in daily life.


