LessWrong
Histograms are to CDFs as calibration plots are to...
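This article explores ways to visualize probability calibration, especially when the sample is small. It points out that with few samples, the choice of histogram bins significantly changes the resulting picture, and it brings in the cumulative distribution function (CDF) as a way around that problem. The author proposes a CDF-based calibration plot, built by integrating the difference between the number of predictions with confidence below x that came true and the number expected to come true, to assess calibration more accurately. Finally, the author discusses the method's strengths and weaknesses in practice.

📊 With few samples, a histogram's bin choices change the visualization, which discards information and is mathematically un-aesthetic.

📈 A CDF avoids the histogram's problem: it represents every data point precisely and has no free parameters.

💡 The author proposes a CDF-based calibration plot, built by integrating the difference between the number of predictions with confidence below x that came true and the number expected to come true.

🤔 In the CDF-based calibration plot, a rising curve indicates probabilities that are too high, and a falling curve indicates probabilities that are too low.

✅ This visualization can assess calibration more accurately, especially when the sample is small.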

Published on June 5, 2025 8:20 PM GMT

As you know, histograms are decent visualizations for PDFs with lots of samples...

10k predictions, 20 bins

 

...but if there are only a few samples, the histogram-binning choices can matter a lot:

10 predictions, 4 bins
same 10 predictions, 7 bins

The binning (a) discards information, and worse, (b) is mathematically un-aesthetic.
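To make the bin-sensitivity concrete, here is a minimal sketch that counts the same ten values into 4 and then 7 equal-width bins. The helper `histogram_counts` and the ten confidence values are made up for illustration; they are not the data behind the plots above.

```python
def histogram_counts(samples, n_bins, lo=0.0, hi=1.0):
    """Count samples in n_bins equal-width bins spanning [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        # Clamp so that x == hi lands in the last bin instead of overflowing.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical data: ten made-up confidence levels.
predictions = [0.05, 0.12, 0.30, 0.33, 0.48, 0.51, 0.70, 0.72, 0.90, 0.97]

for n_bins in (4, 7):
    print(n_bins, "bins:", histogram_counts(predictions, n_bins))
```

The two printed rows describe the same data, yet the shape they suggest depends on `n_bins`, which is exactly the free parameter in question.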

But a CDF doesn't have this problem!

same 10 predictions, every data point precisely represented
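For comparison, a sketch of the empirical CDF for the same kind of data. The function name `empirical_cdf` and the sample values are assumptions for illustration; the point is only that the function takes no bin count or bin width.

```python
def empirical_cdf(samples):
    """Return the step points of the empirical CDF.

    Each sorted sample is paired with its rank divided by n, so every data
    point appears exactly once (tied values show up as stacked points).
    """
    xs = sorted(samples)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Hypothetical data: ten made-up confidence levels.
predictions = [0.05, 0.12, 0.30, 0.33, 0.48, 0.51, 0.70, 0.72, 0.90, 0.97]

for x, f in empirical_cdf(predictions):
    print(f"{x:.2f} -> {f:.1f}")
```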

If you make a bunch of predictions, and you want to know how well they're calibrated, classically you make a graph like this:

source: SSC's 2019 prediction grading

But, as with a histogram, this depends on how you bin your predictions.

100 predictions, 10 bins
same 100 predictions, 30 bins
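A sketch of the classic binned calibration computation, assuming equal-width confidence bins and 0/1 outcomes. The helper `binned_calibration` and the eight predictions are hypothetical; real calibration plots (including the ones above) may bin differently.

```python
def binned_calibration(confidences, outcomes, n_bins):
    """Group predictions into equal-width confidence bins.

    Returns one (mean confidence, observed frequency, count) tuple per
    non-empty bin; these are the points of a classic calibration plot.
    """
    bins = [[] for _ in range(n_bins)]
    for p, hit in zip(confidences, outcomes):
        # Clamp so that p == 1.0 falls in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, hit))
    points = []
    for members in bins:
        if members:
            count = len(members)
            mean_p = sum(p for p, _ in members) / count
            freq = sum(hit for _, hit in members) / count
            points.append((mean_p, freq, count))
    return points


# Hypothetical predictions: stated confidence and whether each came true.
confidences = [0.1, 0.2, 0.4, 0.6, 0.6, 0.8, 0.9, 0.9]
outcomes = [0, 0, 1, 0, 1, 1, 1, 0]

for n_bins in (2, 4):
    print(n_bins, "bins:", binned_calibration(confidences, outcomes, n_bins))
```

Changing `n_bins` changes both the number of plotted points and where they sit, even though the predictions are identical.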

Is there some CDF-like equivalent here? Some visualization with no free parameters?


I put that question to several people at Arbor Summer Camp and got three answers:

    1. "You get from a PDF to a CDF by integrating. So, here, analogously, let's integrate (num predictions with confidence < x that came true) minus (expected num predictions with confidence < x that came true)."
    2. (the same thing, said in different words)
    3. (the same thing, said in different words)

If we make a "CDF" for the above 100 predictions by applying these three insights, we get:

.py

I find this a little harder to read than the calibration plots above, which I choose to interpret as a good sign, since CDFs are a little harder to read than histograms. The thing to keep in mind, I think, is: when the curve is going up, it's a sign your probabilities are too high; when it's going down, it's a sign your probabilities are too low.
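One way to compute such a curve is sketched below. It is not the script that produced the plot above; the function name `calibration_curve` and the sample predictions are assumptions. The sign convention is expected minus actual, chosen so that a rising curve corresponds to probabilities that are too high, and each point gives the running total just after the prediction at that confidence is included.

```python
def calibration_curve(confidences, outcomes):
    """Running sum of (expected - actual) successes, ordered by confidence.

    A prediction with confidence p contributes p expected successes and
    `outcome` (0 or 1) actual successes. Returns (confidence, running_sum)
    points; no bins are involved.
    """
    pairs = sorted(zip(confidences, outcomes))
    points = []
    total = 0.0
    for p, hit in pairs:
        total += p - hit
        points.append((p, total))
    return points


# Hypothetical predictor: roughly fine at low confidence,
# too confident at the high end.
confidences = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 0.9, 0.95]
outcomes = [0, 0, 1, 1, 0, 1, 0, 0]

for p, total in calibration_curve(confidences, outcomes):
    print(f"p={p:.2f}  cumulative expected-minus-actual={total:+.2f}")
```

For a well-calibrated predictor the running sum should hover near zero; a sustained climb over some range of p points at overconfidence in that range.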

Test: how would you describe the problems that this predictor has?

Solution.

 

(Are there any better visualizations? Maybe. I looked into this a couple years ago, but looking back at it, I think this simple "sum(expected-actual predictions with p<x)" graph is at least as compelling as anything I found.)


