LessWrong · 22 hours ago
Why I am not a polygenic score nihilist

 

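This article examines the effect of genetic variation on important outcomes, particularly educational attainment. The author argues that, rather than focusing on heritability, an R-squared statistic with well-known problems, we should focus on the effect size of genetic variation. Using hypothetical examples and real data, the article illustrates the influence of genes on education and stresses the importance of separating causal effects from correlated noise. By studying siblings, it shows the true effect of genes on individual differences.

🎓 R-squared (heritability) measures the proportion of variation in one variable that is explained by another. Two examples show that R-squared does not fully reflect the size of a variable's effect: even when R-squared is lower, the variable's average effect can be unchanged.

📊 The author shows the relationship between the polygenic score (PGS) for educational attainment and university attendance. In the data, almost 50% of the top PGS decile went to university, compared with less than 20% of the bottom decile, which indicates a strong association between the PGS and educational attainment.

👨‍👩‍👧‍👦 The article stresses that separating the true effect of the PGS from correlated environmental factors is essential. Studying siblings gives a more accurate measure of the causal effect of genes: because genetic differences between siblings are randomly allocated, any systematic differences in their environments trace back to their genes.

💡 The effect of the PGS on education persists after accounting for shared environmental factors. The effect is smaller, but still meaningful, and reflects a real effect of genes on education.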

Published on July 3, 2025 1:38 PM GMT

This is partly in response to Scott Alexander's coverage of the "new anti-hereditarians", roughly Eric Turkheimer and Sasha Gusev. My basic take is that we should focus on the effect size of genetic variation on important outcomes, rather than on heritability, an R-squared statistic which has well-known problems. Excerpts:

The R-squared is the proportion of variation in one thing explained by another. Here are two examples using data I made up on my computer.

 

The first shows a highly unrealistic polygenic score (PGS), which perfectly explains educational attainment, measured as the age of leaving education. If you have a PGS of -2, you leave school at 16. If you have a PGS of 2 you leave school at 20. A person’s educational attainment is always exactly his polygenic score plus 18, with no environmental variation at all. The R-squared of the PGS is 100 per cent: it explains all the variation in educational attainment. The black dots show fifty people’s PGS and educational attainment; every single dot fits exactly on the red regression line, which shows the relationship between the PGS and education.

Now here’s a slightly less unrealistic polygenic score. This PGS is not so good at explaining people’s educational attainment: some of the people are above or below the red line. The R-squared is only 25%; three quarters of the variation in educational attainment is down to “the environment” (whatever that means — more on this later).

 

So, this polygenic score is a lot worse than the other one in terms of predicting education.

But if you look at the slope of the red line, there’s a surprise: it is exactly the same in both pictures. For both of these polygenic scores, if you score -2 you leave education at 16 on average, and if you score 2 you leave at 20 on average. There is a lot more random variation in the second picture, but the average effect of the polygenic score is the same.

In other words, the size of a variable’s effect is unrelated to the amount of variation it explains. This is not news to statisticians....
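The two made-up pictures can be reproduced with a short simulation. This is only a sketch, not the code behind the original figures: the seed, the evenly spaced PGS values, and the trick of forcing the noise to be exactly uncorrelated with the PGS are all assumptions made for illustration. With a slope of 1, the R-squared is Var(PGS) / (Var(PGS) + Var(noise)), so noise with three times the variance of the PGS gives 25%.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
n = 50                           # fifty people, as in the pictures
pgs = np.linspace(-2, 2, n)      # hypothetical PGS values from -2 to 2

def fit(x, y):
    """Return (slope, r_squared) of a least-squares line of y on x."""
    slope, _intercept = np.polyfit(x, y, 1)
    r_squared = np.corrcoef(x, y)[0, 1] ** 2
    return slope, r_squared

# Picture 1: education is exactly PGS + 18, with no noise at all.
edu_perfect = 18 + pgs

# Picture 2: same slope, plus "environmental" noise.
raw = rng.normal(size=n)
# Strip any chance correlation with the PGS so the fitted slope stays at 1.
resid = raw - np.polyval(np.polyfit(pgs, raw, 1), pgs)
# Rescale the noise to three times the PGS variance: R^2 = 1 / (1 + 3).
noise = resid * np.sqrt(3 * pgs.var() / resid.var())
edu_noisy = 18 + pgs + noise

slope_perfect, r2_perfect = fit(pgs, edu_perfect)
slope_noisy, r2_noisy = fit(pgs, edu_noisy)
print(f"perfect PGS: slope={slope_perfect:.2f}, R^2={r2_perfect:.2f}")
print(f"noisy PGS:   slope={slope_noisy:.2f}, R^2={r2_noisy:.2f}")
```

Both fits have a slope of 1 year of education per unit of PGS. Only the scatter around the line, and therefore the R-squared, differs.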

Now let’s see some real data, from our trading genetics paper (simply because I have it to hand).

 

Here we’ve divided the sample up by deciles of the polygenic score for educational attainment, and simply plotted the proportion of each decile that went to university.

Less than 20% of the bottom decile went to university. Almost 50% of the top decile did. These differences are huge.
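A decile plot like this takes only a few lines to compute. The sketch below uses simulated data, not the paper's sample: the sample size, the logistic link, and its coefficients are invented, and `share_by_decile` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed (assumption)
n = 10_000                       # hypothetical sample size

pgs = rng.normal(size=n)         # simulated, standardised PGS
# Simulated outcome: the chance of going to university rises with the PGS.
# The logistic form and both coefficients are invented for illustration.
p_uni = 1 / (1 + np.exp(-(-0.8 + 0.45 * pgs)))
went_to_uni = rng.random(n) < p_uni

def share_by_decile(score, outcome):
    """Share with outcome == True in each decile of score, lowest first."""
    ranks = score.argsort().argsort()     # rank of each person, 0 .. n-1
    decile = ranks * 10 // len(score)     # ten equal-sized groups, 0 .. 9
    return np.array([outcome[decile == d].mean() for d in range(10)])

shares = share_by_decile(pgs, went_to_uni)
for d, s in enumerate(shares, start=1):
    print(f"decile {d:2d}: {s:.0%} went to university")
```

Plotting `shares` against the decile number gives a chart of the same form as the one described above. The simulated percentages carry no evidential weight.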

Polygenic score skeptics will correctly point out that not all of the difference is caused by the polygenic score! People with different PGS have parents with different PGS, live in different neighbourhoods, and have many other aspects of their environment that correlate with their PGS. This is true and very important to understand, and the work done recently to separate out causal effects from correlated noise is also very important.

We can measure the true effect of this PGS by looking at pairs of siblings. Because people’s genes are randomly allocated from their parents’ genes, differences between siblings are a true natural experiment. That doesn’t mean that siblings with different PGS won’t have different environments. They will! But any systematic differences in their environments will be — must be — caused by their genetics, interacting with their parents, their school environment, et cetera. Between siblings, the genetic differences come first in the causal chain.

When we do this in our sample, about half the effect of the PGS goes away. That’s pretty typical, and roughly fits the discovery that half of the differences in individual PGS’s are themselves capturing things about the shared environment.
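The logic of the sibling comparison can be sketched with simulated families. Everything here is an assumption for illustration, not the paper's model or data: the causal effect of 0.5, the variance split, and the family environment are set so that the naive slope is about twice the within-family slope. The sketch also leaves out the gene-driven environmental differences between siblings mentioned above.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed (assumption)
n_fam = 20_000                   # hypothetical number of sibling pairs
true_effect = 0.5                # invented causal effect of the PGS

# Each sibling's PGS = a component shared within the family
# + an independent random draw (the "genetic lottery").
family_part = rng.normal(0, np.sqrt(0.5), n_fam)
pgs = family_part[:, None] + rng.normal(0, np.sqrt(0.5), (n_fam, 2))

# Shared family environment, correlated with the family's genetic component.
family_env = family_part + rng.normal(0, 1, n_fam)

# Outcome: causal PGS effect + shared environment + individual noise.
edu = true_effect * pgs + family_env[:, None] + rng.normal(0, 1, (n_fam, 2))

def slope(x, y):
    """Least-squares slope of y on x."""
    return np.polyfit(x, y, 1)[0]

# Naive estimate: pool everyone, ignoring family membership.
naive = slope(pgs.ravel(), edu.ravel())
# Sibling estimate: regress the education gap on the PGS gap within pairs.
within = slope(pgs[:, 0] - pgs[:, 1], edu[:, 0] - edu[:, 1])

print(f"naive slope:  {naive:.2f}")
print(f"within slope: {within:.2f} (true effect {true_effect})")
```

Differencing within pairs removes everything the siblings share, so the within-family slope comes out close to the assumed causal effect while the naive slope is inflated by the correlated family environment.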

The remaining effect is the true causal effect of the PGS. It’s now smaller, but is it small?...



 


