Published on July 21, 2025 4:18 PM GMT
I recently came across a 2024 update of a 2018 book making the still-controversial case that hormone replacement therapy (HRT) after menopause is highly beneficial and that rumors of its risks are unfounded, or at least greatly exaggerated.
According to their narrative (apparently not contested), HRT started with the physiologically reasonable idea that menopausal symptoms could be treated with estrogens (which at the time were extracted from pregnant horse urine). Early observational epidemiological studies reported that taking estrogen during menopause was associated with considerable benefits and minimal risks, leading to its widespread use by menopausal women. Then a huge randomized controlled trial (RCT) by the Women's Health Initiative (WHI) famously overturned that prevailing wisdom, debunking the purported benefits and establishing considerable risks. Publication of those conclusions in 2002 led to a drastic reduction in the use of HRT. This segment of the history of HRT became a stock example of the perils of junk science: the fallacy of drawing conclusions from anecdotes or mistaking correlation for causation; the triumph of RCTs in exposing the real truth; and the curious resistance of people to updating their beliefs in the face of iron-clad evidence to the contrary.
However, the aforementioned book goes on to say (controversially) that the conclusions of the famous WHI study were subsequently proven to be completely wrong. The bulk of the book is spent detailing the reasons these authors believe the previous "myth debunking" claims were themselves based on junk science: statistical errors, p-hacking, over-interpretation of preliminary results, selective reporting, making claims on the basis of non-significant results, and exaggerating clinical significance via the base rate fallacy or by reporting relative risks instead of absolute risks[1]. In summary, they argue that the WHI's report mostly reflected its authors' biases. The book claims that the WHI's own dataset does not support its conclusions, and that numerous subsequent studies consistently reproduce the benefits of HRT and fail to reproduce the risks. On their reading, the data squarely favor HRT as net beneficial for most women. Since 2002 the WHI has moderated its claims but still stands by its main conclusions. Unfortunately, I don't have confidence that the book's authors are free of bias, either. To make matters more complicated, there are accusations of ethical impropriety against both opponents and proponents of HRT.
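To make the relative-vs-absolute-risk point concrete, here is a toy calculation with invented numbers (these are not the WHI's actual figures): for a rare outcome, a headline-grabbing relative increase can correspond to a very small absolute one.

```python
# Hypothetical numbers, invented purely for illustration (not WHI data):
control_rate = 0.0010    # 10 events per 10,000 women in the control group
treatment_rate = 0.0013  # 13 events per 10,000 women in the treated group

relative_risk = treatment_rate / control_rate      # ~1.30, reportable as "30% higher risk"
absolute_increase = treatment_rate - control_rate  # 0.0003, i.e. 3 extra events per 10,000

print(f"Relative risk: {relative_risk:.2f}")
print(f"Absolute increase: {absolute_increase * 10000:.0f} extra events per 10,000 women")
```

Both numbers are arithmetically correct descriptions of the same data; the rhetorical difference between "30% higher risk" and "3 extra events per 10,000 women" is the point.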
When both sides of a debate accuse the other of bias and bad science, it can be hard to tell which experts are the junk scientists and which are the bastions of rationality[2]. In this case, there are people with serious credentials reaching opposite conclusions, each saying they are committed only to the truth, each citing scientific data which, if true, would seem to support their conclusions. Their arguments are not transparently irrational or unscientific. If one read either side's account alone, it would sound extremely convincing. I have not had time to dig into the primary literature or the formal critiques and rebuttals, but that seems like the minimum required to sort this out; diving into the datasets would be even better[3].
I wondered whether this question had been discussed on LessWrong, and found that it was discussed briefly, over a decade ago[4]. The discussion elicited interest on two levels: (1) abstractly, as a case study in how to make rational decisions about medical/health claims in the face of conflicting claims, insufficient evidence, snake-oil purveyors, etc.; and (2) concretely, as in "so what's the truth about HRT?" I take it that the first was the intended topic, but discussion of the second is partly informative in working out the first. At that time the views here seem to have been mostly aligned with the prevailing narrative that WHI was the voice of reason (HRT is dangerous and bad), and some spoke as if this were an open-and-shut case.
I nominate this topic for a re-boot as a case study in medical inference, scientific methodology, and science communication. It already has the status of a paradigm example of bad science. There is more than a century of data of various kinds, an extensive literature on scientific and inferential methodology, and multiple rounds of back-and-forth critique[5], and yet credible experts are still taking opposite positions on the basis of the same factual evidence. I am not confident that any account of the evidence is unbiased.
This could be an interesting test case for discussing:
- The complexity of the process by which scientific research iteratively refines what we take to be true, individually and as a scientific community
- The intrinsic strengths and limitations of different scientific and statistical methods
- The potential for bias to influence conclusions, the psychology and sociology behind this tendency, and what steps have been or can be taken to combat it
- The incentives for sensationalism in peer-reviewed scientific publication as well as science communication directed at the public
- The idea that vigorous debate (adversarial dialogue) promotes truth-finding due to each side exposing the weaknesses and flaws of the others' arguments (does it?)
- Epistemic humility: calibrating the confidence with which we hold and communicate our current understanding of facts, especially as intelligent, informed non-experts
What has particularly struck me in reading on this topic is the degree to which statements of fact or logic are often interwoven with rhetorical style that serves to sway opinion by means other than reason. This can be as subtle as writing with a tone of definitive finality or condescension, or as blatant as ridiculing opponents' positions. I suspect writers are often unaware they are doing this, probably including myself.
My first thought is: it would be beneficial to regularly edit with an eye to detecting and eliminating such rhetorical devices, so that presentations of evidence and logic can be as epistemically clean as possible. Admittedly this could make the prose dry, as such devices are among the best tools for engaging interest. But striving to write this way could help keep both the reader's mind and the writer's own thinking unbiased as they process information. On the other hand, those rhetorical devices are useful "tells" about a writer's biases, in the absence of which it would be even harder to detect the potential slant in their presentations of facts. Finally, one would not want to eliminate expression of value judgements or emotions; just be more explicit about when one is doing so, and keep it segregated from fact statements.
[1] As a newcomer to this forum, it appears the readership is well-educated on these points, but I don't know whether you have standard go-to links for these concepts. Let me know if citations or links are wanted here.
[2] I notice a tendency to feel that the most recent "debunking" is the most likely to be correct, but on reflection this is clearly not a good criterion. If each round of a debate took into consideration and responded to all the evidence and arguments of previous rounds in an objective, dispassionate way, this would be true. But any rebuttal may be driven in part by attachment to a prior position, leading to a slanted re-assessment in ways that are not necessarily obvious to a reader.
[3] Would it be feasible for the data from all past studies to be de-identified, blinded, and combined into one huge dataset, such that people could analyze it using whatever stratifications or modifiers they wanted, but commit to their conclusions before knowing which group was the placebo group? Or is that intrinsically impossible?
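As a toy sketch of the blinding step such a pooled re-analysis might use (all names and data structures here are my own invention, and real trial data would need far more careful handling):

```python
import random

def blind_arm_labels(records, seed=12345):
    """Replace real arm labels (e.g. 'treatment'/'placebo') with opaque codes.

    Analysts could then run whatever stratified analyses they like and
    pre-commit to conclusions before the key is revealed.
    """
    rng = random.Random(seed)
    arms = sorted({r["arm"] for r in records})        # distinct real labels
    codes = [f"group_{i}" for i in range(len(arms))]  # opaque replacement codes
    rng.shuffle(codes)
    key = dict(zip(arms, codes))  # the key would be held in escrow by a third party
    blinded = [{**r, "arm": key[r["arm"]]} for r in records]
    return blinded, key

# Toy pooled dataset spanning two hypothetical studies
records = [
    {"study": "A", "arm": "placebo", "outcome": 0},
    {"study": "A", "arm": "treatment", "outcome": 1},
    {"study": "B", "arm": "placebo", "outcome": 1},
]
blinded, key = blind_arm_labels(records)
```

The real obstacles are presumably not the relabeling itself but de-identification, data-sharing agreements, and harmonizing outcome definitions across studies.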
[4] Examples to this effect on LessWrong include:
- Dealing with the high quantity of scientific error in medicine
- Sequence Announcement: Applied Causal Inference
- Why we need better science, example #6,281

and the comment threads thereof.
[5]