LessWrong, 19 September 2024
Intention-to-Treat (Re: How harmful is music, really?)

 

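Intention-to-treat (ITT) analysis is a statistical method for dealing with non-compliance, a common problem in studies where participants fail to follow the study protocol as planned. Using a study of music's effect on mood as an example, this article shows how ITT analysis helps researchers assess the true effect of an intervention more accurately and avoid misleading results caused by confounding.

🤔 **The non-compliance problem:** Participants in a study may not follow the protocol as planned; in the study of music and mood, for example, the participant may not listen to music on every day it was scheduled. Non-compliance biases the results, because the intervention actually received no longer matches the randomly assigned one.

💡 **Intention-to-treat analysis:** ITT analysis addresses non-compliance by analysing according to the original random assignment rather than according to who actually received the intervention. Participants who did not follow the protocol are still counted in the group they were originally assigned to.

📊 **Findings:** In the study of music and mood, the ITT analysis found that, even if music may slightly affect mood in some cases, its overall effect on mood was not significant. This suggests that the positive effect of music on mood observed earlier may have been caused by confounding rather than by music itself.

⚠️ **Limitations of ITT analysis:** ITT analysis deals with non-compliance, but it has limitations. For example, it can underestimate the true effect of an intervention, because it includes all participants in the analysis regardless of how much of the intervention they actually received.

🌟 **Conclusion:** ITT analysis is an important statistical method that helps researchers assess the true effect of an intervention more accurately. When dealing with non-compliance, researchers should consider using ITT analysis and weigh its pros and cons.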

Published on September 18, 2024 6:44 PM GMT

I have long wanted to write about intention-to-treat because it's such a neat idea, and the recent article How harmful is music, really? spurred me to finally do it.


The reported results were

| Day type | Mean mood |
|---|---|
| Music | 0.29 |
| No music | 0.22 |

Making some very rough assumptions about variation, this difference is maybe 1–2 standard errors away from zero, which could be considered weak evidence that music improves mood.

Except!

There is one big problem with this approach to analysis. Although the experiment started off in a good direction with picking intended music days at random, it then suffered non-compliance, which means the actual days of music are no longer randomly selected. Rather, they are influenced by the environment – which might also influence mood in the same direction. This would strengthen the apparent relationship with no change in the effect of music itself.

The solution is to adopt an intention-to-treat approach to analysis.

Illustrating with synthetic data

I don’t have access to the data dkl9 used, but we can create synthetic data to simulate the experiment. For the sake of this article we’ll keep it as simple as possible; we make some reasonable assumptions and model mood as

This is a bit dense, but it says that our mood at any given time (g_i) is affected by four things:

Here's an example of what an experiment might look like under this model. The wiggly line is mood, and the bars indicate whether or not we listen to music each day. (The upper bars indicate listening to music, the lower bars indicate no music.)

The reason we included the situation s_i as a separate term is that we want to add a correlation between whether we are listening to music and the situation we are in. This seems sensible – it could be things like

The model then simulates 25 % non-compliance, i.e. on roughly a quarter of the days we do not follow the random assignment of music. This level of non-compliance matches the reported result of 0.5 correlation between random music assignment and actual music listening.
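The link between 25 % non-compliance and a 0.5 correlation can be checked with a short simulation. This is only a sketch under simplifying assumptions that are not taken from the model above: the plan is a fair coin flip each day, and each day deviates from the plan independently with probability 0.25. The function names are made up for illustration. Under those assumptions the theoretical correlation is 1 − 2 × 0.25 = 0.5.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_compliance(n_days, flip_prob, rng):
    """Return (planned, actual) 0/1 lists; each day deviates from the plan
    with probability flip_prob (assumed independent of everything else)."""
    planned = [rng.randint(0, 1) for _ in range(n_days)]
    actual = [1 - p if rng.random() < flip_prob else p for p in planned]
    return planned, actual

rng = random.Random(0)  # fixed seed, purely for reproducibility
# Many more days than the real experiment, so the estimate is stable.
planned, actual = simulate_compliance(n_days=100_000, flip_prob=0.25, rng=rng)
r = pearson(planned, actual)
print(f"correlation between plan and actual listening: {r:.2f}")
```

With this many simulated days the printed correlation should land very close to 0.5.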

When we continue to calibrate the model to produce results similar to those reported in the experiment, we get the following constants and coefficients:

The model then results in the following moods:

| Day type | Mean mood |
|---|---|
| Music | 0.29 |
| No music | 0.20 |

We could spend time tweaking the model until it matches perfectly[2] but this is close enough for continued discussion.

The very alert reader will notice what happened already: we set the coefficient for music to zero, meaning music has no effect on mood at all in our model! Yet it produced results similar to those reported. This is confounding in action. Confounding is responsible for all of the observed effect in this model.

This is also robust in the face of variation. The model allows us to run the experiment many times, and even when we have configured music to have no effect, we get an apparent effect 99 % of the time.
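To get a feel for this kind of repeated experiment, here is a minimal sketch. It does not use the calibrated model from this article; it assumes a much simpler one in which mood is a daily average, the situation is a coin flip that lifts mood, and on half of the days the situation rather than the plan decides whether music plays (which gives 25 % non-compliance). All parameter values and function names are illustrative assumptions.

```python
import random

def simulate_experiment(rng, n_days=31, beta_music=0.0, beta_situation=0.2,
                        baseline=0.15, noise_sd=0.12, follow_situation_prob=0.5):
    """Simulate one experiment; returns a list of (planned, actual, mood) per day.
    Hypothetical model: beta_music=0.0 means music has no effect on mood."""
    days = []
    for _ in range(n_days):
        planned = rng.randint(0, 1)    # randomised plan: 1 = music, 0 = silence
        situation = rng.randint(0, 1)  # 1 = a "good" day, which also lifts mood
        # On some days the situation, not the plan, decides whether music plays.
        if rng.random() < follow_situation_prob:
            actual = situation
        else:
            actual = planned
        mood = (baseline + beta_music * actual + beta_situation * situation
                + rng.gauss(0.0, noise_sd))
        days.append((planned, actual, mood))
    return days

def mean_difference(days, column):
    """Mean mood where days[column] == 1 minus mean mood where it == 0."""
    on = [d[2] for d in days if d[column] == 1]
    off = [d[2] for d in days if d[column] == 0]
    if not on or not off:
        return 0.0  # degenerate split; counted as "no apparent effect"
    return sum(on) / len(on) - sum(off) / len(off)

rng = random.Random(1)  # fixed seed, purely for reproducibility
runs = 2000
# Naive analysis: split by actual listening (column 1 of each tuple).
naive_diffs = [mean_difference(simulate_experiment(rng), column=1)
               for _ in range(runs)]
share_positive = sum(d > 0 for d in naive_diffs) / runs
mean_naive = sum(naive_diffs) / runs
print(f"mean naive difference: {mean_naive:.3f}")
print(f"share of runs with an apparent positive effect: {share_positive:.2f}")
```

Even though music does nothing in this sketch, the naive split by actual listening shows a positive difference in the large majority of runs, because music days are disproportionately good days.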

With the naïve analysis we have used so far, the correlation between mood and music is 0.26, with a standard error of 0.10. This indeed appears to be some evidence that music boosts mood.

But it's wrong! We know it is wrong, because we set the coefficient for music to zero in the model!

Switching to intention-to-treat analysis

There are two reasons for randomisation. The one we care about here is that it distributes confounders equally across both music days and non-music days.[3] Due to non-compliance, music listening days ended up not being randomly selected, but potentially confounded by other factors that may also affect mood.

Non-compliance is common, and there is a simple solution: instead of doing the analysis in terms of music listening days, do it in terms of planned music days. I.e. although the original randomisation didn't quite work out, still use it for analysis. This should be fine, because if music has an effect on mood, then at least a little of that effect will be visible through the random assignments, even though they didn't all work out. This is called intention-to-treat analysis.[4]
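Why a little of the effect stays visible, and why the confounder does not, can be made concrete with a small worked equation. It assumes a linear model that is only a guess at the structure, not the formula used for the simulations: mood $g_i = \alpha + \beta m_i + \gamma s_i + \varepsilon_i$, where $m_i$ says whether music was actually played, $s_i$ is the situation, $z_i$ is the random plan, and $\varepsilon_i$ is noise. The symbols $\alpha$, $\beta$, $\gamma$, $m_i$, $z_i$ and $\varepsilon_i$ are introduced here for illustration, and the numbers 0.75 and 0.25 assume the 25 % non-compliance is split evenly in both directions.

```latex
\begin{align*}
\mathbb{E}[g_i \mid z_i = 1] - \mathbb{E}[g_i \mid z_i = 0]
  &= \beta \bigl(\mathbb{E}[m_i \mid z_i = 1] - \mathbb{E}[m_i \mid z_i = 0]\bigr) \\
  &\quad + \gamma \underbrace{\bigl(\mathbb{E}[s_i \mid z_i = 1] - \mathbb{E}[s_i \mid z_i = 0]\bigr)}_{=\,0 \text{ by randomisation}} \\
  &= \beta \,(0.75 - 0.25) = 0.5\,\beta .
\end{align*}
```

The noise term drops out for the same reason as the situation term: the plan is random, so it is independent of both. Under these assumptions the planned-day comparison shows zero when $\beta$ is zero and half of $\beta$ otherwise.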

In this plot, the lighter bands indicate when we planned to listen to music, and the darker bands when we actually did so.

With very keen eyes, we can already see the great effect of confounding on mood. As a hint, look for where the bars indicate non-compliance, and you'll see how often that corresponds to big shifts in mood.

When looking at mood through the lens of when we planned to listen to music, there is no longer any meaningful difference.

| Day type | Mean mood |
|---|---|
| Music planned | 0.24 |
| Silence planned | 0.23 |

Correlation: 0.03
Standard error: 0.03

Thus, when we do the analysis in terms of intention-to-treat, we see clearly that music has no discernible effect on mood. This is to be expected, because we set the coefficient for music to zero after all, so there shouldn't be any effect.
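The change of analysis is mechanically tiny: the same mood measurements are grouped by the plan instead of by what actually happened. The sketch below shows this on a made-up eight-day log. The numbers are hypothetical and chosen so that mood depends only on whether the day was good (0.3) or bad (0.1), never on music; the field and function names are also just for illustration.

```python
def group_means(days, key):
    """Mean mood for days where `key` is 1 and for days where it is 0."""
    on = [d["mood"] for d in days if d[key] == 1]
    off = [d["mood"] for d in days if d[key] == 0]
    return sum(on) / len(on), sum(off) / len(off)

# Hypothetical log (made-up numbers). Day 3 skipped planned music on a bad
# day; day 6 played unplanned music on a good day.
days = [
    {"planned": 1, "actual": 1, "mood": 0.3},
    {"planned": 1, "actual": 1, "mood": 0.1},
    {"planned": 1, "actual": 0, "mood": 0.1},
    {"planned": 1, "actual": 1, "mood": 0.3},
    {"planned": 0, "actual": 0, "mood": 0.1},
    {"planned": 0, "actual": 1, "mood": 0.3},
    {"planned": 0, "actual": 0, "mood": 0.3},
    {"planned": 0, "actual": 0, "mood": 0.1},
]

# Naive analysis: group by what actually happened.
naive_music, naive_silence = group_means(days, "actual")
# Intention-to-treat analysis: group by the random plan.
itt_music, itt_silence = group_means(days, "planned")

print(f"naive: {naive_music:.2f} vs {naive_silence:.2f}")
print(f"ITT:   {itt_music:.2f} vs {itt_silence:.2f}")
```

On this toy log the naive split gives 0.25 versus 0.15, while the split by plan gives 0.20 versus 0.20: the two non-compliant days are exactly the ones that create the apparent effect.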

The cost is lower statistical power

To explore the drawback of intention-to-treat analysis, we can adjust the model such that music has a fairly significant effect on mood. We will make music 4× as powerful as situation. 

This new model gives us roughly the same results as reported before when looking purely in terms of when music is playing:

| Day type | Mean mood |
|---|---|
| Music | 0.29 |
| No music | 0.21 |

On the other hand, if we look at it through an intention-to-treat lens, we see there is now an effect (as we would expect), although too small to be trusted based on the data alone.

| Day type | Mean mood |
|---|---|
| Music planned | 0.26 |
| Silence planned | 0.23 |

Correlation: 0.09
Standard error: 0.11

Remember that we constructed this version of the model to have a definitive effect of music, but because we are looking at it through an intention-to-treat analysis, it becomes harder to see. To bring it out, we would need to run the experiment not for 31 days, but for half a year!
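One rough way to see why the required duration grows so quickly is the textbook sample-size approximation for comparing two group means. The sketch below is not how the half-year figure was obtained. It assumes that only about half of a real effect shows up in the planned-day comparison (one reading of the 0.5 correlation between plan and actual listening), and the effect size and noise level are invented for illustration. The function name is hypothetical.

```python
from statistics import NormalDist

def days_needed(effect, noise_sd, alpha=0.05, power=0.8):
    """Total days (both groups together) needed to detect `effect` with a
    two-sided z-test, using the normal approximation for equal-sized groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    per_group = 2 * ((z_alpha + z_power) * noise_sd / effect) ** 2
    return 2 * per_group

true_effect = 0.08  # assumed effect of music on mood (illustrative)
noise_sd = 0.12     # assumed day-to-day spread of mood (illustrative)
dilution = 0.5      # assumed share of the effect visible through the plan

full = days_needed(true_effect, noise_sd)
itt = days_needed(dilution * true_effect, noise_sd)
print(f"perfect compliance: about {full:.0f} days")
print(f"intention-to-treat: about {itt:.0f} days ({itt / full:.1f}x as many)")
```

The absolute day counts depend entirely on the invented inputs, but the ratio does not: because the required sample grows with the inverse square of the effect, halving the visible effect roughly quadruples the number of days.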

Such is the cost of including confounders in one's data: they make experiments much more expensive by virtue of clouding the real relationships. Ignoring them does not make things better, it only risks producing mirages.

Brief summary of findings

To summarise, these are the situations we can find ourselves in:

| Analysis type | Significant effect | Non-significant effect |
|---|---|---|
| Naïve | Actual or confounder | Actual |
| Intention-to-treat | Actual | Actual or confounder |

In other words, by switching from a naïve analysis to an intention-to-treat analysis, we make confounders result in false negatives rather than false positives. This is usually preferred when sciencing.

  1. Actually, since the situation is based on days and there are six measurements per day, we might be able to infer this parameter from data also. But we will not.

  2. I know because we have something like 7 degrees of freedom for tweaking, and we only need to reproduce 5 numbers with them.

  3. The other purpose of randomisation is to make it possible to compute the probability of a result from the null hypothesis.

  4. This is from the medical field, because we randomise who we intend to treat, but then some subjects may elect to move to a different arm of the experiment and we can’t ethically force them to accept treatment.


