Science from both sides – the INTERVAL study

As a regular blood donor, I was intrigued when I was invited to take part in a study on the effects of blood donation frequency. Apparently there is not much solid data on what intervals between donations are safe for the donor. And the recommended guidelines differ significantly around the world.

The INTERVAL trial assessed the effects of different blood donation intervals. Over 45,000 participants were randomised for two years to 8-, 10-, or 12-week intervals for men, and to 12-, 14-, or 16-week intervals for women (I was an “8-weeker”). The results have now been published in the Lancet, and they make for interesting reading.

The first finding (and I have to say I didn’t realise this was even one of the study aims) was that increasing donation frequency significantly increased the amount of blood collected. Adherence to the assigned intervals was good, and participants donated much more than they had in the previous two years.

The impact on the health of the participants was what interested me most though. There wasn’t any change in self-reported general wellbeing measures. But “more frequent donation resulted in more donation-related symptoms (eg, tiredness, breathlessness, feeling faint, dizziness, and restless legs, especially among men…)”. And additional donations also led to “lower mean haemoglobin and ferritin concentrations, and more deferrals for low haemoglobin”. So donating very frequently isn’t exactly good for you, which does make sense.

So I was happy to take part, and pleased to read the results. From a quick read of the Lancet article, this seems like a well-designed and well-analysed study, and importantly one large enough to provide robust results on an important topic. If only more of medical science were like this (or indeed any science about humans…).

Irony is still alive

It shouldn’t come as a surprise that psychological studies on “priming” may have overstated their effects. It sounds plausible, for example, that reading words associated with old age might make someone walk more slowly afterwards, but as has been shown for many effects like this, they are nearly impossible to replicate.

Now Ulrich Schimmack, Moritz Heene, and Kamini Kesavan have dug a bit deeper into this, in a post at Replicability-Index titled “Reconstruction of a Train Wreck: How Priming Research Went off the Rails”. They analysed all the studies cited in Chapter 4 of Daniel Kahneman’s book “Thinking, Fast and Slow”. I’m a big fan of the book, so this was interesting to read.

I’d recommend everyone with even a passing interest in these things to go and read the whole fascinating post. I’ll just note the authors’ conclusion: “…priming research is a train wreck and readers […] should not consider the presented studies as scientific evidence that subtle cues in their environment can have strong effects on their behavior outside their awareness.”

The irony is pointed out by Kahneman himself in his response: “there is a special irony in my mistake because the first paper that Amos Tversky and I published was about the belief in the “law of small numbers,” which allows researchers to trust the results of underpowered studies with unreasonably small samples.”
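
Kahneman’s point about small samples is easy to demonstrate. Here is a minimal simulation sketch (my own illustration in plain NumPy, not from the post; the effect size and sample size are assumptions chosen to mimic a typical underpowered priming study). When the true effect is small and the groups are small, only a handful of studies reach significance, and the ones that do overstate the effect several-fold:

```python
import numpy as np

# The "law of small numbers" trap: simulate many underpowered two-group
# experiments and look only at the ones that "worked".
rng = np.random.default_rng(42)

true_effect = 0.2    # small true effect in standard-deviation units (assumed)
n_per_group = 20     # sample size typical of early priming studies (assumed)
n_studies = 10_000

significant_effects = []
for _ in range(n_studies):
    control = rng.normal(0.0, 1.0, n_per_group)
    primed = rng.normal(true_effect, 1.0, n_per_group)
    diff = primed.mean() - control.mean()
    se = np.sqrt(control.var(ddof=1) / n_per_group +
                 primed.var(ddof=1) / n_per_group)
    if abs(diff / se) > 2.0:  # roughly p < .05 for this sample size
        significant_effects.append(diff)

print(f"studies reaching significance: {len(significant_effects) / n_studies:.0%}")
print(f"mean effect among them: {np.mean(significant_effects):.2f} (true: {true_effect})")
```

With these numbers, only around one study in ten comes out “significant”, and the average effect among those studies is roughly three to four times the true one. Trusting a literature built this way means trusting exaggerations.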

So nobody, absolutely nobody, can avoid biases in their thinking.

Guest blog on “Academic collaboration – a startup point of view”

I was asked to write a guest blog for the University College London Centre for Behaviour Change’s Digi-Hub. My brief was to talk about collaboration between businesses and academia, in particular from the point of view of a small startup company like Club Soda.

My post, which is part of a longer series of guest blogs, deals with evidence, evaluation, and the tension that working across organisational boundaries can create.

You can read the post here.

Guest blog on “Behaviour change for pubs and bars”

I was asked to write something for the Society for the Study of Addiction about our Nudging Pubs work on changing the behaviour of pubs and bars.

My guest post was on the two theoretical foundations of our project: a taxonomy of behaviour change tools, and a typology of nudges. The first is a UCL-led project; the second is from Cambridge University’s Behaviour and Health Research Unit.

Read the post at SSA’s website.

A typology of nudges

We’re working on an assessment tool to use with pubs and bars. The tool is meant to measure how welcoming the venues are to their non-drinking (or “less-drinking”) customers. We have been pondering all the various factors we could include in the tool, and how to classify them.

I had met some people from the Behaviour and Health Research Unit (BHRU) at Cambridge, and they pointed me to their paper “Altering micro-environments to change population health behaviour: towards an evidence base for choice architecture interventions” in BMC Public Health. It might just help us get some of our ideas in order too.

The article has a nice typology for “choice architecture interventions in micro-environments”; I’ll just call them nudges from now on. There are nine types of nudges in this scheme:

    • Ambience (aesthetic or atmospheric aspects of the environment)
    • Functional design (designing or adapting equipment or the function of the environment)
    • Labelling (applying labelling or endorsement information to products or at the point of choice)
    • Presentation (sensory properties and visual design)
    • Sizing (changing product size or quantity)
    • Availability (adding or removing behavioural options)
    • Proximity (changing the effort required to engage with options)
    • Priming (incidental cues that alter non-conscious behavioural responses)
    • Prompting (non-personalised information to promote or raise awareness of a behaviour)

The first five types change the properties of objects or stimuli, the next two their placement, and the final two change both the properties and the placement.

I can see how we could use this as a basis for our thinking on the factors we want to measure pubs and bars on. For example, some basics like the range of non-alcoholic and low-alcohol drinks would be about Availability; the display of non-alcoholic drinks could be Presentation, Proximity, and also Priming; drinks promotions would be Prompting and Labelling; and staff training could perhaps count as Prompting too.
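
Since the typology is a simple two-level classification, it is easy to encode. Here is a hypothetical Python sketch of how our assessment tool might tag venue factors with nudge types; the nine types and their groupings come from the BMC Public Health paper, while the factor names and mappings are only the guesses from the paragraph above:

```python
# Nine nudge types from the BHRU typology, grouped by what they change
NUDGE_TYPES = {
    # change the properties of objects or stimuli
    "ambience": "properties",
    "functional design": "properties",
    "labelling": "properties",
    "presentation": "properties",
    "sizing": "properties",
    # change the placement of objects or stimuli
    "availability": "placement",
    "proximity": "placement",
    # change both properties and placement
    "priming": "both",
    "prompting": "both",
}

# Candidate factors for the pub/bar assessment, tagged with nudge types
# (hypothetical examples only, not a finished instrument)
VENUE_FACTORS = {
    "range of non-alcoholic and low-alcohol drinks": ["availability"],
    "display of non-alcoholic drinks": ["presentation", "proximity", "priming"],
    "drinks promotions": ["prompting", "labelling"],
    "staff training": ["prompting"],
}

# Which of the nine types do the draft factors not yet cover?
covered = {t for types in VENUE_FACTORS.values() for t in types}
print("types not yet covered:", sorted(set(NUDGE_TYPES) - covered))
```

A coverage check like the final line would show at a glance which types our draft metrics still miss.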

I can’t immediately think of anything that we couldn’t fit into the typology (although we might need some flexibility of interpretation!). Interestingly, when the Cambridge researchers reviewed the existing literature, they could only find alcohol-related nudges of the ambience, functional design, labelling, priming, and prompting types. And not many studies overall, especially compared to research on diet, which was the most popular topic for these types of nudges.

On the other hand, we could probably find at least one metric for each of the nine types of nudges, though they might not be the most interesting or important ones for this project. But it could still be a useful exercise to go through.

Progress with p values – perhaps

The American Statistical Association (ASA) has published its “statement” on p-values. I have long held fairly strong views about p-values, also known as “science’s dirtiest secret”, so this is exciting stuff for me. Drafting the statement involved 20 experts, “many months” of emails, one two-day meeting, and three months of draft statements, and the process was “lengthier and more controversial than anticipated”. The outcome is now out in The American Statistician, with no fewer than 21 discussion notes to accompany it (mostly from people involved from the start, as far as I can gather).

The statement is made up of six principles, which are:

  1. P-values can indicate how incompatible the data are with a specified statistical model.
  2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
  3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
  4. Proper inference requires full reporting and transparency.
  5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
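
Principles 3 and 5 are the ones I see ignored most often in practice. A quick simulation sketch (my own, using NumPy and SciPy; the effect sizes and sample sizes are arbitrary choices for illustration) shows why a p-value is not an effect size:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def median_p(effect, n, reps=200):
    """Median two-sample t-test p-value over repeated simulated experiments."""
    ps = []
    for _ in range(reps):
        a = rng.normal(0.0, 1.0, n)     # control group
        b = rng.normal(effect, 1.0, n)  # treatment group, mean shifted by `effect`
        ps.append(stats.ttest_ind(a, b).pvalue)
    return np.median(ps)

# A trivial effect with a huge sample is "highly significant"...
print("effect 0.02, n = 50,000 per group:", median_p(0.02, 50_000))
# ...while a much larger effect with a small sample typically fails to reach .05
print("effect 0.80, n = 10 per group:    ", median_p(0.80, 10))
```

The first result is statistically “significant” but practically meaningless, while the second may well matter but would be dismissed under a rigid threshold. That is exactly the mistake principles 3 and 5 warn against.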

I don’t think many people would disagree with much of this. I was expecting something a bit more radical – the principles seem fairly self-evident to me, and don’t really address the bigger issue of what to do about statistical practice. That question is addressed in the 21 comments though.

It probably says something about the topic that it needs 21 comments. And that’s also where the disagreements come in. Some note that the principles are unlikely to change anything. Some point out that the problem isn’t with p-values themselves, but the fact that they are misunderstood and abused. The Bayesians, predictably, advocate Bayes. About half say updating the teaching of statistics is the most urgent task now.

So it is a decent statement as far as it goes, in acknowledging the problems, but it offers little in the way of constructive ideas on where to go from here. Some journals have banned p-values altogether, which sounds like a knee-jerk reaction in the opposite extreme direction. I’d just like to see poor old p downgraded to one of the many statistical measures to consider when analysing data: never the main one, and definitely not the deciding factor in whether something is important or not. I may have to wait a bit longer for that day.