The peculiarities of randomised controlled trials in education

For as long as anyone can remember, there have been calls in educational research to do more randomised controlled trials. The RCT is often heralded as the gold standard for the field. One reason for this is that educationalists believe that the rise of RCT methodology (beginning in force in the 1940s) is what created modern medicine, a far more successful discipline than education has ever been. It is obvious enough what potential RCTs have for education. They are underused and we lack the resources and expertise in education to put that right quickly. I am all for RCTs. But there are deep differences between the application of RCTs in medicine and education that mean that this isn’t the fix that will finally modernise education or bring it in line with other more ‘scientific’ endeavours. Education is a speciality all of its own. Here are four big reasons why RCTs are different in education than elsewhere…

Flowchart of the Phases of a Randomised Controlled Trial.
Adapted by PrevMedFellow from CONSORT 2010. CC BY-SA 3.0

(1) In education, RCTs are the research, full stop

Medical scientists usually implement RCTs to test how we can apply what is already known. An RCT tests, in practice, an intervention for which we already have theoretical reasons. It is conducted at the end of a project to check that the benefits of the intervention outweigh its unwanted side-effects. The RCT may lead to advances in theoretical knowledge, often accidentally, but it is not the primary method of uncovering causal mechanisms and increasing medical understanding. We have other routes and methods to do that (usually).

It is very different in education. Educationalists design a research study from the beginning to be an RCT. It isn’t an add-on. It isn’t about finding out whether there are any unintended side-effects. It isn’t finding out whether our theory works in practice. In education, our RCT is our primary evidence. It is what tells us that an intervention is any good. Our theoretical knowledge is based on this; it is not prior to our tests.

We may have good reasons for thinking that an intervention will work in education, and we may even call these reasons ‘theory’. And we may point to other evidence that supports this theory. But this evidence is largely of a very similar kind to the RCT (but without the robustness of the conditions): we compare (quantitatively or qualitatively) what the world is like with or without an intervention. In the crudest of cases, we ask someone what the intervention brought about. All of this can be very good evidence, but it is not of a different kind to the RCT. It is just a weaker version of it, a pilot run of the RCT. In medical science, the theory connects our intervention to a wealth of evidence of different kinds. We solve an issue in medicine from many lines of attack. The RCT is strong because it is supported by a history of many different kinds of experiments and theoretical gains.

(2) In education, we don’t deliver the same jab in everyone’s arm

Medical interventions do not work on everyone in the same way, which is why we have to turn to the RCT to look at how well an intervention works broadly. The same is true of education: children respond differently to interventions. But there is another spanner in the works in education. The medicine itself is different every time we deliver it.

Again, I’m simplifying. There are treatments in medicine that are not always identically delivered. There are some interventions that can perhaps be homogenised fairly well in education. These tend not to be pedagogical initiatives, which are highly dependent on the teacher. We might think of ‘extending the school day to run from 8.30 to 4.00’ as an identical intervention for all schools, but we still might suspect that it’s not just the environment and context that make the intervention have different results, but the way it is implemented. In education, we don’t just have to consider the differences in the places in which we are applying an intervention (which is primarily what the RCT is designed to deal with) but the differences in the intervention itself (which the medical RCT was not originally designed to tackle).

(3) In education, knowledge from interventions is incredibly fleeting

One treatment is (more usually) replaced in medicine because we have found a better one, and not (usually) because it is no longer effective because the world has changed. We do design a new flu jab each year, but this is a minor change in comparison to what happens in education, where an intervention is very much stuck in the educational culture and times in which it is applied.

Progress in education is a very different thing than in science because it is not so obviously based on timeless knowledge. It doesn’t make sense any more for us (in the UK) to consider the impact of caning, of educating girls with boys, of educating all 15-year-olds, or of providing the opportunity to gain the same qualifications to as many children as possible. Such advances in education were not truly based on knowledge but on values. The changes they have wrought mean that interventions that used to be meaningful no longer are. We don’t replace them because we have better interventions. They are just not relevant, right or valid. The medical world sees radical change in this way too, of course, but it is able to progress more linearly (even if not entirely so) for more of the time. An old flu vaccine is still recognised as an intervention. Standing a child in a corner with a “DUNCE” sign around their neck (my gran had messed up her sewing) is not.

(4) In education, we have a bigger mission that we don’t know how to measure

An RCT requires a measurement of sorts in order to compare different treatment groups. In medicine, this is a (relatively) simple matter (usually). We want to know if the disease or condition goes away or subsides, how long the patient lives for, etc. At first sight, it looks similarly simple in education: how well does the child do in examinations? The test score is the substitute measure we use for RCTs. And it simply won’t do. Not in the long run. Education (staff told me once in a survey I conducted as a trainee teacher) is about “preparing children for life”. We can’t measure test scores and treat them as a proxy for life-preparedness. We have a lot to do in education apart from qualifications.

The irony here is that, although RCTs in medicine are used to identify unwanted side-effects, in education they can be contributing to them. There are huge side-effects of driving education by examination results and we only have crude ways of incorporating this into our RCTs (usually, by conducting interviews alongside our quantitative measures). The ethical dimension of RCTs in education is therefore profoundly different in the long term precisely because measuring what we care about is a trickier issue. This limitation should be written in bold, capital letters more often all over the place in education. We take the RCT from medicine without taking the mindset that goes with it: medical scientists are weighing up the benefits and the risks. Perhaps that mindset would be as important and useful to us in education as the RCT itself, or more so. The RCT in medicine measures the risks; we have no comparable procedure in education that is sufficiently worked out that it has a name. That’s because we need to build it for ourselves instead of stealing it from another field.

Education: a thing of its own

This is all just to say, of course, that the revolutions of education are not likely to be of the same kind as those of medicine. RCTs are needed in education very badly and we should do more of them. But we must recognise the challenges of our field by modifying RCT methodology and admitting our limitations very loudly. Educational RCTs aren’t going to improve life outcomes for children in the same way that medical ones have increased the quality and quantity of life for so many.