We are oft to blame in this –
’Tis too much proved – that with devotion’s visage
And pious action we do sugar o’er
The devil himself. – Hamlet
There is an unfortunate convergence of interests becoming apparent in the recent research gold rush, one which raises questions about the quality of the educational debate – or, more accurately, about the quality of the research input into the debate about where education should be going.
Consider, for example, the Education Endowment Foundation. The government has invested considerable funds via the EEF to run randomized controlled trials. One of the RCT evaluations recently released by the EEF was for a programme called Switch-On Reading. What is not apparent in the headline, but appears later in the report, is that the programme is in fact a repackaging of Reading Recovery, which is now being aimed at students at the transition between Key Stages 2 and 3.
Its provenance does not automatically make Switch-On Reading bad. The key question the study needs to answer is whether the intervention is effective enough to justify both its financial cost and its opportunity cost. Financially, the authors of the report calculate that it costs £672 per pupil to deliver. That’s expensive by some standards, depending on what we compare it with, but it is cheap compared to the cost of a student leaving school unable to read fluently. In terms of opportunity cost, the programme consists of 20-minute tuition sessions daily over ten weeks (roughly fifty sessions), a total of 1,000 minutes or about 16 hours. The 20 minutes a day can probably come out of form/tutor/registration time in the mornings. Ten weeks of that isn’t going to destroy anyone’s educational life chances. If the programme is effective, the costs are bearable.
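For what it is worth, the arithmetic is easy to check. Here is a back-of-the-envelope sketch in Python, using the figures quoted above; the assumption of five sessions per school week is mine, not the report’s:

```python
# Back-of-the-envelope figures for Switch-On Reading, using the numbers quoted above.
# Assumption (mine, not the report's): five sessions per school week.
SESSION_MINUTES = 20
SESSIONS_PER_WEEK = 5
WEEKS = 10
COST_PER_PUPIL_GBP = 672

total_minutes = SESSION_MINUTES * SESSIONS_PER_WEEK * WEEKS  # 1,000 minutes
total_hours = total_minutes / 60                             # roughly 16.7 hours

print(f"Time per pupil: {total_minutes} minutes (about {total_hours:.1f} hours)")
print(f"Cost per pupil: £{COST_PER_PUPIL_GBP}")
print(f"Cost per hour of tuition: about £{COST_PER_PUPIL_GBP / total_hours:.0f}")
```

On those assumptions the tuition works out at roughly £40 per pupil per hour, which is the figure against which any claimed benefit has to be weighed.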
So how effective is Switch-On Reading? The evaluators decided that it has an effect size of 0.24. What does this mean? The EEF tells us that an effect size of 0.6 can be compared to about a year’s progress. So this programme delivers about four months’ progress, in daily sessions over two and a half months. John Hattie, in his mega-meta-analysis, has suggested that the average effect size of an educational intervention is about 0.4, and argues that we should be focusing on interventions that have an impact above this threshold.
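To make the arithmetic behind these comparisons explicit, here is a minimal sketch. It relies only on the rule of thumb quoted above (an effect size of 0.6 equated to roughly a year’s progress) and Hattie’s 0.4 hinge point, not on any official conversion table; ‘effect size’ here is taken to mean the standardised mean difference, i.e. the gap between the intervention and control groups divided by the spread of scores.

```python
# A rough reading of an effect size, using only the figures quoted above.

def effect_size(mean_intervention, mean_control, pooled_sd):
    # Standardised mean difference: the gap between group means, divided by
    # the spread (standard deviation) of scores. Shown here for reference.
    return (mean_intervention - mean_control) / pooled_sd

def months_of_progress(d, year_effect=0.6, months_per_year=10):
    # Assumption: 'a year's progress' is read as a ten-month school year;
    # twelve calendar months would give a slightly larger figure.
    return d / year_effect * months_per_year

switch_on = 0.24    # reported effect size for Switch-On Reading
hattie_hinge = 0.4  # Hattie's suggested average for educational interventions

print(months_of_progress(switch_on))                      # 4.0 months (school year)
print(months_of_progress(switch_on, months_per_year=12))  # 4.8 months (calendar year)
print(switch_on >= hattie_hinge)                          # False: below the hinge
```

Nothing in that sketch turns 0.24 into an impressive number: it remains well short of both the average intervention and a year’s progress.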
You might think from this that the intervention is less than mediocre and probably not worth implementing. But for some reason this is not the conclusion the evaluators reached. The EEF tells us that it had a ‘noticeable positive impact’ and ‘can be an effective intervention for weak and disadvantaged readers’. The authors justify this by pointing to the evidence that for some sub-groups, notably weaker readers, gains were above the average for the group. Unfortunately, the authors also admit that the sub-group scores are statistically unreliable.
It’s difficult to know what to make of this. Are the evaluators following a different set of criteria to the rest of us? Is there something in the fine print that means an effect size of 0.24 is actually quite pronounced? Or is effect size, as some argue, an educational irrelevance which they are reporting but don’t really care about? Are the sub-group scores to be relied on anyway, even though they are unreliable?
In the absence of any other evidence, one must presume the answer to each of these questions is ‘no’. At which point, one turns to the context to interpret the anomalies. The obvious observation is that Reading Recovery is looking to extend its reach into the secondary school market. Renaming the programme ‘Switch-On Reading’ and emphasizing its phonics component (when Reading Recovery is the mother of all Whole Language programmes) suggest that marketing is an important consideration.
This is not mere cynicism. Robert Slavin, the director of the Institute for Evidence in Education, writes a blog for the Huffington Post. On 17 July 2013, Slavin interviewed a fellow member of the Investing in Innovation (i3) network. The guest was Jerry D’Agostino, head of Reading Recovery’s i3-funded project. D’Agostino was invited to contribute his thoughts on sustaining the scale and longevity of the intervention. His comments make interesting reading. You can read the whole interview here, but these points are particularly pertinent:
It is important to ‘keep the brand fresh’.
It is important to ‘recognise that adaptations are needed’.
Despite these adaptations, ‘the framework is always the same’.
So it is reasonably clear that adjusting the programme to new trends in education, and adapting the ‘brand’, are standard strategies for Reading Recovery. Although Slavin says that 30-minute one-to-one lessons are non-negotiable, Switch-On Reading consists of 20-minute lessons. (Perhaps the reason for this ‘adaptation’ is that a 20-minute slot is easier for secondary schools to accommodate.)
But the most persistent feature of the way Reading Recovery is presented by its advocates is the gap between the claims made about its impact and the evidence in the reports. In the same interview, D’Agostino refers to the intervention’s ‘strong effect sizes’ and elaborates: ‘…we know that the outcomes are remarkable – most of the lowest-achieving first graders accelerate with Reading Recovery and reach the average of their cohort.’
Leaving aside how one defines ‘their cohort’, these broad claims do not match the evidence well at all. When D’Agostino comments: ‘Schools don’t necessarily hear about government funded initiatives that achieve high evidence standards according to the What Works Clearinghouse’, the reader may understandably infer that Reading Recovery has achieved such high evidence standards. In fact, the website of the What Works Clearinghouse, the US government-funded body intended to help make reliable research available to educators, describes the extent of evidence for Reading Recovery as ‘small’ for ‘potentially positive effects’. Indeed, of the 202 studies reviewed by the WWC, just three met its evidence standards. Three. Once again, the claims and the details do not stack up.
We can look further back. In the TES, on 28 August 2009, an article claimed that ‘Britain’s Reading Recovery initiative is one of the world’s best programmes for struggling readers, an international study has found’. That study, by the Institute for Evidence in Education, supposedly found that ‘Reading Recovery, which involves specially trained teachers working one-to-one with pupils using a mix of methods, is one of three for which there is strong evidence of effectiveness’. (The note on ‘mix of methods’ is a classic warning signal. It means: we have added a phonics component, because we know we have to be credible, but we’re not changing our foundations.) The director of the institute, Professor Robert Slavin, said the two (that’s two) UK studies in the report were very positive. The TES article described the findings as ‘a boost for the literacy scheme, which is at the heart of the government’s Every Child A Reader programme yet has been criticised by some academics who question its value for money’.*
However, in the body of the report, entitled Educators Guide: What Works for Struggling Readers, Reading Recovery is deemed to have an effect size of just 0.23 across eight studies. The authors write: ‘although the outcomes for Reading Recovery were positive, they were less so than might have been expected’. Once again, there is a mismatch between the media presentation and the details of the evidence.
Last year, Kevin Wheldall (Emeritus Professor at Macquarie University) wrote this blogpost about a report on Reading Recovery’s long-term effectiveness. The author of the report, Jane Hurry, is based at the Institute of Education in London, which Wheldall describes as ‘the British home of Reading Recovery’. The report is published by the Institute for Evidence in Education. Hurry makes broad claims of a similar nature to those above: ‘the programme is known to have impressive effects in the short term’ and ‘substantial gains’ are sustained to the end of primary school. Neither of these claims is substantiated by the report.
Wheldall highlights how the headlines in the study did not match the detail and were open to a quite different interpretation, concluding: ‘even the significant differences between the two groups in the Reading Recovery Schools and the non-Reading Recovery school are accompanied by only small effect sizes, all of which are below Hattie’s hinge value of 0.4’. Once again, the substance is different from the marketing.
Is there a pattern here? It seems obvious to me. Is it a conspiracy of some kind? A conspiracy is entirely unnecessary to bring about a favourable slant in a report on a weak set of data. But I do think there may be a convergence of interests here, where research findings of ‘the right sort’ can be presented in one way in the headlines, while the fine print says something rather different. It is all about who writes the reports and who they are aligned with. For example – and this is not an aspersion on character, but a question about sympathies between organisations – a researcher from the Institute for Evidence in Education was seconded to the Education Endowment Foundation last year – a researcher who described himself at the National Literacy Trust conference in January as ‘Reading Recovery trained’. Is there a connection between this and the mismatch of headline and detail in the EEF report? I can only ask the question, but the patterns I have seen suggest that strong links between research groups can lead to some strange outcomes.
The bottom line remains the same: until teachers have sufficient knowledge, skills and concern to evaluate research rigorously, educational researchers will continue to present findings in ways which may well serve converging interests, but perhaps not the unvarnished truth.
*[Yes, that’s correct – Every Child A Reader, a scheme costing many millions of pounds, had as its core intervention Reading Recovery. ECAR was indeed criticized for cost in February 2009, just six months before the IEE report found Reading Recovery’s results ‘positive’. Policy Exchange expressed concern that the government had committed £144 million to a national roll-out of the scheme before any independent evaluations had been completed of the initial trials. Both Policy Exchange and the independent report eventually published via the DfE in 2011 found that the Reading Recovery element of ECAR did not provide sufficient impact to justify its high ongoing costs. Interestingly, the independent evaluation, most of whose authors were from the National Centre for Social Research, does not contain a single effect size measurement. Positive effects on reading are described either in percentage points (a meaningless measure without information on spread) or as assessments of ‘good’ or ‘very good’ by teachers.]
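To see why a gain reported only in percentage points says so little, consider a toy illustration (the numbers below are invented purely for the example): the same five-point gain corresponds to a small effect if scores are widely spread, and a very large one if they are tightly bunched.

```python
# Toy illustration: a gain in percentage points means little without the spread.
# All numbers below are invented for the sake of the example.

def effect_size(gain, spread_sd):
    # Standardised mean difference: the gain divided by the standard deviation of scores.
    return gain / spread_sd

gain_in_points = 5.0

print(effect_size(gain_in_points, spread_sd=25.0))  # 0.2: a small effect
print(effect_size(gain_in_points, spread_sd=5.0))   # 1.0: a very large effect
```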