A question of rigour

“Sure, he that made us with such large discourse,

Looking before and after, gave us not

That capability and godlike reason

To fust in us unus’d.” – Hamlet

There is a scene that recurs in action movies, usually towards the climax, where a large vehicle (a train, bus, lorry or aeroplane, for example) careers out of control, smashing into one barrier after another, slowly disintegrating even as it demolishes the structures we expect to contain it. While this can be cathartic in an action movie, it is less uplifting when one encounters it in a report that cost £147,000 and was supposed to tell us something useful.

The EEF report on Accelerated Reader, published in February of this year, at first seems neither problematic nor particularly inspiring. A randomized controlled trial was run across four secondary schools, involving 349 pupils. The first sign of a possible derailment comes when we learn that the overall effect size was 0.24, equivalent to three months’ additional progress over the twenty weeks, or half a school year, during which the trial was run. This is a mediocre effect size; if we compare it to the findings of John Hattie’s massive meta-analysis, it does not even reach the ‘hinge’ of 0.4, the average impact of educational interventions. In other words, it does not meet Hattie’s criterion for being ‘worth investing in’.

This does not, however, prevent the EEF from declaring that AR is “an effective intervention”. This is consistent with other EEF studies, which have also scored around the 0.25 mark. One possible justification for grading such a result as effective is cost: at a cited £9 per pupil per year, the intervention is cheap to implement, even though it achieves only what the EEF admits is a “modest” effect.

To evaluate this claim, we would need to consider not only the £9 cost of the most basic licence type for each student, but also the other resources which the EEF states are required for implementation: computers or tablets, internet access, and a wide variety of books graded according to the Renaissance Learning ‘readability formula’. Schools in the trial bought books in order to set up Accelerated Reader, and paid staff to band and inventory them. This resourcing was paid for by the schools using funding from the trial. Yet although the EEF must have precise figures on this expenditure, no detail of these costs is given in the report.


We know that the overall funding for the trial was £147,000. In the absence of a breakdown of the project’s funding, one can only guess at the true cost of implementation. Assuming that training and evaluation cost £70,000, and that licences cost £9 per head for a total of roughly £3,140, the remainder – about £74,000 – must have gone on the resourcing described above. That works out at roughly £210 per pupil: more than twenty times the quoted figure, a discrepancy of well over 2,000%. So much for rigour.
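For what it is worth, the estimate can be laid out as a back-of-envelope calculation. The £70,000 figure for training and evaluation is my assumption, not a number reported by the EEF, so treat the result as a rough sketch rather than an audited cost:

    # Rough back-of-envelope estimate of the per-pupil cost of the AR trial.
    # The training/evaluation figure is an assumption; the rest comes from the report.
    total_funding = 147_000              # overall trial funding (pounds)
    pupils = 349                         # pupils in the trial
    licence_per_pupil = 9                # basic AR licence, pounds per pupil per year
    assumed_training_and_eval = 70_000   # assumption, not a reported figure

    licences = pupils * licence_per_pupil                               # ~3,141
    remainder = total_funding - assumed_training_and_eval - licences    # ~74,000
    per_pupil = remainder / pupils                                      # ~212

    print(f"Resourcing: about £{remainder:,.0f}, or roughly £{per_pupil:.0f} per pupil")

Even if the training and evaluation assumption is out by £20,000 in either direction, the per-pupil figure remains an order of magnitude above the quoted £9.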

There are, of course, other costs that researchers and schools often fail to include. The opportunity cost of taking students away from other lessons seems never to be considered, perhaps because it is difficult to estimate. In this case, students spent 180–200 minutes per week reading books with Accelerated Reader – in one school after lessons, in others during the school day. In some cases at least three lessons were being missed. Two questions arise:

  1. What was the impact on their learning in the subjects that they missed? Is it justified by the progress they made?
  2. Could this time have been used with greater impact on their reading through another intervention?

The first question is simply ignored, though the three months’ additional progress suggests they may as well have stayed in class. The second could not be answered because the design of the study simply compares the treatment group with a ‘business as usual’ control group. Such comparisons do not yield enough information to move educational practice forward. Why, when we have the money, the expertise and the opportunity, do we not compare interventions to see which is more effective? Rather, we have a trickle of data that has almost no impact but instead generates a slowly rising tide of antipathy towards research.

The study then lurches into collision with the most obvious problem we could have expected: what to do with children who can’t read independently? The teachers did what any self-respecting teacher would do: they helped the individual students with their reading. We are not told how many children were helped in this way, or how much. So we have no idea how much this one-to-one support, which was in addition to the AR intervention, contributed to the admittedly modest gains described. Having identified this problem, the report carries on without suggesting any solutions.

The last, unexpected part of the disintegration comes when we look at how the effect size is generated. The post-test is the NGRT Digital, a common, well-researched and statistically robust test. No problem there. What is the pre-test? Answer: none. Students were allocated to the intervention on the basis of their Key Stage 2 English points, and this measure alone has been used to establish comparability between the randomized groups. The effect size is simply the standardized difference between the mean post-test reading scores of the treatment and control groups.
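For readers unfamiliar with the mechanics, here is a minimal sketch of the kind of calculation involved – a standardized mean difference (Cohen’s d) computed from post-test scores alone. The report may use a slightly different estimator (Hedges’ g, say, or a covariate-adjusted model), but the essential point stands: with no pre-test, only the post-test means and spread enter the number.

    import numpy as np

    def effect_size(treatment_scores, control_scores):
        """Standardized mean difference (Cohen's d) between two groups,
        using the pooled standard deviation of their post-test scores."""
        t = np.asarray(treatment_scores, dtype=float)
        c = np.asarray(control_scores, dtype=float)
        n_t, n_c = len(t), len(c)
        pooled_sd = np.sqrt(((n_t - 1) * t.var(ddof=1) + (n_c - 1) * c.var(ddof=1))
                            / (n_t + n_c - 2))
        return (t.mean() - c.mean()) / pooled_sd

    # With no pre-test, any baseline imbalance that the Key Stage 2 scores
    # failed to capture feeds straight into this figure.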

The writers appear to argue that, because the randomization process made for comparability between the groups, no pre-test is required.[1] But one of the long-running complaints that secondary teachers have about Key Stage 2 scores is that they often seem to bear little or no relation to the children in front of us. Given the potentially distorting effects of pressures on schools, there is good reason to be wary of relying on these scores for anything – quite apart from the variations in teacher interpretation that can occur between individuals and schools. In other words, the grouping criterion is not reliable.

The report stresses that the NGRT was simple to administer as a post-test. Why, then, was it not used as a pre-test to ensure comparability between students? Among other things, it would have identified the students who were not able to read independently, removing the major problem of interpretation referred to above. Given the size of the budget, this cannot have been a cost-saving measure – or if it was, it suggests very poor judgement.

In short, the EEF’s report has provided little evidence to help schools decide whether to use Accelerated Reader. AR is, as far as we can tell, better than nothing, but may cost a lot more than it first appears. Far from revolutionizing education through evidence, this report and others like it appear to be creating opacity, not clarity. A cynic would say that four schools had found a way to fund extra books and computer tablets, a view that appears to be supported by the writers’ complaint of a lack of interest by any of the schools in analyzing and reporting on their own results. A more optimistic view is stated in the report, which opines that AR is a well-researched intervention and that this trial provides the largest and most reliable study so far.

I do not agree that AR is well-researched; it is certainly not well-evidenced. Nothing here convinces me that it is better than having a well-stocked library, with clear goals for each student, and the time to get them reading.

Notes:

  1. If they are arguing something different, I haven’t understood their point and am happy to be corrected.

3 Responses to A question of rigour

  1. ollieorange2 says:

    There are two different definitions of the Effect Size. Hattie mixes them up because he doesn’t know what he’s doing, but generally he uses the first version, for which better than 0.40 would be “good”, whereas the EEF use another definition, where better than zero is “good”. I have written about it here – https://ollieorange2.wordpress.com/2014/08/10/the-two-kinds-of-effect-size/

    Incidentally, at the first Research Ed conference, Prof Coe said that the 0.40 hinge point was “not entirely helpful” which I believe is polite, academic speak for a load of bollocks. It was the only thing we agreed about that day.

  2. Pingback: 8 DIY Steps to Build a Reading Culture | Horatio Speaks
