No stone unturned

Nor do we find him forward to be sounded,

But, with a crafty madness, keeps aloof,

When we would bring him on to some confession

Of his true state. – Hamlet

Just because Reading Recovery has a track record of inflating claims, and a poor record of solving reading problems in the long term, does that mean that everything about Reading Recovery is bad? It makes good sense to identify and give extra help to children who are falling behind, and who could argue with sessions which focus on “phonemic awareness, phonics, vocabulary, fluency, comprehension and composition”?

The problem is that the way Reading Recovery presents its programme is not always consistent with the substance. The description of lesson content above is a good example. For years Reading Recovery resisted calls, dating back to at least 1992, to add a significant phonics component. Once US and UK funding became dependent on ‘evidence-based approaches’ such as phonics, Reading Recovery began to describe its programmes as including phonics – but nowhere will you find a statement that this phonics is explicit, systematic, synthetic or linguistic. Likewise, the term ‘fluency’ can be interpreted in a number of ways. Exactly what skills they build to fluency, what their criteria for fluency are, and how they build it are not revealed to the uninitiated. Possibly it just means that students read a little faster than they did before.

So I was interested in Greg Ashman’s post Is Reading Recovery Like Stone Soup?, in which he queries a study on the effectiveness of that programme in the United States.  Ashman’s point is primarily that the RCT doesn’t tell us whether it is Reading Recovery techniques that cause the effect, or whether the improvement is due to other components of the delivery mechanism such as one-to-one instruction in general. He argues that a more scientific ‘fair test’ would control all the variables and manipulate them to identify which had the most impact. I began to write a comment, but it became too long – hence this post.

The first question I had was – just what is the effect? This is because there is a long history of Reading Recovery headlines reporting much more favourable impacts than the details of the reports actually show. (See for example A Convergence of Interests on this blog and Small Bangs for Big Bucks by Professor Kevin Wheldall).

A link in the comments took me to the second year of the study. Impacts in that year were described in the following way:

1. The experimental group, who received Reading Recovery tuition on top of regular classroom instruction, scored 14 points higher on the Iowa Test of Basic Skills than the control group, who received only regular classroom instruction. This equated to an effect size of 0.42 against the control group and an effect of 0.33 against first-graders nationally. The authors said that these were large gains for this type of study.

2. The results were ‘benchmarked against expected gains’ on the Iowa test battery for first graders, and it was found that the experimental group exceeded ‘expected gains’ by 3.03 points, which equates to about 1.4 months additional gain over the instructional period of five months.
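As a sanity check on the arithmetic: an effect size here is just a raw score difference divided by a standard deviation, so the reported figures let us back out the implied spread of scores. A minimal sketch (the standard deviation is inferred from the reported numbers, not stated in the study):

```python
# Back-of-envelope check on the reported figures. The SD below is
# inferred from the report's own numbers, not taken from the study.

def effect_size(mean_diff: float, sd: float) -> float:
    """Standardised mean difference: raw difference / standard deviation."""
    return mean_diff / sd

# A 14-point difference reported as an effect size of 0.42 implies
# a standard deviation of about 14 / 0.42 ≈ 33.3 points.
implied_sd = 14 / 0.42
print(round(implied_sd, 1))  # 33.3

# Consistency check: that SD reproduces the reported effect size.
print(round(effect_size(14, implied_sd), 2))  # 0.42
```

The 'expected gains' benchmark works the same way: 3.03 points described as 1.4 months of extra gain implies an assumed expected gain of a little over 2 points per month, which is worth keeping in mind when judging how large the benefit really is.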

The effect sizes reported are not poor, though they may be unremarkable. For comparison, the authors discuss weaker effects in ‘typical educational studies’. However, I see no reason to ignore John Hattie’s assertion that interventions with effect sizes below 0.4 probably aren’t worth investing in. Saying that there are lots of studies with worse results is not an argument that this intervention has impact.

Problems arise, though, when we dig a little deeper. First, as one of the commenters on the Stone Soup post pointed out, the gains still do not reach national averages despite the intensive nature of the intervention. Second, although this was not in the headlines of the study, the pre-test used was actually the Clay Observation Survey, a Reading Recovery instrument with six ‘sub-tests’ including letter naming and concepts about print. The post-test was the Iowa Test of Basic Skills. Although the authors justify the use of the Clay survey by describing its correlation with other tests, I am puzzled as to why they would not have used the ITBS as both pre- and post-test measure. It makes it more difficult to trust claims of ‘progress’ on the ITBS when only a post-test measure was taken. (This information only becomes apparent in the appendices – although if I have misread it in some way I am happy to be corrected.)

A further conundrum is that the sample for the RCT is restricted to the lowest-scoring eight students in each school. This tighter range shrinks the standard deviation of the experimental group, and since the effect size divides the raw gain by that standard deviation, a smaller denominator inflates the apparent effect size. So the effect size – as always – needs careful interpretation when considering how well the intervention might generalise across the population.
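The mechanism is easy to demonstrate: draw a sample only from the bottom of a score distribution and its standard deviation shrinks, so an identical raw gain yields a larger standardised effect size. A minimal simulation (all numbers are illustrative, not taken from the study):

```python
import random
import statistics

random.seed(42)

# Illustrative population of test scores (mean 100, SD 15) -- not study data.
population = [random.gauss(100, 15) for _ in range(100_000)]

# Restricted sample: only the lowest-scoring 5%, analogous to selecting
# the weakest eight readers in each school.
restricted = sorted(population)[:5_000]

sd_full = statistics.stdev(population)
sd_restricted = statistics.stdev(restricted)

raw_gain = 6.0  # the same raw improvement in both scenarios

print(f"SD, full population:   {sd_full:.1f}")
print(f"SD, restricted range:  {sd_restricted:.1f}")
print(f"effect size vs full-population SD:  {raw_gain / sd_full:.2f}")
print(f"effect size vs restricted-range SD: {raw_gain / sd_restricted:.2f}")
# The restricted-range SD is much smaller, so the second
# effect size is considerably larger for the same raw gain.
```

This is why an effect size computed against a restricted sample's own standard deviation cannot be read directly as a population-level impact.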

A fourth question that arises is whether the gains will be sustained, a long-standing concern with Reading Recovery programmes.  Measuring the longer term effects is one of the aims of the current study. Despite being in its second year, there is no publication of such evidence yet. If the study reaches its fourth year and then discovers that the short-term gains have not held up, the investment will have proven futile – and the many children who took part will have had little benefit from the injection of public funds.

Greg Ashman’s question about which variables are effective highlights a general weakness of RCTs, at least in education – it is difficult to control for all variables across large populations. As a result, the statistical power is offset, at least to some extent, by a loss of explanatory power. There is another way: it is possible to have the best of both worlds by complementing or preceding RCTs with single-subject quasi-experimental designs. These allow researchers to control variables much more tightly, and to demonstrate the importance of different variables by using reversal-to-baseline or multiple-baseline designs (see here for examples).

With respect to whether there is evidence for the effectiveness of one-to-one instruction and other variables, there is an interesting study by Camilli, Vargas and Yurecko (2003). They re-analysed the 2000 report of the National Reading Panel and concluded (amongst other things) that systematic phonics teaching, structured language teaching, and one-to-one instruction all had significant effects, and that these were additive – in other words, they could be combined to triple the effect of phonics instruction alone. There is considerable evidence already in existence regarding the instructional variables that can be managed to achieve greater student progress – but this research remains largely unknown or ignored by educators because its provenance as ‘instruction’ immediately makes it suspect.
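If those effects really are additive on a common scale, then the ‘tripling’ is just the sum of components of similar size. A toy illustration (the effect size values below are placeholders, not the estimates from Camilli, Vargas and Yurecko):

```python
# Toy illustration of additive effect sizes (placeholder values,
# not the estimates reported by Camilli, Vargas and Yurecko, 2003).
d_phonics = 0.25    # systematic phonics teaching alone
d_language = 0.25   # structured language teaching
d_tutoring = 0.25   # one-to-one instruction

# Under an additive model, the combined effect is simply the sum.
combined = d_phonics + d_language + d_tutoring
print(round(combined / d_phonics, 1))  # 3.0 -- 'triple' the phonics-only effect
```

The point is not the particular numbers but the model: if the components are independent and additive, combining them multiplies the impact of any one component taken alone.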

Should we be excited or skeptical about the latest claims for Reading Recovery? I am skeptical. Digging below the headlines with Reading Recovery studies always seems to yield the same problems: achievement below national averages, narrow selection of subjects, carefully worded descriptions that obfuscate the aspects of context and ‘visual information’ at the heart of the theoretical model underlying the intervention – and modest effect sizes which are talked up to sound more impressive than they are.

I am all for helping children to learn to read. So far, I can’t find any good evidence to suggest that Reading Recovery helps much.


4 Responses to No stone unturned

  1. 3rsplus says:

    The researchers presented a paper at the March 2015 meeting of the Society of Research on Educational Effectiveness. If you google for ‘SREE 2015 Final-Year Results from the i3 Scale-Up’ you should be able to navigate to it.

    Here are the Conclusions:

    The consistently large positive impacts of Reading Recovery under the i3 scale-up suggest that this relatively large investment has led to substantial improvements in the reading performance of many thousands of students across the nation. It also serves as a point of validation for the Investing in Innovation (i3) program model—the size of an investment in an educational intervention should be proportionate to its prior evidence of effects.

    What the conclusions don’t say is that the maximum difference of the “large positive impacts” is 3 points, whether raw score or scale score, and whether total score or subscore on the Iowa Test of Basic Skills. The “I3 program model” was an investment of $4.3 billion. No one involved in the evaluation expects RR to be sustainable when the grant financing ends–irrespective of the “impact.”

    The conclusion I’d draw: “Beware of statistical geeks bearing effect sizes.”

    • Thank you very much for that information. I am stunned by the sums involved! The closer I look at RR, the shadier the ‘research’ gets. I can only wonder if there is a strong sympathy of philosophy between government purse-holders and the whole language roots of RR.

  2. 3rsplus says:

    Reading Recovery (Trademarked) is not just an “instructional programme.” It’s a multinational academic-corporate enterprise that benefits participating individuals and institutions. In the US it has two journals, runs an annual national conference, hires professional lobbyists, etc. All financed by taxpayer money–with benefits to students amounting to 3 items on a “fill in the bubbles” test.

    Look at the news cycle for the “Randomized Control Trial” at issue here. The news reported at SREE in March will be reported in a Technical Report later this year and will be picked up by the specialized ed press as “large positive impact.” The news will enter the “reputable journal” literature one or two years later, but by that time RR will have changed its rhetoric to be in line with the ed bumper-sticker slogans of the day, consistent with what it has learned how to do throughout its history.

    However… Many other events will be transpiring in this projected time period.
