Tony Leech

I am interested in a broad range of assessment, policy and curriculum issues. My current projects involve the analysis of assessment material, aspects of the role of digital technology in high stakes assessment, assessment design principles, and patterns of formative assessment internationally. I am also interested in debates on the future and purposes of assessment, the place of vocational and technical qualifications in learners’ programmes of study, access to higher education and social justice in education. I use various research methods, both qualitative and quantitative, and have written and presented for a variety of internal and external audiences.

Before becoming a Research Officer at ARD in 2021, I was a Research Assistant at OCR. During that time, I worked on, among other things, a large multifaceted programme of research into the usability of comparative judgement methods for awarding, and investigated core assessment models for OCR qualifications, the impact of changes to the assessment of practical science and the future landscape for OCR's post-16 vocational and technical qualifications.

I have a BA and an MPhil from the University of Cambridge, specialising in politics, and previously worked in legal market research. In my spare time, I enjoy spending time outside, political campaigning, reading speculative fiction and playing the guitar poorly.

Publications

2022

How do judges in Comparative Judgement exercises make their judgements?

Leech, T. & Chambers, L. (2022). How do judges in Comparative Judgement exercises make their judgements? Research Matters: A Cambridge University Press & Assessment publication, 33, 31–47.

Two of the central issues in comparative judgement (CJ), which are perhaps underexplored compared to questions of the method’s reliability and technical quality, are “what processes do judges use to make their decisions” and “what features do they focus on when making their decisions?” This article discusses both, in the context of CJ for standard maintaining, by reporting the results of both a study into the processes used by judges when making CJ judgements, and the outcomes of surveys of judges who have used CJ. In the first instance, using insights from observations of judges and their being asked to think aloud while they judged, we highlight the variety of processes used when making their decisions, including comparative reference, re-marking and question-by-question evaluation. We then develop a four-dimension model to explore what influences what judges attend to, and explore through survey responses the distinctive ways in which the structure of the question paper, different elements of candidate responses, judges’ own preferences and the CJ task itself affect decision-making. We conclude by discussing, in the light of these factors, whether the judgements made in CJ (or in the judgemental element of current standard maintaining procedures) are meaningfully holistic, and whether judges can properly take into account differences in difficulty between different papers.

A summary of OCR’s pilots of the use of Comparative Judgement in setting grade boundaries

Benton, T., Gill, T., Hughes, S., & Leech, T. (2022). A summary of OCR’s pilots of the use of Comparative Judgement in setting grade boundaries. Research Matters: A Cambridge University Press & Assessment publication, 33, 10–30.

The rationale for the use of comparative judgement (CJ) to help set grade boundaries is to provide a way of using expert judgement to identify and uphold certain minimum standards of performance rather than relying purely on statistical approaches such as comparable outcomes. This article summarises the results of recent trials of using CJ for this purpose in terms of how much difference it might have made to the positions of grade boundaries, the reported precision of estimates and the amount of time that was required from expert judges.

The results show that estimated grade boundaries from a CJ approach tend to be fairly close to those that were set (using other forms of evidence) in practice. However, occasionally, CJ results displayed small but significant differences from existing boundary locations. This implies that adopting a CJ approach to awarding would have a noticeable impact on awarding decisions but not such a large one as to be implausible. This article also demonstrates that implementing CJ using simplified methods (described by Benton, Cunningham et al., 2020) achieves the same precision as alternative CJ approaches, but in less time. On average, each CJ exercise required roughly 30 judge-hours across all judges.

2020

Does comparative judgement of scripts provide an effective means of maintaining standards in mathematics?

Benton, T., Hughes, S., & Leech, T. (2020). Does comparative judgement of scripts provide an effective means of maintaining standards in mathematics? Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.

Comparing the simplified pairs method of standard maintaining to statistical equating

Benton, T., Cunningham, E., Hughes, S., & Leech, T. (2020). Comparing the simplified pairs method of standard maintaining to statistical equating. Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.

Research Matters

Research Matters is our free biannual publication, which allows us to share our assessment research, in a range of fields, with the wider assessment community.