High stakes in adverse conditions

Simon Lebus explores the effects of adverse conditions in high stakes exams and discusses whether we could or indeed should try to counter them.

An intriguing report by Professor Victor Lavy and colleagues from the University of Warwick's Centre for Competitive Advantage in the Global Economy (CAGE), recently published by the Social Market Foundation, examines the impact of ambient air pollution on student performance in Israeli Matriculation Exams and identifies a correlation between the two. The analysis is designed to demonstrate the impact of random factors - which could just as well be something like migraine, hay fever or a bad night's sleep - on exam performance, and goes on from this to conclude that public exam results are an inefficient basis for the allocation of scarce progression opportunities such as good university places and high-status, high-reward jobs.

Relating this to A Levels and GCSEs, it is worth noting that these exams are made up of multiple papers at least in part precisely to compensate for the impact of such random factors. However, this raises the interesting question of whether it is meaningful to suggest that the 'fairness' of an exam (in any event always a form of sampling) is compromised by random external factors any more than by the peculiarities of its own design.

To illustrate the point, I remember several years ago receiving a letter from an aggrieved private candidate for GCSE Latin (she was a grandmother who had taken up Latin out of solidarity with her grandson, who was studying the subject at school). She wrote to complain that she had been fully capable of doing the translation in her Latin unseen paper but would have needed substantially longer than the allocated 45 minutes to complete it. Her complaint was that the artificial constraint on the time available meant that she was unable to demonstrate properly what she was capable of. Was she right? At one level, it may indeed be true that she could have performed the task flawlessly given unlimited time, but this ignores the reality that part of the purpose of the test is to determine the facility with which a candidate can complete it, and also to rank order the candidates in order to grade them. So, while admiring her spirit and her grandmotherly commitment, I ended up writing back to say that I thought her story demonstrated that the exam had operated precisely as it should have done, because what she interpreted as a random time constraint actually allowed the examiner to draw valid conclusions about her level of fluency.

The pollution the report draws attention to is obviously an element of extraneous rather than design randomness, but part of the specialist expertise involved in developing high stakes assessment resides in the systematic work we do to make the playing field as level as possible, an important element of which comprises subsequent analysis to verify that our exams possess reasonable predictive validity. If you really want variation to play a role, then go with 'local' assessment in schools – where a much greater range of variables is likely to apply, ranging from the physical environment of the classroom (noise, light, discipline and so on) to the nature of the tasks, the facilities, the 'fairness' of the teacher's judgement, possible support from the teacher and so on. Remedies to exclude such variation range from the expensive to the draconian to the impossible. If we were to try to mitigate every external variable (and not just pollution), we would end up chasing myriad effects and influences, with no necessary improvement in the predictive validity of qualifications – which currently are at levels which enjoy public confidence.

Despite legitimate questions about the design and role of high stakes assessment, it is important to recognise that trying to design assessments around the principle of excluding every element of randomness would, perhaps counter-intuitively, be likely to introduce even more randomness, with a consequent adverse impact on both equity and attainment.

Simon Lebus
Group Chief Executive, Cambridge Assessment