Research Matters 04

  • Research Matters 4 Foreword

    Oates, T. (2007). Foreword. Research Matters: A Cambridge Assessment publication, 4, 1.

    As new technologies begin to emerge in assessment, it sometimes feels as if progress is being made on delivery mechanisms without commensurate development in understanding of measurement and related aspects of assessment processes.


  • Research Matters 4 Editorial

    Green, S. (2007). Editorial. Research Matters: A Cambridge Assessment publication, 4, 1.

    The main themes of this issue relate to the psychology of marking, cognitive processes affecting accuracy, and issues related to quality assurance in marking processes.


  • The ‘Marking Expertise’ projects: Empirical investigations of some popular assumptions

    Suto, I. and Nadas, R. (2007). The ‘Marking Expertise’ projects: Empirical investigations of some popular assumptions. Research Matters: A Cambridge Assessment publication, 4, 2-5.

    Recent transformations in professional marking practice, including moves to mark some examination papers on screen, have raised important questions about the demands and expertise that the marking process entails. What makes some questions harder to mark accurately than others, and how much does marking accuracy vary among individuals with different backgrounds and experiences? We are conducting a series of interrelated studies, exploring variations in accuracy and expertise in GCSE examination marking.

    In our first two linked studies, collectively known as Marking Expertise Project 1, we investigated marking on selected GCSE maths and physics questions from OCR’s June 2005 examination papers. Our next two linked studies, which comprise Marking Expertise Project 2, are currently underway and involve both CIE and OCR examinations. This time we are focussing on International (I) GCSE biology questions from November 2005 and GCSE business studies questions from June 2006. All four studies sit within a conceptual framework in which we have proposed a number of factors that might contribute to accurate marking. For any particular GCSE examination question, accuracy can be maximised through increasing the marker’s personal expertise and/or through decreasing the demands of the marking task, and most relevant factors can be grouped according to which of these two routes they contribute to. In this article, we present a summary of some key aspects and findings of the two studies comprising our first project. We end by looking ahead to our second project on marking expertise, which is currently in progress.


  • Did examiners' marking strategies change as they marked more scripts?

    Greatorex, J. (2007). Did examiners' marking strategies change as they marked more scripts? Research Matters: A Cambridge Assessment publication, 4, 6-13.

    Prior research used cognitive psychological theories to predict that examiners might begin marking a question using particular cognitive strategies but later in the marking session they might use different cognitive strategies. Specifically, it was predicted that when examiners are familiar with the question paper, mark scheme and candidates’ responses they:

    •    use less ‘evaluating’ and ‘scrutinising’
    •    use more ‘matching’

    This research tests these predictions. All Principal Examiners (n=5), Team Leaders (n=5) and Assistant Examiners (n=59) who marked in the winter 2005 session were sent a questionnaire. The questionnaire asked about different occasions in the marking session. It was found that sometimes examiners’ marking strategies changed as the examiners marked more scripts. When there were considerable changes in cognitive strategies, these were mostly in the predicted direction.


  • Researching the judgement processes involved in A-level marking

    Crisp, V. (2007). Researching the judgement processes involved in A-level marking. Research Matters: A Cambridge Assessment publication, 4, 13-18.

    The marking of examination scripts by examiners is a key part of the assessment process in many assessment systems. Despite this, there has been relatively little work investigating the process of marking at a cognitive and socially framed level. An improved understanding of the judgement processes underlying current assessment systems would leave us better prepared to anticipate the likely effects of innovations in examining systems, such as moves to on-screen marking.

    An AS level and an A2 level geography exam paper were selected. Six experienced examiners who usually mark at least one of the two papers participated in the research. Examiners marked fifty scripts from each exam at home, with the marking of the first ten scripts for each paper reviewed by the relevant Principal Examiner. This reflected normal marking procedures as far as possible. Examiners later attended meetings individually where, for each exam, they marked four or five scripts in silence and four to six scripts whilst thinking aloud, and were also interviewed.

    The findings of this research support the view that assessment involves processes of actively constructing meaning from texts as well as involving cognitive processes. The idea of examining as a practice that occurs within a social framework is supported by the evidence of some social, personal and affective responses. Aspects of markers’ social histories as examiners and teachers were evident in the comparisons that they made and perhaps more implicitly in their evaluations. The overlap of these findings with aspects of various previous findings helps to validate both current and previous research, thus aiding the continued development of an improved understanding of the judgement processes involved in marking.


  • Quality control of examination marking

    Bell, J. F., Bramley, T., Claessen, M. J. A. and Raikes, N. (2007). Quality control of examination marking. Research Matters: A Cambridge Assessment publication, 4, 18-21.

    As markers trade their pens for computers, new opportunities for monitoring and controlling marking quality are created. Item-level marks may be collected and analysed throughout marking. The results can be used to alert marking supervisors to possible quality issues earlier than is currently possible, enabling investigations and interventions to be made in a more timely and efficient way. Such a quality control system requires a mathematical model that is robust enough to provide useful information with initially relatively sparse data, yet simple enough to be easily understood, easily implemented in software and computationally efficient – this last is important given the very large numbers of candidates assessed by Cambridge Assessment and the need for rapid analysis during marking. In the present article we describe the models we have considered and give the results of an investigation into their utility using simulated data.
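The article does not specify the models the authors evaluated, but the general idea of item-level quality monitoring can be illustrated with a deliberately simplified sketch: compare each live marker's marks on a set of seed responses against gold-standard marks, and flag any marker whose mean absolute deviation exceeds a threshold. All data, names and the threshold below are hypothetical.

```python
# Hypothetical gold-standard marks on ten seed responses, and two live
# markers' marks on the same responses, collected during the session.
gold = [3, 5, 2, 4, 1, 0, 5, 3, 2, 4]
markers = {
    "marker_1": [3, 5, 2, 4, 1, 0, 5, 4, 2, 4],  # one small slip
    "marker_2": [4, 5, 3, 5, 2, 1, 5, 4, 3, 5],  # consistently lenient
}

THRESHOLD = 0.5  # flag when mean absolute deviation from gold exceeds this

for name, marks in markers.items():
    # Average size of the discrepancy between this marker and the gold marks.
    mad = sum(abs(g - m) for g, m in zip(gold, marks)) / len(gold)
    status = "flag for supervisor review" if mad > THRESHOLD else "ok"
    print(f"{name}: mean abs deviation = {mad:.2f} -> {status}")
```

A production system would of course need to handle sparse early data and many items per marker, which is precisely the robustness issue the article raises.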


  • Quantifying marker agreement: terminology, statistics and issues

    Bramley, T. (2007). Quantifying marker agreement: terminology, statistics and issues. Research Matters: A Cambridge Assessment publication, 4, 22-28.

    One challenge facing assessment agencies is in choosing the appropriate statistical indicators of marker agreement for communicating to different audiences. This task is not made easier by the wide variety of terminology in use, and differences in how the same terms are sometimes used. The purpose of this article is to provide a brief overview of: i) the different terminology used to describe indicators of marker agreement; ii) some of the different statistics which are used and; iii) the issues involved in choosing an appropriate indicator and its associated statistic. It is hoped that this will clarify some ambiguities which are often encountered, and contribute to a more consistent approach in reporting research in this area.
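As a concrete illustration of how different agreement indicators can tell different stories, the sketch below computes three common statistics for two markers on the same scripts: exact agreement, mean absolute difference, and the Pearson correlation (which ignores any constant severity difference). The marks are hypothetical and the choice of statistics is illustrative, not drawn from the article.

```python
# Hypothetical marks from two markers on the same ten scripts.
marker_a = [4, 7, 5, 9, 6, 3, 8, 5, 7, 6]
marker_b = [4, 6, 5, 9, 7, 3, 8, 4, 7, 6]

n = len(marker_a)

# Exact agreement: proportion of scripts awarded identical marks.
exact = sum(a == b for a, b in zip(marker_a, marker_b)) / n

# Mean absolute difference: average size of the discrepancy in marks.
mad = sum(abs(a - b) for a, b in zip(marker_a, marker_b)) / n

# Pearson correlation: how consistently the two markers rank-order
# candidates, regardless of any constant severity difference.
mean_a = sum(marker_a) / n
mean_b = sum(marker_b) / n
cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(marker_a, marker_b))
var_a = sum((a - mean_a) ** 2 for a in marker_a)
var_b = sum((b - mean_b) ** 2 for b in marker_b)
r = cov / (var_a * var_b) ** 0.5

print(f"exact agreement = {exact:.2f}")   # 0.70
print(f"mean abs diff   = {mad:.2f}")     # 0.30
print(f"correlation     = {r:.3f}")       # 0.955
```

Note how the correlation is high even though markers disagree on three scripts out of ten: which indicator is "appropriate" depends on the audience and purpose, which is the article's central point.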


  • Agreement between outcomes from different double marking models

    Vidal Rodeiro, C. L. (2007). Agreement between outcomes from different double marking models. Research Matters: A Cambridge Assessment publication, 4, 28-34.

    In the context of marking examinations, double marking is a means of enhancing reliability. However, deciding whether it is worthwhile presents a dilemma: intuitively, it increases the reliability of the assessment and demonstrates fairness in marking, but its benefit must be established to justify the additional time and effort it takes. One factor affecting re-marking is whether or not the second marker is aware of the marks awarded by the first. Higher agreement is observed between two examiners when the second knows how, and perhaps why, the first marked an exam. This may suggest that the second examiner took advantage of the annotations available when trying to judge the best mark for a candidate; an alternative perspective is that the second examiner was simply influenced by the first examiner’s marks.

    The purpose of this research is to evaluate the extent to which examiners agree when using different double marking models, in particular, blind and annotated double marking. The impact of examiner experience is also investigated.


  • Item-level examiner agreement

    Raikes, N. and Massey, A. (2007). Item-level examiner agreement. Research Matters: A Cambridge Assessment publication, 4, 34-37.

    Studies of inter-examiner reliability in GCSE and A-level examinations have been reported in the literature, but typically these focused on paper totals, rather than item marks. See, for example, Newton (1996). Advances in technology, however, mean that increasingly candidates’ scripts are being split by item for marking, and the item-level marks are routinely collected. In these circumstances there is increased interest in investigating the extent to which different examiners agree at item level, and the extent to which this varies according to the nature of the item. Here we report and comment on intraclass correlations between examiners marking sample items taken from GCE A-level and IGCSE examinations in a range of subjects. The article is based on a paper presented at the 2006 Annual Conference of the British Educational Research Association.
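The intraclass correlation the authors report can be illustrated with a minimal one-way ICC computed by hand. The data below are hypothetical (five candidate responses each marked by three examiners), and the one-way model is chosen only for simplicity; the article does not specify which ICC variant was used.

```python
# Hypothetical item marks: rows are candidate responses, columns are examiners.
marks = [
    [3, 3, 4],
    [5, 5, 5],
    [2, 1, 2],
    [4, 4, 3],
    [1, 1, 1],
]

n = len(marks)      # number of responses (targets)
k = len(marks[0])   # number of examiners per response

grand = sum(sum(row) for row in marks) / (n * k)
row_means = [sum(row) / k for row in marks]

# Between-responses mean square: spread of the response means.
msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)

# Within-response mean square: disagreement among examiners on one response.
msw = sum((x - m) ** 2
          for row, m in zip(marks, row_means)
          for x in row) / (n * (k - 1))

# One-way intraclass correlation, ICC(1): the share of mark variance
# attributable to genuine differences between responses rather than
# examiner disagreement.
icc = (msb - msw) / (msb + (k - 1) * msw)
print(f"ICC(1) = {icc:.3f}")
```

High values indicate that most variation comes from the responses themselves; low values on a particular item would prompt the kind of item-by-item scrutiny the article describes.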


  • Fostering communities of practice in examining

    Watts, A. (2007). Fostering communities of practice in examining. Research Matters: A Cambridge Assessment publication, 4, 37-38.

    This is a shortened version of a paper given at the International Association for Educational Assessment (IAEA) conference in May 2006.


  • Research News

    The Research Division (2007). Research News. Research Matters: A Cambridge Assessment publication, 4, 39.

    A summary of recent conferences and seminars, and research articles published since the last issue of Research Matters.