Research Matters 07

  • Research Matters 7 - Foreword

    Oates, T. (2009). Foreword. Research Matters: A Cambridge Assessment publication, 7, 1.

While the contributions in this issue each provide insights into diverse matters in assessment, one article in particular warrants special attention. Gray and Shaw’s contribution on the ‘de-mystification’ of the Uniform Mark Scale (UMS) is a valuable example of researchers ‘drawing breath and taking stock’ of processes and concepts located deep within the procedures for administering contemporary public examinations.

    Download

  • Research Matters 7 - Editorial

    Green, S. (2009). Editorial. Research Matters: A Cambridge Assessment publication, 7, 1.

    Most of the articles in this issue report on research that was presented at the British Educational Research Association (BERA) and/or the International Association for Educational Assessment (IAEA) conferences during the autumn of 2008.

    Download

  • Keynote presentations to the International Association for Educational Assessment (IAEA) 2008 Annual Conference

    Green, S. (2009). Keynote presentations to the International Association for Educational Assessment (IAEA) 2008 Annual Conference. Research Matters: A Cambridge Assessment publication, 7, 2-3.

    The 34th IAEA annual conference, hosted by Cambridge Assessment, took place in Robinson College, University of Cambridge from September 7th to 12th. The main conference theme was 'Re-interpreting Assessment: Society, Measurement and Meaning'. The highlights of the event were the two keynote presentations by Professor Robert Mislevy ('Some implications of expertise research for educational assessment') and Professor Dylan Wiliam ('What do you know when you know the test results? The meanings of educational assessments').  This article briefly summarises both presentations.

    Download

  • Grading examinations using expert judgements from a diverse pool of judges

    Raikes, N., Scorey, S. and Schiell, H. (2009). Grading examinations using expert judgements from a diverse pool of judges. Research Matters: A Cambridge Assessment publication, 7, 4-8. 

    In normal procedures for grading GCE Advanced level and GCSE examinations, an Awarding Committee of senior examiners recommends grade boundary marks based on their judgement of the quality of scripts, informed by technical and statistical evidence.  The aim of our research was to investigate whether an adapted Thurstone Pairs methodology (see Bramley and Black, 2008; Bramley, Gill and Black, 2008) could enable a more diverse range of judges to take part.  The key advantage of the Thurstone method for our purposes is that it enables two examinations to be equated via judges making direct comparisons of scripts from both examinations, and does not depend on the judges’ internal conceptions of the standard required for any grade.

A General Certificate of Education (GCE) Advanced Subsidiary (AS) unit in biology provided the context for the study reported here.  The June 2007 and January 2008 examinations from this unit were equated using paired comparison data from the following four groups of judges: members of the existing Awarding Committee; other examiners who had marked the scripts operationally; teachers who had taught candidates for the examinations but not marked them; and university lecturers who teach biology to first year undergraduates.

    We found very high levels of intra-group and inter-group reliability for the scales and measures estimated from all four groups’ judgements.  When boundary marks for January 2008 were estimated from the equated June 2007 boundaries, there was considerable agreement between the estimates made from each group’s data.  Indeed for four of the boundaries (grades B, C, D and E), the estimates from the Awarders’, examiners’ and lecturers’ data were no more than one mark apart, and none of the estimates were more than three marks apart. 

We concluded that the examiners, teachers, lecturers and members of the current Awarding Committee made very similar judgements, and members of all four groups could take part in a paired comparison exercise for setting grade boundaries without compromising reliability.
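The article itself reports no code, but the heart of such a paired-comparison exercise can be sketched. The following is an illustrative sketch only, not the authors’ actual analysis: it fits a Bradley-Terry model, a close relative of the Thurstone paired-comparison model the study adapted, to judges’ script-versus-script decisions, placing all scripts on one latent scale from which two examinations could be equated. The script labels and win counts in the usage example are invented.

```python
# Illustrative sketch (not the study's code): estimate latent script
# "quality" strengths from judges' pairwise comparisons.
from collections import defaultdict

def bradley_terry(comparisons, iters=200):
    """comparisons: list of (winner, loser) script pairs from judges.
    Returns a dict of strength parameters, normalised to sum to the
    number of scripts; higher strength means judged better quality."""
    items = {i for pair in comparisons for i in pair}
    wins = defaultdict(int)   # total wins per script
    n = defaultdict(int)      # number of comparisons per unordered pair
    for w, l in comparisons:
        wins[w] += 1
        n[frozenset((w, l))] += 1
    p = {i: 1.0 for i in items}
    for _ in range(iters):
        # Zermelo/minorise-maximise update for Bradley-Terry strengths
        new = {}
        for i in items:
            denom = sum(n[frozenset((i, j))] / (p[i] + p[j])
                        for j in items if j != i and n[frozenset((i, j))])
            new[i] = wins[i] / denom if denom else p[i]
        s = sum(new.values())
        p = {i: len(items) * v / s for i, v in new.items()}
    return p

# Invented toy data: script A usually beats B, B usually beats C.
comps = ([("A", "B")] * 3 + [("B", "A")] +
         [("B", "C")] * 3 + [("C", "B")] +
         [("A", "C")] * 3 + [("C", "A")])
strengths = bradley_terry(comps)
```

Because judges compare scripts directly, scripts from both examination sessions can enter the same comparison pool and emerge on one common scale, which is the property the abstract highlights: no judge needs an internal conception of the grade standard.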

    Download

  • Using ‘thinking aloud’ to investigate judgements about A-level standards: Does verbalising thoughts result in different decisions?

    Greatorex, J. and Nádas, R. (2009). Using ‘thinking aloud’ to investigate judgements about A-level standards: Does verbalising thoughts result in different decisions? Research Matters: A Cambridge Assessment publication, 7, 8-16.

The ‘think aloud’ method entails people verbalising their thoughts while they do tasks, resulting in ‘verbal protocols’. The verbal protocols are analysed by researchers to identify the cognitive strategies and processes as well as the factors that affect decision making. Verbal protocols have been widely used to study decisions in educational assessment. The main methodological concern about using verbal protocols is whether thinking aloud compromises ecological validity (the authenticity of the thought processes) and thus the decision outcomes. Researchers have investigated to what extent verbalising affected the thinking processes under investigation in a variety of settings. Currently, the research literature is generally inconclusive; most results show only longer performance times, with no change in task outcomes.

    Previous research on marking collected decision outcomes from two conditions:
    1. marking silently;
    2. marking whilst thinking aloud.
    The mark to re-mark differences were the same in the two conditions. However, it is important to confirm whether verbalising affects decisions about grading standards. Therefore, our main aim was to compare the outcomes of senior examiners making decisions about grading standards silently as opposed to whilst thinking aloud. Our article draws from a wider project taking three approaches to grading.

    In experimental conditions, senior examiners made decisions about A-level grading standards for a science examination both silently and whilst thinking aloud. Three approaches to grading were used in the experiment. All scripts included in the research had achieved a grade A or B in the live examination. The decisions from the silent and verbalising conditions were statistically compared.

    Our interim findings suggest that verbalising made little difference to the participants’ decisions; this is in line with previous research in other contexts. The findings reassure us that the verbal protocols are a useful method for research about decision making in both marking and grading.

    Download

  • Can emotional and social abilities predict differences in attainment at secondary school?

    Vidal Rodeiro, C.L., Bell, J.F. and Emery, J.L. (2009). Can emotional and social abilities predict differences in attainment at secondary school? Research Matters: A Cambridge Assessment publication, 7, 17-23.

    Trait emotional intelligence (trait EI) covers a wide range of self-perceived skills and personality dispositions such as motivation, confidence, optimism, peer relations and coping with stress. In recent years, there has been a growing awareness that social and emotional factors play an important part in students’ academic success and it has been claimed that those with high scores on a trait EI measure perform better. This research investigated whether scores on a questionnaire measure of trait EI were related to school performance in a sample of British pupils.

    Trait EI was measured with the Trait Emotional Intelligence Questionnaire. Participants completed the questionnaire prior to the June 2007 examination session and their responses were matched to their Key Stage 3 and GCSE results.

    The research showed that some aspects of trait EI (motivation and low impulsivity) as well as total trait EI were significant predictors of attainment in GCSE subjects after controlling for prior attainment at school.
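The phrase “after controlling for prior attainment” describes a nested-model comparison that can be sketched briefly. This is an illustrative sketch with invented data, not the study’s analysis or its questionnaire: it asks whether a trait EI score adds predictive power for GCSE attainment over and above a prior (Key Stage 3) attainment measure, by comparing the fit of two least-squares models.

```python
# Illustrative sketch (invented data): does trait EI predict attainment
# over and above prior attainment? Compare nested least-squares fits.
import numpy as np

rng = np.random.default_rng(0)
n = 500
ks3 = rng.normal(5, 1, n)                      # prior attainment score
ei = rng.normal(0, 1, n)                       # trait EI score
# Simulated GCSE outcome with a genuine (invented) trait EI effect
gcse = 0.8 * ks3 + 0.3 * ei + rng.normal(0, 1, n)

def r_squared(predictors, y):
    """Proportion of variance in y explained by an intercept plus the
    given predictor columns, via ordinary least squares."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_prior = r_squared([ks3], gcse)              # prior attainment alone
r2_full = r_squared([ks3, ei], gcse)           # prior attainment + trait EI
gain = r2_full - r2_prior                      # variance uniquely added by EI
```

If the gain in explained variance is non-trivial (and, in a real analysis, statistically significant), the EI measure predicts attainment beyond what prior attainment already accounts for, which is the sense of the abstract’s claim.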

    Download

  • Assessment instruments over time

    Elliott, G., Curcin, M., Bramley, T., Ireland, J., Gill, T. and Black, B. (2009). Assessment instruments over time. Research Matters: A Cambridge Assessment publication, 7, 23-25.

    As Cambridge Assessment celebrated its 150th anniversary in 2008 members of the Evaluation and Psychometrics Team looked back at question papers over the years. Details of the question papers and examples of questions were used to illustrate the development of seven subjects: Mathematics, Physics, Geography, Art, French, Cookery and English Literature. Two clear themes emerged from the work across most subjects - an increasing emphasis on real-world contexts in more recent years and an increasing choice of topic areas and question/component options available to candidates.

    Download

  • All the right letters – just not necessarily in the right order. Spelling errors in a sample of GCSE English scripts

    Elliott, G. and Johnson, N. (2009). All the right letters – just not necessarily in the right order. Spelling errors in a sample of GCSE English scripts. Research Matters: A Cambridge Assessment publication, 7, 26-31.

For the past ten years, Cambridge Assessment has been running a series of investigations into features of GCSE English candidates’ writing – the Aspects of Writing study (Massey et al., 1996; Massey et al., 2005). The studies have sampled a fragment of writing taken from the narrative writing of 30 boys and 30 girls at every grade at GCSE. Features investigated have included the correct and incorrect use of various forms of punctuation, sophistication of vocabulary, non-standard English, sentence types and the frequency of spelling errors. This paper provides a more detailed analysis of the nature of the spelling errors identified in the sample of work obtained for the Aspects of Writing project from unit 3 (Literary Heritage and Imaginative Writing) of the 2004 OCR GCSE examination in English. Are there certain types of spelling error which occur more frequently than others? Do particular words crop up over and over again? How many errors relate to well-known spelling rules, such as “I before E except after C”?

The study identified 345 spelling errors in 11,730 words written, and these were reported in Massey et al. (2005), with a comparison by grade with samples of writing from 1980, 1993 and 1994. It was shown that the considerable decline in spelling in the early 1990s (compared with 1980) had been halted and, at the lower grades, that spelling had improved.

Since then, we have conducted a detailed analysis of the 345 misspelled words to see if there is evidence of particular types of error. Each misspelling has been categorised, and five broad types of error identified. These are sound-based errors; rules-based errors; errors of commission, omission and transposition; writing errors; and multiple errors. This paper will present a detailed examination of the misspellings and the process of developing the categorisation system used. A number of words – woman, were, where, watch(ing), too, and the homophones there/their and knew/new – are identified as being the most frequently misspelled words. Implications of the findings for teaching and literacy policy are discussed.

    Download

  • Statistical Reports

The Statistics Team (2009). Statistics Reports. Research Matters: A Cambridge Assessment publication, 7, 31.

    The ongoing Statistics Reports Series provides statistical summaries of various aspects of the English examination system, such as trends in pupil uptake and attainment, qualifications choice, subject combinations and subject provision at school. This article contains a summary of the most recent additions to this series.

    Download

  • De-mystifying the role of the uniform mark in assessment practice: concepts, confusions and challenges

    Gray, E. and Shaw, S. (2009). De-mystifying the role of the uniform mark in assessment practice: concepts, confusions and challenges. Research Matters: A Cambridge Assessment publication, 7, 32-37.

The search for an adequate conceptualisation of the Uniform Mark Scale (UMS) is a challenging one, and it is clear that there is a need to broaden current discussions of the issues involved. This article marks an attempt to demystify the UMS: its conception and its operation. Although the article assumes a basic appreciation of the terminology and processes associated with the examination system, it explicates, through a number of case study scenarios, the contexts in which it is appropriate to employ UMS, describes the computations arising from different specifications and assessment scenarios, and addresses some of the potential challenges posed by the calculation of grades for unitised specifications. A specification here refers to a comprehensive description of a qualification and includes both obligatory and optional features: content, and any performance requirements. If a specification is unitised, the constituent units can be separately delivered, assessed and certificated. Having a clear and well-articulated position on the underlying theory of UMS is necessary to demonstrate transparency with regard to the estimation of aggregate performance on unitised assessments and to support any claims we wish to make about the reporting process. It is hoped that the issues addressed here will make a positive contribution to the widening UMS debate (both within and beyond Cambridge Assessment) in general, and to the understanding, operation and employment of UMS in particular.
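The computation at the centre of the article can be sketched in miniature. This is an illustrative sketch only: every boundary figure below is invented, and real specifications apply further rules at the extremes of the mark range that this sketch omits. It shows the core idea of converting a raw mark to a uniform mark by linear interpolation between corresponding grade boundaries.

```python
# Illustrative sketch of a raw-to-uniform-mark conversion: piecewise-linear
# interpolation between paired grade boundaries. All values are invented.

def raw_to_ums(raw, raw_anchors, ums_anchors):
    """Map a raw mark to a uniform mark. Both anchor lists run from
    (0, 0) up to the unit maximum and must be the same length."""
    if raw <= raw_anchors[0]:
        return ums_anchors[0]
    if raw >= raw_anchors[-1]:
        return ums_anchors[-1]
    for (r0, u0), (r1, u1) in zip(zip(raw_anchors, ums_anchors),
                                  zip(raw_anchors[1:], ums_anchors[1:])):
        if r0 <= raw <= r1:
            # linear interpolation within this grade band
            return round(u0 + (u1 - u0) * (raw - r0) / (r1 - r0))

# Hypothetical 60-mark unit worth 80 uniform marks: judge-set raw
# boundaries E..A of 24, 30, 36, 42, 48 paired with fixed uniform-mark
# boundaries of 32, 40, 48, 56, 64 (i.e. 40%..80% of the UMS maximum).
raw_anchors = [0, 24, 30, 36, 42, 48, 60]
ums_anchors = [0, 32, 40, 48, 56, 64, 80]
ums = raw_to_ums(39, raw_anchors, ums_anchors)  # midway between the C and B boundaries
```

Because the uniform-mark boundaries stay fixed from session to session while the raw boundaries move with each paper’s difficulty, uniform marks from separately sat units can be aggregated on a common scale, which is what makes unitised certification workable.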

    Download

  • The CIE Research Agenda

Shaw, S. (2009). The CIE Research Agenda. Research Matters: A Cambridge Assessment publication, 7, 37-39.

    A summary of the activities and research of CIE's research team.

    Download

  • Research News

    The Research Division (2009). Research News. Research Matters: A Cambridge Assessment publication, 7, 40.

    A summary of recent conferences and seminars, and research articles published since the last issue of Research Matters.

    Download