Tom Bramley

I joined Cambridge Assessment’s Research Division in 1995, and since then have worked on projects covering most aspects of the assessment process, such as understanding the factors that make exam questions more or less difficult, and the features of mark schemes that make exam questions easier or harder to mark accurately. Much of my work has involved investigating the role that expert judgment and statistical information can play in mapping grading standards from one exam to another.

My current research interests include the application of Comparative Judgment methods to assessment, and trying to exploit item level data from past exams to help set grade boundaries on new exams.

I hold an MA in Experimental Psychology from the University of Oxford and an MSc in Operational Research from Lancaster University.

Outside of work I enjoy chess, tennis, gardening, and playing the piano.

Publications

2018

The effect of adaptivity on the reliability coefficient in adaptive comparative judgement

Bramley, T. and Vitello, S. (2018). The effect of adaptivity on the reliability coefficient in adaptive comparative judgement. Assessment in Education: Principles, Policy & Practice (ahead of print).

2017

The effect of adaptivity on the reliability coefficient in comparative judgement
Vitello, S. and Bramley, T. (2017). Presented at the annual conference of the Association for Educational Assessment - Europe, Prague, 9-11 November 2017.
Comparing small-sample equating with Angoff judgment for linking cut-scores on two tests
Bramley, T. and Benton, T. (2017). Presented at the 18th annual AEA-Europe conference, Prague, 9-11 November 2017.
Some thoughts on the ‘Comparative Progression Analysis’ method for investigating inter-subject comparability
Benton, T. and Bramley, T. (2017). Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.
Some implications of choice of tiering model in GCSE mathematics for inferences about what students know and can do
Bramley, T. (2017). Some implications of choice of tiering model in GCSE mathematics for inferences about what students know and can do. Research in Mathematics Education, 19(2), 163-179.
Handbook of test development - review of Section 2
Bramley, T. (2017). Handbook of test development - review of Section 2. Assessment in Education: Principles, Policy & Practice (ahead of print).
Spoilt for choice? Issues around the use and comparability of optional exam questions
Bramley, T. and Crisp, V. (2017). Spoilt for choice? Issues around the use and comparability of optional exam questions. Assessment in Education: Principles, Policy & Practice (ahead of print).

2016

Investigating experts' perceptions of examination question demand
Bramley, T. (2016). Paper presented at the AEA-Europe annual conference, Limassol, Cyprus, 3-5 November 2016.
The effect of subject choice on the apparent relative difficulty of different subjects

Bramley, T. (2016). The effect of subject choice on the apparent relative difficulty of different subjects. Research Matters: A Cambridge Assessment publication, 22, 23-26.

Periodically there is interest in whether some GCSE and A level subjects are more ‘difficult’ than others.  Because students choose which subjects they take from a large pool of possible subjects, the matrix of data to be analysed contains a large amount of non-random missing data – the grades of students in subjects that they did not take.  This makes the calculation of statistical measures of relative subject difficulty somewhat problematic.  It is also likely to make subjects that measure something different to the majority of other subjects appear easier.  These two claims are illustrated in this article with a simple example using simulated data.
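
The flavour of this argument can be sketched in a few lines of code. The simulation below is not the one in the article: the trait structure, the self-selection rule and all parameter values are invented for illustration. It simply shows that once students select the subjects they are better at, observed subject means no longer reflect full-cohort difficulty.

```python
# Illustrative only: how non-random missingness from subject choice can
# distort naive difficulty statistics. All names and values are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

general = rng.normal(size=n)    # trait measured by most subjects
specific = rng.normal(size=n)   # distinct trait measured only by subject C

# True scores: A and B load on the general trait, C on the specific one.
score = {
    "A": general + rng.normal(scale=0.5, size=n),
    "B": general + rng.normal(scale=0.5, size=n),
    "C": specific + rng.normal(scale=0.5, size=n),
}

# Crude self-selection: a student "takes" a subject only if their true
# score in it is at or above their personal median across the subjects.
personal_median = np.median(np.column_stack(list(score.values())), axis=1)
taken = {s: score[s] >= personal_median for s in score}

for s in score:
    print(f"{s}: full-cohort mean {score[s].mean():+.3f}, "
          f"observed mean {score[s][taken[s]].mean():+.3f}")
```

Under this toy rule every observed mean exceeds its full-cohort value, so any measure of relative difficulty computed from the observed grades alone inherits the selection bias.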

Maintaining test standards by expert judgement of item difficulty

Bramley, T. and Wilson, F. (2016). Maintaining test standards by expert judgement of item difficulty. Research Matters: A Cambridge Assessment publication, 21, 48-54.

This article describes two methods for using expert judgments about test items to arrive at a cut-score (grade boundary) on a new test where none of the items has been pre-tested.  The first method required experts to estimate the mean score on the new items from examinees at the cut-score, basing their judgments on statistics from items judged to be similar on previous tests.  The second method only required them to identify previous items that they deemed effectively identical in terms of difficulty.  Both methods were applied to an AS Chemistry unit.  Both methods gave results close to the actual cut-scores, but with the first method this may have been fortuitous since there were quite large differences between the judges’ individual results.  The results from the second method were quite stable when the criteria for defining effectively identical items were varied, suggesting this method may be more suitable in practice.
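
As a concrete (and entirely hypothetical) illustration of the second method: if each new item is matched to an old item judged effectively identical in difficulty, a cut-score estimate is simply the sum of the matched items' mean marks among examinees at the old boundary. The item identifiers and statistics below are invented; the study's actual procedure and data were richer than this.

```python
# Hypothetical sketch of the "effectively identical items" idea.
# Map: new item id -> mean mark that examinees at the old cut-score
# achieved on the old item judged identical in difficulty (invented values).
matched_boundary_means = {
    "Q1": 0.85,   # 1-mark item, easy at the boundary
    "Q2": 0.40,   # 1-mark item, hard at the boundary
    "Q3": 1.60,   # 2-mark item
    "Q4": 0.95,   # 1-mark item
}

cut_score = sum(matched_boundary_means.values())
print(f"Estimated boundary mark: {cut_score:.2f} (rounded: {round(cut_score)})")
```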

2015

The reliability of Adaptive Comparative Judgment
Bramley, T. and Wheadon, C. (2015). Paper presented at the AEA-Europe annual conference, Glasgow, Scotland, 4-7 November 2015.
Maintaining standards by expert judgment of question difficulty
Bramley, T. and Wilson, F. (2015). Paper presented at the AEA-Europe annual conference, Glasgow, Scotland, 4-7 November 2015.
Gender differences in GCSE
Bramley, T., Vidal Rodeiro, C.L. and Vitello, S. (2015). Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.
Volatility in exam results
Bramley, T. and Benton, T. (2015). Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.
The use of evidence in setting and maintaining standards in GCSEs and A levels

Benton, T. and Bramley, T. (2015). Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.

2014

Evaluating the adjacent levels model for differentiated assessment
Bramley, T. (2014). Paper presented at the AEA-Europe annual conference, Tallinn, Estonia, 5-8 November 2014.
On the limits of linking: experiences from England
Bramley, T., Dawson, A. and Newton, P. (2014). Paper presented at the 76th annual meeting of the National Council on Measurement in Education (NCME), Philadelphia, PA, 2-6 April 2014.
Using statistical equating for standard maintaining in GCSEs and A levels
Bramley, T. and Vidal Rodeiro, C.L. (2014). Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.

2013

Prediction matrices, choice and grade inflation
Bramley, T. (2013). Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.
Maintaining standards in public examinations: why it is impossible to please everyone
Bramley, T. (2013). Paper presented at the 15th biennial conference of the European Association for Research in Learning and Instruction (EARLI), Munich, Germany, 27-31 August 2013.
How accurate are examiners’ holistic judgements of script quality?
Gill, T. and Bramley, T. (2013). How accurate are examiners’ holistic judgements of script quality? Assessment in Education: Principles, Policy & Practice, 20(3), 308-324.
Problems in estimating composite reliability of ‘unitised’ assessments
Bramley, T. and Dhawan, V. (2013). Research Papers in Education, 28(1), 43-56.

2012

Measurement and Construct need to be clarified first.
Bramley, T. (2012). Commentary on Newton, P.E., Clarifying the consensus definition of validity. Measurement: Interdisciplinary Research and Perspectives, 10(1-2), 42-45.
What if the grade boundaries on all A level examinations were set at a fixed proportion of the total mark?
Bramley, T. (2012). Paper presented at the Maintaining Examination Standards seminar, London, 28 March 2012.

The effect of manipulating features of examinees' scripts on their perceived quality

Bramley, T. (2012). The effect of manipulating features of examinees' scripts on their perceived quality. Research Matters: A Cambridge Assessment publication, 13, 18-26.

Expert judgment of the quality of examinees’ work plays an important part in standard setting, standard maintaining, and monitoring of comparability.  In order to understand and validate methods that rely on expert judgment, it is necessary to know what features of examinees’ work influence the experts’ judgments.  The controlled experiment reported here investigated the effect of changing four features of scripts from a GCSE Chemistry examination: i) the quality of written English; ii) the proportion of missing as opposed to incorrect responses; iii) the profile of marks in terms of fit to the Rasch model; and iv) the proportion of marks gained on the subset of questions testing 'good Chemistry'.  Expert judges ranked scripts in terms of perceived quality.  There were two versions of each script, an original version and a manipulated version (with the same total mark) where one of the four features had been altered.  The largest effect was obtained by a combination of iii) and iv): increasing the proportion of marks gained on ‘good Chemistry’ items, and increasing the number of correct answers to difficult questions at the expense of wrong answers to easy questions. The implications of the findings for operational standard maintaining procedures are discussed.
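
To make manipulation iii) concrete: under the Rasch model, two scripts with the same total mark can fit the model very differently depending on which items the marks come from. The sketch below is an assumption-laden illustration, not anything from the study; the item difficulties are made up and a crude mean-square outfit statistic stands in for a proper fit analysis.

```python
# Two dichotomous score profiles with the same total mark (5/10), compared
# by a simple Rasch outfit statistic. Difficulties and ability are assumed.
import numpy as np

difficulties = np.linspace(-2, 2, 10)   # item difficulties in logits
theta = 0.0                             # ability consistent with a 5/10 score

def outfit(responses, b, theta):
    p = 1 / (1 + np.exp(-(theta - b)))          # Rasch P(correct)
    z2 = (responses - p) ** 2 / (p * (1 - p))   # squared standardised residuals
    return z2.mean()

guttman = np.zeros(10); guttman[:5] = 1      # correct on the 5 easiest items
reversed_ = np.zeros(10); reversed_[5:] = 1  # correct on the 5 hardest items

print(f"outfit, easiest items correct: {outfit(guttman, difficulties, theta):.2f}")
print(f"outfit, hardest items correct: {outfit(reversed_, difficulties, theta):.2f}")
```

The second profile, with its correct answers to difficult questions at the expense of wrong answers to easy ones, produces a much larger outfit value despite the identical total mark.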

2011

Investigating and reporting information about marker reliability in high-stakes external school examinations
Bramley, T. and Dhawan, V. (2011). Abstract of presentation at the annual European Conference on Educational Research (ECER), Berlin, Germany, September 2011.
Estimates of reliability at qualification level for GCSE and A level examinations
Bramley, T. and Dhawan, V. (2011). Paper presented at the British Educational Research Association annual conference, University of London Institute of Education, September 2011.
The interrelations of features of questions, mark schemes and examinee responses and their impact upon marker agreement
Black, B., Suto, I. and Bramley, T. (2011). Assessment in Education: Principles, Policy & Practice (Special Issue), 18(3), 295-318.
The effect of changing component grade boundaries on the assessment outcome in GCSEs and A levels

Bramley, T. and Dhawan, V. (2011). The effect of changing component grade boundaries on the assessment outcome in GCSEs and A levels. Research Matters: A Cambridge Assessment publication, 12, 13-18.

GCSE and A level assessments are graded examinations, where grade boundaries are set on the raw mark scale of each of the units/components comprising the assessment. These boundaries are then aggregated in a particular way depending on the type of assessment to produce the overall grades for the assessment. This article reports a simple 'sensitivity analysis' determining the effect on assessment grade boundaries of varying the (judgementally set) key grade boundaries on the units/components by ±1 mark. Two assessments with different structures were used - a tiered ‘linear’ GCSE, and a 6-unit ‘modular’ A level.
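
The logic of such a sensitivity analysis is easy to sketch. The code below assumes, purely for illustration, that the aggregate boundary is the simple sum of the component boundaries; real GCSE and A level aggregation (tiering, uniform-mark conversion and so on) is more involved, and the component names and marks here are invented.

```python
# Toy +/-1 mark sensitivity analysis on judgementally set component
# boundaries, under an assumed sum-of-boundaries aggregation rule.
from itertools import product

component_boundaries = {"Paper 1": 46, "Paper 2": 52, "Coursework": 31}

base = sum(component_boundaries.values())
for deltas in product((-1, 0, 1), repeat=len(component_boundaries)):
    if not any(deltas):
        continue  # skip the unperturbed case
    perturbed = sum(b + d for b, d in zip(component_boundaries.values(), deltas))
    print(f"deltas {deltas}: aggregate boundary {perturbed} ({perturbed - base:+d})")
```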

Rank ordering and paired comparisons - the way Cambridge Assessment is using them in operational and experimental work

Bramley, T. and Oates, T. (2011). Rank ordering and paired comparisons - the way Cambridge Assessment is using them in operational and experimental work. Research Matters: A Cambridge Assessment publication, 11, 32-35.

In this article we describe the method of paired comparisons and its close relative, rank-ordering. Despite early origins, these scaling methods have been introduced into the world of assessment relatively recently, and have the potential to lead to exciting innovations in several aspects of the assessment process. Cambridge Assessment has been at the forefront of these developments and here we summarise the current ‘state of play'.
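
For readers unfamiliar with the underlying model: one common way of turning paired-comparison outcomes into a scale (not necessarily the method in Cambridge Assessment's own software) is to fit a Bradley-Terry model, for instance with the classic Zermelo/minorisation-maximisation iteration sketched below. The win matrix is invented.

```python
# Fitting a Bradley-Terry model to an invented matrix of paired-comparison
# outcomes. wins[i, j] = times object i was judged better than object j.
import numpy as np

wins = np.array([[0, 3, 4],
                 [1, 0, 3],
                 [0, 1, 0]], dtype=float)

n = wins.shape[0]
strength = np.ones(n)
pair_counts = wins + wins.T                 # comparisons per pair
for _ in range(200):                        # simple MM / Zermelo updates
    denom = (pair_counts / (strength[:, None] + strength[None, :])).sum(axis=1)
    strength = wins.sum(axis=1) / denom
    strength /= strength.sum()              # fix the scale's indeterminacy

print("estimated scale values (log-strengths):", np.round(np.log(strength), 2))
```

Rank-ordering data can be fed into the same machinery by decomposing each judged ranking into the set of pairwise outcomes it implies.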

Estimates of reliability of qualifications
Bramley, T. and Dhawan, V. (2011). Coventry: Ofqual (Ofqual/11/4826).

2010

Towards a suitable method for standard-maintaining in multiple-choice tests: capturing expert judgement of test difficulty through rank-ordering
Curcin, M., Black, B. and Bramley, T. (2010). Presented at the Association for Educational Assessment - Europe (AEA-Europe) annual conference, Oslo.
The interrelations of features of questions, mark schemes and examinee responses and their impact on marker agreement
Suto, I., Bramley, T. and Black, B. (2010). Presented at the European Conference on Educational Research (ECER), Helsinki.
Evaluating the rank-ordering method for standard maintaining
Bramley, T. and Gill, T. (2010). Research Papers in Education, 25(3), 293-317.
Locating objects on a latent trait using Rasch analysis of experts’ judgments
Bramley, T. (2010). Presented at the conference ‘Probabilistic models for measurement in education, psychology, social science and health’, Copenhagen.

2009

The effect of manipulating features of examinees' scripts on their perceived quality

Bramley, T. (2009). Paper presented at the AEA-Europe annual conference, Balzan, Malta, 5-7 November 2009.

Standard-maintaining by expert judgement: using the rank-ordering method for determining the pass mark on multiple-choice tests
Curcin, M., Black, B. and Bramley, T. (2009). Presented at the British Educational Research Association (BERA) annual conference.
Mark scheme features associated with different levels of marker agreement

Bramley, T. (2009). Mark scheme features associated with different levels of marker agreement. Research Matters: A Cambridge Assessment publication, 8, 16-23.

This research looked for features of question papers and mark schemes associated with higher and lower levels of marker agreement at the level of the item rather than the whole paper. First, it aimed to identify relatively coarse features of question papers and mark schemes that could apply across a wide range of subjects and be objectively coded by someone without particular subject expertise or examining experience. It then aimed to discover which features were most strongly related to marker agreement, to discuss any possible implications for question paper (QP) and mark scheme (MS) design, and to relate the findings to the theoretical framework summarised in Suto and Nadas (2007).
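
A minimal sketch of the item-level comparison this implies, using entirely invented double-marked data and a made-up binary mark scheme feature, might look like the following.

```python
# Invented data: exact-agreement rates between two markers per item, set
# against an assumed coarse mark scheme feature ("objective" or not).
import numpy as np

rng = np.random.default_rng(1)
n_scripts = 500
items = {"Q1": True, "Q2": True, "Q3": False, "Q4": False}  # objective MS?

for item, objective in items.items():
    marker_a = rng.integers(0, 4, n_scripts)   # first marker, marks out of 3
    disagree_p = 0.05 if objective else 0.30   # assumed effect of the feature
    flip = rng.random(n_scripts) < disagree_p
    shift = rng.choice([-1, 1], n_scripts)
    marker_b = np.where(flip, np.clip(marker_a + shift, 0, 3), marker_a)
    agreement = (marker_a == marker_b).mean()
    print(f"{item} (objective mark scheme: {objective}): "
          f"exact agreement {agreement:.2f}")
```

In the real study, of course, the agreement statistics came from examiners' live marking and the features were coded from actual question papers and mark schemes.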

2008

Alternative approaches to National Assessment at KS1, KS2 and KS3
Green, S., Bell, J.F., Oates, T. and Bramley, T. (2008)
Assessment Instruments over Time
Elliott, G., Black, B., Ireland, J., Gill, T., Bramley, T., Johnson, N. and Curcin, M. (2008). Presented at the International Association for Educational Assessment (IAEA) conference, Cambridge.
Mark scheme features associated with different levels of marker agreement
Bramley, T. (2008). Presented at the British Educational Research Association (BERA) annual conference.
How accurate are examiners’ judgments of script quality?
Gill, T. and Bramley, T. (2008). Presented at the British Educational Research Association (BERA) annual conference.
Investigating a judgemental rank-ordering method for maintaining standards in UK examinations
Black, B. and Bramley, T. (2008). Research Papers in Education, 23(3), 357-373.

2007

Paired comparison methods
Bramley, T. (2007). In P. Newton, J. Baird, H. Goldstein, H. Patrick and P. Tymms (Eds.), Techniques for monitoring the comparability of examination standards (pp. 246-294). London: QCA.

2006

Quality control of marking: Some models and simulations
Bell, J.F., Bramley, T., Claessen, M.J.A. and Raikes, N. (2006). Presented at the 32nd annual conference of the International Association for Educational Assessment (IAEA), Singapore, 21-26 May 2006.
Equating methods used in KS3 Science and English
Bramley, T. (2006). Presented at the NAA technical seminar, Oxford.

2005

Accessibility, easiness and standards
Bramley, T. (2005). Educational Research, 47(2), 251-261.
A Rank-Ordering Method for Equating Tests by Expert Judgement
Bramley, T. (2005). Journal of Applied Measurement, 6(2), 202-223.

2001

The Question Tariff Problem in GCSE Mathematics
Bramley, T. (2001). Evaluation and Research in Education, 15(2), 95-107.
MVAT 2000 - Statistical Report
Bramley, T. (2001)

1998

The effects of structure on the demands in GCSE and A Level questions
Pollitt, A., Hughes, S., Ahmed, A., Fisher-Hoch, H. and Bramley, T. (1998). London: QCA.
Assessing changes in standards over time using Thurstone Paired Comparisons
Bramley, T., Bell, J.F. and Pollitt, A. (1998). Education Research and Perspectives, 25(2), 1-23.
Investigating A-level mathematics standards over time
Bell, J.F., Bramley, T. and Raikes, N. (1998). Investigating A level mathematics standards over time. British Journal of Curriculum and Assessment, 8(2), 7-11.

1997

What makes GCSE examination questions difficult? Outcomes of manipulating difficulty of GCSE questions
Fisher-Hoch, H., Hughes, S. and Bramley, T. (1997). Presented at the British Educational Research Association (BERA) annual conference.
Standards in A level Mathematics 1986-1996
Bell, J.F., Bramley, T. and Raikes, N. (1997). Presented at the British Educational Research Association (BERA) annual conference, York, UK, 11-14 September 1997.

Research Matters

Research Matters is our free biannual publication which allows us to share our assessment research, in a range of fields, with the wider assessment community.