Victoria Crisp

My areas of research since joining Cambridge Assessment in 2000 have included question difficulty, question design and question writing, the effects of answer spaces on student responses, the use and purposes of annotations in examination marking, validity and validation, comparability issues and methods, and judgement processes in assessment. I have provided training for examiners and assessment professionals on question writing, both in the UK and abroad. My more recent research relates to question writing processes, and to question quality and comparability.

I have a BSc in Psychology from the University of Birmingham and an MA in Education from the Open University (conducted part-time whilst working at Cambridge Assessment). I later completed my PhD with the Institute of Education in London on the judgement processes involved in the assessment of coursework.

I am passionate about animal welfare and have volunteered for two animal rescue charities. I will soon be starting an MSc in Animal Behaviour in my spare time.

Publications

2018

Insights into teacher moderation of marks on high-stakes non-examined assessments

Crisp, V. (2018). Insights into teacher moderation of marks on high-stakes non-examined assessments. Research Matters: A Cambridge Assessment publication, 25, 14-20.

Where teachers assess their students’ work for high-stakes purposes, their judgements are standardised through professional discussions with their colleagues - a process often known as internal moderation. This process is important to the reliability of results as any inconsistencies in the marking standards applied by different teachers within a school department can be problematic.
This research used interviews, a questionnaire and observations of mock internal moderation sessions to explore internal moderation practices in the context of school-based work contributing to high-stakes assessments. Teachers’ discussions focused on the location and sufficiency of relevant evidence in student work. This, along with reference to the mark scheme and discussion of the meaning of assessment criteria, is consistent with Cook and Brown’s (1999) notion of tacit knowledge being made explicit and helping to create and refine ways of knowing. Thus, internal moderation acts as professional development for teachers as well as providing quality assurance. Around a quarter of teachers appeared not to have opportunities to participate in internal moderation. Teachers reported that moderation was infrequently influenced by group dynamics, felt that it removed personal bias, and tended to say that the process worked well.

A question of quality: Conceptualisations of quality in the context of educational test questions

Crisp, V., Johnson, M. and Constantinou, F. (2018). A question of quality: Conceptualisations of quality in the context of educational test questions. Research in Education (ahead of print).

2017

Exploring the relationship between validity and comparability in assessment
Crisp, V. (2017). Exploring the relationship between validity and comparability in assessment. London Review of Education, 15(3), 523-535.
Multiple voices in tests: towards a macro theory of test writing
Constantinou, F., Crisp, V. and Johnson, M. (2017). Multiple voices in tests: towards a macro theory of test writing. Cambridge Journal of Education (ahead of print).
How do question writers compose external examination questions? Question writing as a socio-cognitive process
Johnson, M., Constantinou, F. and Crisp, V. (2017). How do question writers compose external examination questions? Question writing as a socio-cognitive process. British Educational Research Journal (BERJ). 43(4), 700-719.
Spoilt for choice? Issues around the use and comparability of optional exam questions
Bramley, T. and Crisp, V. (2017). Spoilt for choice? Issues around the use and comparability of optional exam questions. Assessment in Education: Principles, Policy & Practice (ahead of print).
The judgement processes involved in the moderation of teacher-assessed projects.
Crisp, V. (2017). The judgement processes involved in the moderation of teacher-assessed projects. Oxford Review of Education 43(1), 19-37.

2016

How do question writers compose examination questions? Question writing as a socio-cognitive process
Johnson, M., Constantinou, F. and Crisp, V. (2016). Paper presented at the AEA-Europe annual conference, Limassol, Cyprus, 3-5 November 2016
'Question quality': The concept of quality in the context of exam questions
Crisp, V., Johnson, M. and Constantinou, F. (2016). Paper presented at the AEA-Europe annual conference, Limassol, Cyprus, 3-5 November 2016
Writing questions for examination papers: a creative process?
Constantinou, F., Crisp, V. and Johnson, M. (2016). Paper presented at the 8th Biennial Conference of the European Association for Research in Learning and Instruction (EARLI) SIG 1 - Assessment and Evaluation, Munich, Germany, 24-26 August 2016

2015

Exploring the difficulty of mathematics examination questions for weaker readers
Crisp, V. (2015). Educational Studies, 41(3), 276-292.
Validity and comparability of assessment: how do these concepts relate?
Crisp, V. (2015) Paper presented at the British Educational Research Association (BERA) conference, Belfast, 15-17 September 2015

2014

Cultural and societal factors in high-performing jurisdictions

Crisp, V. (2014). Cultural and societal factors in high-performing jurisdictions. Research Matters: A Cambridge Assessment publication, 17, 29-41.

This article aims to provide insights into some of the cultural and societal contextual factors that influence education systems, using a number of high-performing jurisdictions (HPJs) as case studies. Consideration of the education and assessment systems of HPJs around the world has become a strategy of some interest during education reform and/or development. However, it has been noted that, when doing so, societal and cultural features of the jurisdictions need to be considered (e.g. Elliott and Phuong-Mai, 2008; Alexander, 2010; Oates, 2010; Barber, Donnelly and Rizvi, 2012). The effects of a particular educational system may well be influenced by such factors, and as a result the system of one jurisdiction will not necessarily transfer its educational and achievement benefits if simply replicated in a jurisdiction undergoing change.

This article was written using various secondary sources, including relevant articles, books and reports, newspaper articles, blog posts and other online material. A number of researchers have previously summarised and analysed the features of HPJs, including some of the cultural factors, to identify possible reasons for the high achievements of their students (at least on some of the measures that have been influential, such as PISA, TIMSS and PIRLS). Such work, key examples being that of the Center on International Education Benchmarking and the Organisation for Economic Co-operation and Development (OECD) book Lessons from PISA for the United States: Strong Performers and Successful Reformers in Education, was particularly useful to the current article.

Six jurisdictions were chosen as the focus for this exploration of cultural and societal factors: Alberta (Canada), Shanghai (China), Hong Kong, Singapore, Victoria (Australia), and New Zealand. A few additional jurisdictions for which cultural issues of interest were noted during the literature review are also mentioned briefly.

2013

Teacher views on the effects of the change from coursework to controlled assessment in GCSEs
Crisp, V. & Green, S. (2013) Educational Research and Evaluation: An International Journal on Theory and Practice, 19(8), 680-699.
Modelling question difficulty in an A Level Physics examination
Crisp, V. & Grayson, R. (2013) Research Papers in Education, 28(3), 346–372.
Criteria, comparison and past experiences: How do teachers make judgements when marking coursework?
Crisp, V. (2013) Assessment in Education: Principles, Policy & Practice, 20(1), 127-144

2012

A framework for evidencing assessment validity in large-scale, high-stakes international examinations
Shaw, S., Crisp, V. and Johnson, N. (2012) Assessment in Education: Principles, Policy and Practice, 19(2), 159-176
Applying methods to evaluate construct validity in the context of A level assessment
Crisp, V. and Shaw, S. (2012) Educational Studies, 38(2), 209-222
Controlled assessments in 14-19 Diplomas: Implementation and effects on learning experiences
Crisp, V. and Green, S. (2012) Educational Research and Evaluation, 18(4), 333-351
An investigation of rater cognition in the assessment of projects
Crisp, V. (2012) Educational Measurement: Issues and Practice, 31(3), 10-20
The effects of features of examination questions on the performance of students with dyslexia
Crisp, V., Johnson, M. and Novakovic, N. (2012) British Educational Research Journal, 38(5), 813-839.

2011

Item difficulty modelling: exploring the usefulness of this technique in a European context
Hopkin, R. and Crisp, V. Paper presented at the AEA-Europe annual conference, Belfast, November 2011.
Translating validation research into everyday practice: issues facing an international awarding body
Shaw, S. and Crisp, V. (2011). Paper presented at the 12th Annual Conference of the Association for Educational Assessment in Europe, Belfast, 10-12 November 2011.
The judgement processes involved in the assessment of project work by teachers
Crisp, V. (2011). Paper presented at the 12th Annual Conference of the Association for Educational Assessment in Europe, Belfast, 10-12 November 2011.
Modelling question difficulty in an A level Physics examination
Crisp, V. and Hopkin, R. (2011). Paper presented at the British Educational Research Association annual conference, London
How valid is A level Physics? A wide-ranging evaluation of the validity of Physics A level assessments
Crisp, V. and Shaw, S. (2011). Paper presented at the British Educational Research Association annual conference, University of London Institute of Education, September 2011.
Practical issues in early implementation of the Diploma Principal Learning

Crisp, V. and Green, S. (2011). Practical issues in early implementation of the Diploma Principal Learning. Research Matters: A Cambridge Assessment publication, 12, 10-13.

This short article reports on some of the findings from an interview study conducted in the first year of implementation of the 14–19 Diplomas. The Diplomas were introduced by the Labour government as part of wider educational reforms (DfES, 2005a, 2005b). They were designed to prepare young people for the world of work or for independent study, and were intended to combine theoretical and applied learning, provide different ways of learning, encourage students to develop skills valued by employers and universities, and provide opportunities for students to apply skills to realistic work situations. They were also intended to help ensure that a wide range of appropriate learning pathways was available to young people, thus facilitating increased participation and attainment. The Diplomas are available at Levels 1, 2 and 3 and, rather than being taught by an individual school or college, are delivered through consortia consisting of small groups of schools and/or colleges working collaboratively. The Diploma is a composite qualification made up of the following elements: principal learning; generic learning; additional and specialist learning.

The current research focused on the Principal Learning (PL). The PL components are specific to a domain or ‘line of learning’ and emphasise learning through experience of simulated or real work contexts and through applying and practically developing skills, alongside theoretical learning. The PL components are assessed predominantly via assignments which are internally marked and externally moderated. Teaching of Diplomas in the first five ‘lines of learning’ began in September 2008, with a further five beginning in September 2009 and four in September 2010.

Six consortia running Phase 1 Diplomas in the first year of implementation took part in this research. At each consortium, one or more teachers and (in all but one case) a number of learners were interviewed about the learning that was occurring and various practicalities around implementation of the Diploma. This article reports on the latter.

An investigation of rater cognition in the assessment of projects
Crisp, V. (2011) American Educational Research Association (AERA) Annual Meeting, New Orleans
Tracing the evolution of validity in educational measurement: past issues and contemporary challenges

Shaw, S. and Crisp, V. (2011). Tracing the evolution of validity in educational measurement: past issues and contemporary challenges. Research Matters: A Cambridge Assessment publication, 11, 14-17.

Validity is not a simple concept in the context of educational measurement. Measuring the traits or attributes that a student has learnt during a course is not like measuring an objective property such as length or weight; measuring educational achievement is less direct. Yet, educational outcomes can have high stakes in terms of consequences (e.g., affecting access to further education), thus the validity of assessments is highly important.

Tracing this trajectory of evolution, particularly through key documents such as the validity/validation chapters in editions of Educational Measurement (Cureton, 1951; Cronbach, 1971; Messick, 1989; Kane, 2006) and the Standards for Educational and Psychological Testing (AERA, APA and NCME, 1954/1955, 1966, 1974, 1985, 1999), has been important to us as part of work to develop an approach to validation for general assessments.

The concept of validity is not a new one. Conceptualisations of validity are apparent in the literature from around the turn of the twentieth century, and they have evolved significantly since that time. The earliest conceptions treated validity as a static property captured by a single statistic, usually an index of the correlation of test scores with some criterion (Pearson, 1896; Binet and Henri, 1899; Spearman, 1904; Binet, 1905). Through various re-conceptualisations, contemporary validity theory generally sees validity as concerning the appropriateness of the inferences drawn from assessment outcomes and the uses made of them, including some consideration of the consequences of test score use. This article traces the progress and changes in the theorisation of validity over time and the issues that led to these changes.

2010

Towards a model of the judgement processes involved in examination marking
Crisp, V. (2010) Oxford Review of Education, 36(1), 1-21
Judging the grade: exploring the judgement processes involved in examination grading decisions
Crisp, V. (2010) Evaluation and Research in Education, 23(1), 19-35
How valid are A levels? Findings from a multi-method validation study of an international A level in geography
Shaw, S. and Crisp, V. (2010) Association for Educational Assessment (AEA) - Europe, Oslo
A new model of assessment for 14 to 19 year olds: What do students and their teachers think of Diploma assessments?
Crisp, V. and Green, S. (2010) Association for Educational Assessment (AEA) - Europe, Oslo
The effects of controlled assessments in the new Diplomas on students' learning experiences
Crisp, V. and Green, S. (2010) A paper presented at the Chartered Institute of Educational Assessors Annual Conference, London, October 2010.
How hard can it be? Issues and challenges in the development of a validation method for traditional written examinations
Crisp, V. and Shaw, S. (2010) International Association for Educational Assessment (IAEA) Conference, Bangkok
Developing and piloting a framework for the validation of A levels

Shaw, S. and Crisp, V. (2010). Developing and piloting a framework for the validation of A levels. Research Matters: A Cambridge Assessment publication, 10, 44-47.

Validity is a key principle of assessment, a central aspect of which relates to whether the interpretations and uses of test scores are appropriate and meaningful (Kane, 2006). For this to be the case, various criteria must be met, such as good representation of the intended constructs and avoidance of construct-irrelevant variance. Additionally, some conceptualisations of validity include consideration of the consequences that may result from the assessment, such as effects on classroom practice. The kinds of evidence needed may vary depending on the intended uses of assessment outcomes. For example, if assessment results are designed to be used to inform decisions about future study or employment, it is important to ascertain that the qualification acts as suitable preparation for this study or employment, and to some extent predicts likely success.

This article reports briefly on the development, piloting and revision of a framework and methodology for validating general academic qualifications such as A levels. The development drew on previously proposed validation frameworks from the literature, and the resulting framework and set of methods were piloted with International A level Geography. This led to revisions to the framework, which was subsequently used with A level Physics.

2009

Are all assessments equal? The comparability of demands of college-based assessments in a vocationally-related qualification
Crisp, V. and Novakovic, N. (2009) Research in Post-Compulsory Education, 14(1), 1-18
Is this year's exam as demanding as last year's? Using a pilot method to evaluate the consistency of examination demands over time
Crisp, V. and Novakovic, N. (2009) Evaluation and Research in Education, 22(1), 3-15
A proposed framework for evidencing assessment validity in large-scale, high-stakes international examinations
Shaw, S., Crisp, V. & Johnson, N. (2009) Association for Educational Assessment (AEA) - Europe, Malta
An exploration of the effect of pre-release examination materials on classroom practice in the UK
Johnson, M. and Crisp, V. (2009) Research in Education, 82, 47-59
What was this student doing? Evidencing validity in A level assessments
Shaw, S. and Crisp, V. (2009) British Educational Research Association (BERA) Annual Conference
Objective questions in science GCSE: Exploring question difficulty, item functioning and the effect of reading difficulties
Crisp, V. (2009) British Educational Research Association (BERA) Annual Conference

2008

Exploring the nature of examiner thinking during the process of examination marking
Crisp, V. (2008) Cambridge Journal of Education, 38(2), 247-264
Tales of the expected: the influence of students’ expectations on question validity and implications for writing exam questions
Crisp, V., Sweiry, E., Ahmed, A. and Pollitt, A. (2008) Educational Research, 50(1), 95-115
Judging the grade: an exploration of the judgement processes involved in A level examination grading decisions: BERA abstract
Crisp, V. (2008) British Educational Research Association (BERA) Annual Conference
Towards a methodology for evaluating the equivalency of demands in vocational assessments between colleges/training providers: IAEA abstract
Crisp, V. & Novakovic, N. (2008) International Association for Educational Assessment (IAEA) Conference, Cambridge
A case of positive washback: an exploration of the effect of pre-release examination materials on classroom practice: ECER abstract
Johnson, M. & Crisp, V. (2008) European Conference on Educational Research (ECER), Gothenburg
Are all assessments equal? The comparability of demands of college-based assessments in a vocationally-related qualification: BERA abstract
Crisp, V. and Novaković, N. (2008) British Educational Research Association (BERA) Annual Conference
Do assessors pay attention to appropriate features of student work when making assessment judgements?

Crisp, V. (2008).  Do assessors pay attention to appropriate features of student work when making assessment judgements? Research Matters: A Cambridge Assessment publication, 6, 5-9.

It is via the judgements of appropriate experts that assessment decisions are made, yet the actual thought processes involved during marking or grading are under-researched. This article draws on a study of the cognitive and socially-influenced processes involved in marking and grading A level geography examinations and pilot research into the marking of GCSE coursework by teachers. This data was used to investigate whether assessors pay attention to appropriate features of student work.

Verbal protocols of assessors thinking aloud whilst marking and grading work were collected, and measures of marker agreement were obtained. The protocols were analysed in detail using appropriate coding schemes. From the behaviours identified, a tentative model of the marking process was developed, within which features of student work affecting judgements, and social and personal reactions, were identified. Whilst many of the features that appeared to influence evaluations were clearly focussed on the criteria intended for evaluation, some were not. Reactions to language use or legibility (when communication was not being assessed), personal or emotional responses, and social responses sometimes occurred before marking decisions. The article discusses whether such responses could explain variations in the marks given by different examiners.

A review of literature regarding the validity of coursework and the rationale for its inclusion in the GCSE

Crisp, V. (2008). A review of literature regarding the validity of coursework and the rationale for its inclusion in the GCSE. Research Matters: A Cambridge Assessment publication, 5, 20-24.

Coursework was included in many GCSEs from their introduction in 1988 to increase the validity of assessment by providing wider evidence of student work and to enhance pupil learning by valuing skills such as critical thinking and independent learning (SEC, 1985). As the Secondary Examinations Council put it ‘above all, the assessment of coursework can correspond much more closely to the scale of values in this wider world, where the individual is judged as much by his or her style of working and ability to cooperate with colleagues as by the eventual product’ (SEC, 1985, p. 6).

The validity and reliability of the assessment of GCSE coursework have come under much discussion since its introduction, with the focus of concerns changing over time. At the inception of the GCSE, the main threats anticipated were possible unreliability of teacher marking, possible cheating, and concern that girls were favoured (see QCA, 2006a). More recently, concerns about consistency across similar subjects, fairness and authenticity (including internet plagiarism and excessive assistance from others), tasks becoming overly structured (and hence reducing learning benefits), and the overall burden on students across subjects have led to a review of coursework by the Qualifications and Curriculum Authority (QCA).

This article reviews relevant literature using the stages of assessment described by Crooks, Kane and Cohen (1996) to structure discussion of possible improvements to the validity of assessment as a result of including a coursework element within GCSE specifications and possible threats to validity associated with coursework.

Investigating the judgemental marking process: an overview of our recent research

Suto, I., Crisp, V. and Greatorex, J. (2008). Investigating the judgemental marking process: an overview of our recent research. Research Matters: A Cambridge Assessment publication, 5, 6-9.

This article gives an overview of a number of linked studies which explored the process of marking GCSE and A-level examination questions from a number of different angles. Key aims of these studies were to provide insights into how examiner training and marking accuracy could be improved, as well as reasoned justifications for how item types could be assigned to different groups of examiners in the future. The research studies combined several approaches, exploring both the information that people attend to when marking items and the sequences of mental operations involved. Examples include studies that used the think-aloud method to identify the cognitive marking strategies entailed in marking student responses, or to explore the broader socio-cognitive influences on the marking process. Other examples explored the relationship between cognitive marking strategy complexity and marking accuracy.

This article brings together the findings from these various related studies to summarise the influences and processes that have been identified as important to the marking process from the research conducted so far.

2007

The demands of exam syllabuses and question papers
Pollitt, A., Ahmed, A. and Crisp, V. (2007) In P. Newton, J. Baird, H. Goldstein, H. Patrick and P. Tymms (Eds.), Techniques for monitoring the comparability of examination standards. London: QCA.
The use of annotations in examination marking: opening a window into markers’ minds
Crisp, V. and Johnson, M. (2007) British Educational Research Journal, 33(6), 943–961
The effects of features of GCSE questions on the performance of students with dyslexia
Crisp, V., Johnson, M. and Novakovic, N. (2007) British Educational Research Association (BERA) Annual Conference
Do assessors pay attention to appropriate features of student work when making assessment judgements?
Crisp, V. (2007) International Association for Educational Assessment (IAEA) Conference, Azerbaijan

2006

Does a gap to fill lead to gap-filling?
Crisp, V. (2006) British Educational Research Association (BERA) Annual Conference
Can a picture ruin a thousand words? The effects of visual resources in exam questions
Crisp, V. and Sweiry, E. (2006) Educational Research, 48(2), 139-154

2005

Constructing meaning from school mathematics texts: potential problems and the effect of teacher mediation
Crisp, V. (2005) British Educational Research Association (BERA) Annual Conference
The use of annotations in examination marking: opening a window into markers' minds
Crisp, V. and Johnson, M. (2005) British Educational Research Association (BERA) Annual Conference
The PePCAA project: Formative scenario-based CAA in psychology for teachers
Crisp, V. and Ward, C. (2005) Ninth International Computer Assisted Assessment Conference, Loughborough University

2004

Could Comparative Judgements Of Script Quality Replace Traditional Marking And Improve The Validity Of Exam Questions?
Pollitt, A. and Crisp, V. (2004) British Educational Research Association (BERA) Annual Conference

2003

Can a picture ruin a thousand words? Physical aspects of the way exam questions are laid out and the impact of changing them.
Crisp, V. and Sweiry, E. (2003) British Educational Research Association (BERA) Annual Conference

2002

Tales of the Expected: The Influence of Students’ Expectations on Exam Validity
Sweiry, E., Crisp, V., Ahmed, A. and Pollitt, A. (2002) British Educational Research Association (BERA) Annual Conference

Research Matters

Research Matters is our free biannual publication which allows us to share our assessment research, in a range of fields, with the wider assessment community.