Top GCSE news stories

August 2018


More than half a million students in England and Wales will receive their GCSE results this month. The release of results, as well as the exams themselves, is well established in the media calendar and is widely reported. The Global Database of Events, Language, and Tone (GDELT) Project monitors and analyses the world's news media, extracting and storing information, such as key actors, themes and sentiment, from each piece of content. Access to the GDELT Project is via an open platform, allowing anyone to use its many datasets, which together total trillions of data points. In this Data Byte we use this resource to look at the volume and tone of articles which mentioned GCSEs over the past year and highlight the stories that received the most media attention.

What does the chart show?

The data presented in these charts was extracted from the GDELT Project using the GDELT V2 Full Text API.
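As an illustration of the kind of request involved, the sketch below builds query URLs for a tone timeline and an article-count timeline. It assumes that the API referred to above is GDELT's DOC 2.0 endpoint and that its `query`, `mode`, `format`, `startdatetime` and `enddatetime` parameters are used. The helper name `build_query_url` and the exact date window are our own choices for this example, and the range of dates the API will actually serve may be narrower than the one shown.

```python
from urllib.parse import urlencode

# Assumed endpoint: GDELT's DOC 2.0 full-text search API.
GDELT_DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_query_url(term, mode, start, end, fmt="json"):
    """Build a request URL (hypothetical helper, not part of GDELT).

    start and end use the YYYYMMDDHHMMSS format.
    """
    params = {
        "query": term,
        "mode": mode,
        "format": fmt,
        "startdatetime": start,
        "enddatetime": end,
    }
    return f"{GDELT_DOC_API}?{urlencode(params)}"


# One URL for the tone timeline, one for raw article counts.
tone_url = build_query_url(
    "GCSE", "timelinetone", "20170701000000", "20180731235959"
)
volume_url = build_query_url(
    "GCSE", "timelinevolraw", "20170701000000", "20180731235959"
)
```

The URLs can then be fetched with any HTTP client. Building them separately from fetching makes the query easy to inspect before any request is sent.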

The top chart shows the ‘tone’ or sentiment expressed by articles published online from July 2017 to July 2018 which mentioned the word ‘GCSE’. The tone was determined by the GDELT Project and is presented as the sentiment expressed by the articles published each day relative to the total sentiment expressed in all the articles published over the whole year. Presenting the tone in this way enables clear comparisons of daily sentiment within the year, as the effect of a few strongly polar articles is minimised. Values greater than zero indicate that the articles published that day contained more words with a positive emotional connotation, while values below zero indicate greater negativity in coverage. Tone values close to zero indicate the use of non-emotive language, or positive and negative elements which cancel each other out.
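The exact formula behind this relative measure is not given here. Purely as an illustration, one possible reading is sketched below: each day's summed tone is divided by the total absolute tone of all articles in the year. Both the formula and the function name `relative_daily_tone` are assumptions made for this example and may differ from the calculation used for the chart.

```python
from collections import defaultdict


def relative_daily_tone(articles):
    """Return each day's summed tone as a share of the year's total.

    articles: iterable of (date_string, tone) pairs, one per article.
    Assumption: the denominator is the sum of absolute tone values
    across every article, so results fall between -1 and 1.
    """
    daily_sum = defaultdict(float)
    total_abs = 0.0
    for day, tone in articles:
        daily_sum[day] += tone
        total_abs += abs(tone)
    if total_abs == 0:
        # No emotive content at all: every day is neutral.
        return {day: 0.0 for day in daily_sum}
    return {day: value / total_abs for day, value in daily_sum.items()}
```

Under this reading, a positive value still means the day's coverage leaned positive and a negative value that it leaned negative, as described above.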

The lower chart shows the number of articles published per day which mentioned the word ‘GCSE’. Days on which more than 100 articles were published are highlighted in grey on both charts. Hovering over the grey bars displays the three most frequently occurring bigrams (see note 1) in the headlines of the articles published on those days.
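The counting and highlighting step can be sketched in a few lines. This is a minimal example, assuming each article has been reduced to its publication date as a string. The names `articles_per_day` and `highlighted_days` are hypothetical and chosen for this illustration only.

```python
from collections import Counter

# Days with strictly more than this many articles are highlighted.
HIGHLIGHT_THRESHOLD = 100


def articles_per_day(publication_dates):
    """Count how many articles were published on each date."""
    return Counter(publication_dates)


def highlighted_days(counts, threshold=HIGHLIGHT_THRESHOLD):
    """Return, in sorted order, the dates whose count exceeds the threshold."""
    return sorted(day for day, n in counts.items() if n > threshold)
```

Note that the comparison is strict, so a day with exactly 100 articles would not be highlighted.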

Applying computational methods to textual data sources allows for the rapid processing of large datasets. This can provide insight into collections of data which would be impractical to analyse manually. However, these techniques are not without their limitations. For example, they may not be sensitive to nuances in language, and so may interpret the sentiment of an article differently to a human.

Additionally, because we searched for articles which contain the word ‘GCSE’, our dataset includes articles which make only a passing reference to GCSEs, as well as those which focus on them.

Why is the chart interesting?

Throughout most of the year, the overall tone per day of articles mentioning GCSEs is close to neutral. Unsurprisingly, however, around the release of results, articles become strongly positive as students’ success is celebrated.

In total, we retrieved more than 9,900 articles which mentioned the word ‘GCSE’. The three domains which published the most articles were, and The median number of articles published per day was 18, showing that GCSEs are habitually referred to in the media. However, there are significant peaks on certain days and, as might be expected, the most articles mentioning GCSEs were published on the 24th and 25th August 2017, the day before, and the day of, the release of GCSE results. There were also large numbers of articles published on the 16th and 17th August, the day before, and the day of, the release of A level results. Coverage on these days was likely to include a reference to the forthcoming GCSE results. Beyond results release dates, there were spikes in the numbers of articles at several points throughout the year. The main stories on days on which more than 100 articles mentioning GCSEs were published were:

25/10/2017: MPs sent a letter to the vice chancellors of Oxford and Cambridge urging them to take action over social mobility.

28/11/2017: Alan Milburn made comments on a Social Mobility Commission report. Additionally, Ofqual changed the rules around GCSE Computer Science non-exam assessment following fears of answers being shared online.

25/01/2018: The 2017 secondary school performance tables in England were published by the Department for Education.

22/02/2018: The What Kids are Reading report 2017 was published. This study showed that some GCSE students are not reading suitably challenging books, and some may have a reading age of 13 or lower.

09/03/2018: A speech on education reform that Carl Ward, president of the Association of School and College Leaders, was due to give to the union’s annual conference was previewed. This was the day on which the tone of coverage was most negative.

09/06/2018: Coverage followed claims by The Guardian that the Army had been targeting recruitment material at “stressed and vulnerable” 16-year-olds via social media around GCSE results day. Overall, articles published on this day were more negative in tone.

04/07/2018: The Advertising Standards Authority (ASA) ruled that a company had broken ASA rules which ban campaigns for products high in fat, salt or sugar directed at children. One of the adverts the ASA banned showed a cartoon character celebrating their GCSE results. This was another day on which the coverage was negative in tone.

Articles in the media may affect the general public’s view of exams and their standards and so this is an area of interest for Cambridge Assessment. This Data Byte gives an insight into media coverage of GCSEs over the last year. Our colleague Matthew Carroll has analysed articles focussing on GCSEs published in the British press from 1988 until 2017. Using a variety of custom methods, he has examined how the words used, sentiments expressed and topics covered have changed in that time and will be presenting his findings at the 44th International Association for Educational Assessment annual conference in September.

Further information

Further details of the GDELT Project can be found at

1. A bigram is a consecutive sequence of two words. To make these as informative as possible, we reduced the words in the headlines to their base form (a process known as lemmatisation) and removed common, non-informative words before identifying the bigrams.
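
The steps described in this note can be sketched as follows. This is a minimal illustration only: the small `LEMMAS` lookup table and `STOPWORDS` set are hypothetical stand-ins for a full lemmatiser and stop-word list, and the function names are our own. The tools actually used for the charts are not specified here.

```python
import re
from collections import Counter

# Hypothetical stand-ins for a real stop-word list and lemmatiser.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "for", "and", "is", "are"}
LEMMAS = {"results": "result", "students": "student", "grades": "grade"}


def headline_bigrams(headline):
    """Lemmatise, drop stop words, then pair up neighbouring words."""
    tokens = re.findall(r"[a-z]+", headline.lower())
    tokens = [LEMMAS.get(token, token) for token in tokens]
    tokens = [token for token in tokens if token not in STOPWORDS]
    return list(zip(tokens, tokens[1:]))


def top_bigrams(headlines, n=3):
    """Return the n most frequent bigrams across a set of headlines."""
    counts = Counter()
    for headline in headlines:
        counts.update(headline_bigrams(headline))
    return counts.most_common(n)
```

Because stop words are removed before pairing, two words can form a bigram even if a common word originally stood between them.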