University of Birmingham campus, Old Joe Clock Tower

Tracking language use across time: 25 years of innovation

The 2026 Annual Sinclair Lecture: Professor Andrew Kehoe (Birmingham City University)
    • Date
      Monday, 6 July 2026, 18:00 - Wednesday, 6 May 2026, 19:00 (UK)
    • Format
      Online or in person
    • Location
      Lecture Room G03, ground floor, Alan Walters Building, University of Birmingham, Edgbaston, Birmingham, B15 2SB

Speaker: Professor Andrew Kehoe, Birmingham City University

In this lecture, I build upon Sinclair’s concept of a ‘monitor corpus’, which he described in Corpus, Concordance, Collocation as “one which has no final extent because, like the language itself it keeps on developing” (1991: 25). In the Research & Development Unit for English Studies at BCU we have, for the past 25 years, been tracking language change in an ever-growing corpus of UK news data which currently contains 2.1 billion words from 3.3 million articles published between 1989 and 2025.

The lecture offers linguistic insights gained from a series of collaborative research projects based on this news corpus. I describe the statistical tests and visualisations we have developed to detect upward or downward trends in word use, sudden jumps in frequency, and even seasonal variation. I demonstrate how it is often possible to explain such changes by finding corresponding changes in the collocational behaviour of the word in question over time. I also reflect on how, as our corpus has grown, we have had to refine our methods to account for the fact that, over a longer period, a word may exhibit more than one pattern of frequency change. News text has proven to be an abundant and reliable source of language data over time, enabling several decades of corpus analysis and innovation in language research.

In the final part of the lecture, I describe recent work exploring the impact of Large Language Models (LLMs) and associated Artificial Intelligence (AI) tools on language innovation. This research asks ChatGPT to summarise a subset of articles from our news corpus and then to produce its own articles of similar lengths to the originals based only on the summaries. In comparing the AI-generated articles with the human-authored originals, I ask whether language will begin to ‘stagnate’, meaning that we will see a reduction in change over time due to the proficiency of LLMs in replicating the content used in their training.

Speaker biography

I have long-term research interests in diachronic linguistic study and in harnessing the web as a source of language data, beginning with the WebCorp project in 2000. More recently, I have explored a wider range of online sources, publishing on the language of eBay listings, online support forums, corporate posts on Twitter/X, and government messaging during the COVID pandemic. I have also carried out corpus pragmatic studies of swearing and apologies. My research has always been collaborative, and credit for the primary research discussed in this lecture is shared with Antoinette Renouf and Matt Gee, as well as with our statistical adviser Paul Davies.

The Annual Sinclair Lectures 

The Annual Sinclair Open Lecture honours the memory of Professor John Sinclair, who held the Chair of Modern English Language at the University of Birmingham from 1965 to 2000 and who was an internationally renowned and influential figure in the world of Linguistics.

Location

Address
Lecture Room G03, ground floor, Alan Walters Building, University of Birmingham, Edgbaston, Birmingham, B15 2SB