Resources

Centre for Corpus Research

The Centre for Corpus Research at Birmingham has a wide range of corpus resources and tools for research purposes.

Corpus and related resources developed at CCR

Bank of English

The Bank of English is now hosted on CQPweb, on a Birmingham server. This gives Birmingham staff and students access to the Bank of English and other corpora through a single interface. Registration requires a bham.ac.uk email address.

Please note that the previous version of the Bank of English, which was hosted on the Titania server, has been closed and is no longer available. At Birmingham, the Bank of English is now available only on CQPweb.

British Sign Language Corpus Project

Access to some of the video data and ELAN annotation files that form the British Sign Language (BSL) Corpus, based at University College London, is available here. The British Sign Language Corpus was created as a joint venture involving five UK universities during 2008-2011, led by Dr Adam Schembri, who is now based at the University of Birmingham and continues to work on corpus-based approaches to the study of BSL linguistics.

CLiC Dickens

CLiC is a web application for the corpus linguistic analysis of Dickens’s novels and other literary texts. The web app is being developed as part of the CLiC Dickens project, a collaboration between the University of Birmingham and the University of Nottingham, funded by the AHRC. Please see the CLiC Dickens project site.

Cobuild Grammar Patterns 1: Verbs

This is the full online version of a publication from the Cobuild Project.

Conceptualising Transitivity Networks through Pattern-Based Constructions

The Conceptualising Transitivity Networks through Pattern-Based Constructions project has brought together insights from three approaches to the lexis and grammar of English. From Corpus Linguistics and lexicography comes the concept of Pattern Grammar; from Cognitive Linguistics comes Construction Grammar; and from Systemic-Functional Linguistics comes the concept of the network as a model of choices in semantics and in lexicogrammar.

English Constructicon

The English Constructicon project aims to build a comprehensive inventory of grammatical constructions of the English language, following the principles of Construction Grammar theory.

EuroCoAT

EuroCoAT (European Corpus of Academic Talk) provides transcripts of academic conversations between undergraduate Erasmus students (L1 Spanish) and their lecturers at different host universities. The EuroCoAT project is a collaboration between the Universities of Extremadura, Birmingham and Limerick, Dalarna University and VU Amsterdam.

Other resources available at CCR

BNCweb

A web-based client program for searching and retrieving lexical, grammatical and textual data from the British National Corpus (BNC). Registration requires a bham.ac.uk email address.

CLAWS Part of Speech Tagger

The Centre has a licence for the CLAWS tagger (UCREL, Lancaster). Staff or students who are interested in POS-tagging large quantities of data for research should contact Paul Thompson.

Sketch Engine

Institutional access to the Sketch Engine interface is available from any computer on the University network, but not from outside it. Approximately 160 corpora are included.

WMatrix

The Centre also has a licence for the WMatrix suite of semantic and POS annotation and analysis tools developed by Paul Rayson (UCREL, Lancaster). Staff or students who are interested in using WMatrix for research should contact Paul Thompson.

Wordbanks Online

Institutional access to the Wordbanks Online service is available from any computer on the University network, but not from outside it. Wordbanks Online is the HarperCollins interface for the 550-million-word version of the Bank of English.

Activities and events

Corpus Linguistics Summer School

We organise the Corpus Linguistics Summer School annually. The webpages for the most recent summer schools are available below:

Sinclair Lecture

We also organise the Sinclair Lecture annually. The webpages for recent lectures are available below:

Archived online proceedings from previous Corpus Linguistics conferences