CL2017 Pre-conference workshop 6

University of Birmingham, Monday 24th July 2017, 09:30 - 12:30

Corpus approaches to health communication

Workshop convenors


09.30 - 09.45: Trajectories in health communication and corpus linguistic research Daniel Hunt, Gavin Brookes & Kevin Harvey
09.45 - 10.10: How to get from a 4 to a 5? An analysis of NHS patient feedback about GP surgeries Paul Baker
10.10 - 10.35: A corpus assisted study of self-reflection and self-monitoring in infertility blogs, UK press articles and clinic testimonials Karen Kinloch
10.35 - 11.00: Corpus stylistics and epilepsy: communicating experience Jennifer Sanchez-Davies
11.00 - 11.25: A corpus-based assessment of a diagnostic pain questionnaire Elena Semino, Andrew Hardie & Joanna Zakrzewska
11.25 - 11.50: “The disease is the enemy, not you”: Identity construction on an online forum for people with dementia Annika Bailey
11.50 - 12.15: On ‘bad’ mothers and hormonal imbalances: Comparing discursive constructions of postnatal depression in lay, media and medical accounts Sylvia Jaworska & Karen Kinloch


How to get from a 4 to a 5? An analysis of NHS patient feedback about GP surgeries
Paul Baker (Lancaster University)

The British National Health Service (NHS) gathers a great deal of user feedback on its services. This feedback plays a vital role in the design and improvement of contemporary health care services, with the results of such exercises routinely used to regulate standards and stimulate improvements in health care provision. This talk describes part of an ESRC-funded Knowledge Exchange project to examine online patient feedback. I discuss the analysis of an 8.5 million word corpus of feedback to the NHS about GP surgeries, in which every patient provided free-text feedback as well as rating their experience on a scale of 1-5. The research question addressed is “What types of language are associated with the different points on the rating scale?” The keyword technique was adapted to identify lexis associated with each rating point, and concordance analysis was then employed to make sense of how that lexis contributed to the expression of different levels of satisfaction. As well as indicating behaviours which attract different ratings, the analysis revealed different writing strategies at the level of discourse for different ratings. For example, feedback rated as 1 tended to contain more keywords indicative of a narrative discourse structure (use of speech-reporting verbs like told and asked, as well as quote marks and the conjunction then). The analysis indicates nuanced differences in the language used in adjacent ratings and could be implemented for the automatic identification of patient satisfaction, as well as informing the NHS specifically about how a good practice can become an excellent one.
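The abstract does not say how the keyword technique was adapted, so the following is only a minimal sketch of one common way to compute keywords: log-likelihood keyness, comparing the feedback given one rating against the feedback given all other ratings. The function names, the whitespace tokenisation and the choice of reference corpus are assumptions made for illustration, not the method used in the talk.

```python
import math
from collections import Counter


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness score for one word (target vs. reference corpus)."""
    total = freq_target + freq_ref
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)
    score = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


def keywords_for_rating(feedback, rating, top_n=5):
    """Rank words that are relatively more frequent in one rating's feedback.

    `feedback` is a list of (rating, text) pairs. Assumption: feedback with
    every other rating serves as the reference corpus.
    """
    target, reference = Counter(), Counter()
    for score, text in feedback:
        tokens = text.lower().split()  # naive tokenisation, for illustration
        (target if score == rating else reference).update(tokens)
    size_target = sum(target.values())
    size_ref = sum(reference.values())
    if size_target == 0 or size_ref == 0:
        return []
    scored = []
    for word, freq in target.items():
        # Keep only "positive" keywords: relatively overused in the target.
        if freq / size_target > reference[word] / size_ref:
            scored.append(
                (word, log_likelihood(freq, size_target, reference[word], size_ref))
            )
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Invented toy data, not taken from the NHS corpus.
toy_feedback = [
    (1, "she told me to wait then he asked me to leave"),
    (1, "i was told to call back then told to come in"),
    (5, "lovely friendly staff and a helpful doctor"),
    (5, "friendly helpful and caring staff"),
]
print(keywords_for_rating(toy_feedback, rating=1))
```

In this sketch a higher score means a word is more strongly associated with the chosen rating; a real study would also apply a minimum frequency and a significance cut-off.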

A corpus assisted study of self-reflection and self-monitoring in infertility blogs, UK press articles and clinic testimonials
Karen Kinloch (Edge Hill University)

Infertility occupies a problematic position as both a social and medical issue (Greil et al. 2010) and, despite the media, medical and personal texts which proliferate around it, there is currently little linguistic research into this topic, particularly in the UK. This paper examines representations of people who experience infertility, in light of neoliberal expectations around health, personal responsibility and self-monitoring (Brown and Baker 2012). The data comprise three specially built corpora of texts on infertility: UK newspaper articles containing the term infertility (NEWS Corpus), websites for fertility clinics (CLINIC Corpus) and UK blogs written by people experiencing infertility (BLOG Corpus) in the period 2006-2012. Within a mixed-methods, corpus-assisted discourse analytical approach (Baker, 2006), WordSmith Tools was used to elicit 100 thematically categorised keywords from each corpus and provide a starting point for closer concordance analysis of salient linguistic features. In the case of self-reflection, the keywords for close analysis are those classified by Biber (1988:105) as “private verbs”, “used for the overt expression of private attitudes, thoughts, and emotions”: FEEL, THINK, and KNOW. Analysis of these concordances in the BLOG Corpus revealed the ways in which people experiencing infertility use self-reflection to explore their experiences of infertility and monitor themselves in line with societal expectations about reproduction. Comparison with the NEWS and CLINIC corpora indicates that these linguistic markers of self-reflection are recontextualised for personal interest stories and patient testimonials.
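As a rough illustration of what a concordance for the private verbs looks like, here is a minimal keyword-in-context (KWIC) sketch. It does not reproduce WordSmith Tools. The `concordance` function, the regular-expression tokeniser and the list of word forms standing in for the lemmas FEEL, THINK and KNOW are all assumptions made for this example.

```python
import re

# Hypothetical word-form lists for the three lemmas; a real study would
# rely on a lemmatiser or the concordancer's own lemma lists.
PRIVATE_VERB_FORMS = [
    "feel", "feels", "feeling", "felt",
    "think", "thinks", "thinking", "thought",
    "know", "knows", "knowing", "knew", "known",
]


def concordance(text, node_words, window=4):
    """Return (left context, node, right context) for every hit in `text`."""
    tokens = re.findall(r"[A-Za-z']+", text)  # simple word tokeniser
    targets = {word.lower() for word in node_words}
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() in targets:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Invented sentence, not taken from the BLOG Corpus.
sample = "I know I should stay positive but I feel like my body has failed me"
for left, node, right in concordance(sample, PRIVATE_VERB_FORMS):
    print(f"{left:>30}  [{node}]  {right}")
```

Reading the aligned lines side by side is what lets the analyst see recurring patterns around each verb.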

Corpus stylistics and epilepsy: communicating experience
Jennifer Sanchez-Davies (University of Nottingham)

Epilepsy diagnosis can be a long and difficult process, as it can often depend solely on the patient’s articulation of their symptoms. Focal aware seizures or ‘auras’ are particularly known for their subjective nature and varied symptoms, which often make them difficult to describe. These symptoms can easily be misinterpreted, resulting in a prolonged diagnosis or even misdiagnosis. This calls for research to investigate whether it is possible for these types of seizures to be given a linguistic profile, with the long-term goal of contributing a referential resource for both practitioners and patients. The key to deciphering this linguistic profile lies in corpus linguistics.

The last decade has seen the rise of the cross-disciplinary health humanities field. In particular, the study of the portrayal of different health conditions in literature is currently a topical and insightful area. Using corpus stylistics, I develop a 'Character Tracking Model' which I use to conduct an innovative analysis of an extract from Gavin Extence’s prize-winning novel The Universe Versus Alex Woods wherein the protagonist experiences a focal aware seizure. In my analysis I highlight the subtle linguistic patterns and textual cues that communicate the character’s seizure experience on a wide scale. I use this as an example to demonstrate how experiential accounts can be used as a rich resource for exploring individual experiences of epileptic seizures, one that extends beyond the generic visual and behavioural symptoms relied upon by medical practitioners for diagnosis.

A corpus-based assessment of a diagnostic pain questionnaire
Elena Semino (Lancaster University), Andrew Hardie (Lancaster University) and Joanna Zakrzewska (Eastman Dental Hospital)

This paper presents the methods and findings of a corpus-based assessment of the McGill Pain Questionnaire (MPQ) – a widely used language-based questionnaire for the diagnosis of chronic pain (Melzack 1975). The MPQ includes 78 one-word descriptors – mostly present participles (e.g. ‘stabbing’) with a minority of non-participial adjectives – arranged into 20 groups. Each group is intended to capture a particular quality of the pain experience, and includes between two and seven descriptors, arranged in order of intensity of that experience (e.g. ‘sharp’, ‘cutting’, ‘lacerating’). While the use of the MPQ is well established, there are concerns about its reliability, partly due to the selection and arrangement of the linguistic descriptors it includes (e.g. Main 2016). It was hypothesised that the strength of the association between each MPQ descriptor and the concept of pain may influence patients’ selections when completing the questionnaire. The choices made by 800 patients completing the MPQ at the Eastman Dental Hospital in London were correlated with the strength of the collocation between each descriptor in each group and the word ‘pain’ in the Oxford English Corpus. It was found that, for nine out of the 20 groups in the MPQ, the choice of descriptor is explicable largely or entirely in terms of the strength of the collocational link from the word ‘pain’ to that descriptor. This correlation can be explained within a view of participants’ language systems as a network with activation-spreading: due to the context of the task and the questionnaire, at the time of making the choice of one descriptor from each group, the lexical node for ‘pain’ is already highly activated; thus, in turn, the nodes for its strongly connected collocates are also highly activated.
The selection of a strong collocate in this context may, then, be akin to a ‘path of least resistance’ for the participant, as it represents a transition across a heavily weighted link within the network; at the very least such choices cannot be assumed unproblematically to be determined by, or even necessarily to reflect, any non-linguistic qualia. From the perspective of the diagnostic use of the questionnaire, the findings of the analysis thus undermine the reliability of the intensity scales in those nine groups. It is therefore suggested that future versions of the questionnaire need to take into account evidence from large-scale corpus-based studies of pain descriptors in English.
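The abstract does not name the collocation statistic or the correlation measure that were used. As a hedged sketch of the general idea only, the example below scores each descriptor's link with ‘pain’ using pointwise mutual information and compares those scores with patient selection counts using Spearman's rank correlation. Both measures are assumptions chosen for illustration, and every number in the toy data is invented.

```python
import math


def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size):
    """Pointwise mutual information: log2(observed / expected co-occurrence)."""
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(pair_freq / expected)


def ranks(values):
    """Rank values from 1 (smallest) upward, averaging ranks across ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented numbers for one descriptor group; NOT data from the study.
# descriptor -> (co-occurrences with 'pain', descriptor frequency, selections)
toy_group = {
    "sharp": (900, 40_000, 310),
    "cutting": (60, 55_000, 45),
    "lacerating": (5, 300, 12),
}
PAIN_FREQ, CORPUS_SIZE = 120_000, 2_000_000_000  # hypothetical corpus figures

scores = [mutual_information(pair, PAIN_FREQ, freq, CORPUS_SIZE)
          for pair, freq, _ in toy_group.values()]
selections = [count for _, _, count in toy_group.values()]
print(spearman(scores, selections))
```

A strongly positive coefficient for a group would be consistent with descriptor choice following collocational strength. Groups contain only two to seven descriptors, so any such coefficient rests on very few data points.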

“The disease is the enemy, not you”: Identity construction on an online forum for people with dementia
Annika Bailey (University of Nottingham)

Dementia is often understood as an illness which causes a deterioration in an individual’s identity, to the point that the individual is sometimes described as just an ‘empty shell’ and a ‘living death’ (George, 2010). Viewing dementia in these terms negatively impacts the way caregivers and policy makers support people with dementia (Caddell and Clare, 2011), potentially restricting quality of life and reinforcing stigma surrounding the condition. This paper reports on a study of posts by people with dementia to an online forum, to examine the ways that participants construct their identity in relation to their illness. Identity is considered from a symbolic interactionist perspective, as something which one ‘does’ or ‘performs’ through language, rather than something one intrinsically ‘has’ (Butler, 1990). In this light, identity can be continually negotiated, adapting to an individual’s changing circumstances, and it is possible therefore to maintain a sense of identity despite the progression of dementia, and the negative and stigmatising attitudes that popularly surround dementia. Using a corpus-assisted approach to discourse analysis, the study examines six months of postings to the online forum, totalling 120,000 words. After a frequency and keyword survey of the data, the analysis takes a more qualitative approach through an exploration of collocates and concordances to examine the use of the signal terms associated with dementia. In particular, I focus on the discursive realisation of personal agency through pronouns and metaphor, considering how such linguistic resources are used to help determine the participants’ sense of identity. My analysis reveals how participants distance themselves from their condition in order to emphasise their preserved identity, while also acknowledging the dominance and control that their symptoms have over their sense of self. 
All told, this paper illustrates how a corpus-assisted approach to examining personal health discourse can help identify subtle patterns of communication, pointing, in this case, to the ways in which people with dementia resourcefully construct multiple identities for themselves, as they negotiate the daily challenges that living with the condition brings.

On ‘bad’ mothers and hormonal imbalances: Comparing discursive constructions of postnatal depression in lay, media and medical accounts
Sylvia Jaworska (Reading University) and Karen Kinloch (Edge Hill University)

Taking up the claim by Partington et al. (2013: 12) that ‘we are not deontologically justified in making statements about a relevance of a phenomenon observed in one discourse type unless […] we compare how the phenomenon behaves elsewhere’, our paper intends to demonstrate the benefits of using a comparative corpus-assisted discourse approach (CADS) in examining and comparing discourses around a mental health condition produced by lay people, medical authorities and the media. Such comparisons are important because they allow us to understand better the complex interplay between social and personal factors that constitute the lay experience of illness. When trying to give meaning to illness, people draw not just on consultations with medical professionals; they also engage with representations widely disseminated in society through traditional and increasingly digital media (Jones 2003).

Our study focuses on the constructions of postnatal depression (PND), which is a highly stigmatised condition and the leading cause of maternal death in the UK (Oates 2003, NHS 2011). Using a comparative corpus-assisted discourse approach, we examine constructions of PND in four discursive domains: 1) lay narratives sourced from Mumsnet, 2) documents about PND disseminated by clinicians for clinicians, 3) information by clinicians for lay people and 4) articles about PND from British newspapers. We begin our analysis by retrieving keywords from each domain. A selection of shared key keywords, including ‘depression’ and ‘mother’, is then examined in depth. Our results show the differences, similarities and absences in the ways in which PND is ‘talked about’ across the different contexts, highlighting the specificity and subtleties of lay accounts. At the methodological level, our study highlights the relevance of using a comparative corpus-assisted discourse approach to foster our understanding of the role that social and medical discursive resources play in constituting the lay experience of health and illness.

Workshop summary

Since the late 1990s, an increasing number of health communication scholars have harnessed the opportunities afforded by corpus linguistic techniques to illuminate the linguistic character of health-related communication in a variety of contexts (Adolphs et al., 2004). In this time, corpus methods have granted health researchers the unique opportunity to examine the linguistic features of vast numbers of healthcare encounters, allowing them to establish a much more reliable picture of the common ways in which language is used in various clinical contexts (Harvey, 2013; Hunt and Harvey, 2015; Brookes and Harvey, 2016). Meanwhile, corpus linguistics’ commitment to analysing authentic language data has also led to such approaches being recognised increasingly as a means for supporting evidence-based communication (Brown, Crawford and Carter, 2006).

The purpose of this half-day workshop is to showcase the ways that corpus approaches have been applied to the study of health discourse to date, while at the same time advancing a still relatively undersubscribed area of corpus-led enquiry. It will comprise a series of talks that demonstrate the powerful potential of corpus approaches to the study of health-related discourse originating from a range of text types within and beyond the clinic.

The session will begin with an introductory paper by the workshop convenors that charts the development of the field of corpus-based health research to date and sets the scene for the talks to come. This will be followed by a series of 20-minute papers that push the boundaries of corpus approaches to health communication, whether through the methods employed, interdisciplinary and collaborative research or the exploration of underrepresented data. Should we receive a sufficient number of submissions, we will also invite some speakers (particularly postgraduate students) to bring research posters to display in the room in which the workshop will take place.


Adolphs, S., Brown, B., Carter, R., Crawford, P. and Sahota, O. (2004). Applying corpus linguistics in a health care context. Journal of Applied Linguistics, 1(1), 9-28.

Brookes, G. and Harvey, K. (2016). ‘Examining the discourse of mental illness in a corpus of online advice-seeking messages’. In L. Pickering, E. Friginal and S. Staples (eds.), Talking at work: corpus-based explorations of workplace discourse. New York: Palgrave.

Brown, B., Crawford, P. and Carter, R. (2006). Evidence Based Health Communication. Maidenhead: Open University Press.

Hunt, D. and Harvey, K. (2015). ‘Health Communication and Corpus Linguistics: Using Corpus Tools to Analyse Eating Disorder Discourse Online’. In P. Baker and T. McEnery (eds.), Corpora and Discourse: Integrating Discourse and Corpora. Basingstoke: Palgrave, pp. 134-154.
