Language full of character: computational stylistics, authorship and genre

University House 103
Tuesday 30 October 2018 (16:15-17:30)
  • English Language research seminar
  • Speaker: Mel Evans


As computational stylistic methods become established techniques for investigating questions of authorship, genre and chronology in literary and non-literary texts, there is an increasing recognition that the diagnostic power of common lexical criteria, such as most frequent words, n-grams, and zeta (mid-frequency) words, is not matched by an equivalent understanding of why such measures prove so effective. Within forensic linguistics, corpus-based methods using transparent measures, such as n-grams, have been used because they are easier to communicate to a jury. However, despite the richness of the associated stylistic disciplines, computational stylistic data is not necessarily supported with similar insights into why such features might be characteristic of authorship, genre, or both.

In this talk, I discuss recent findings from a computational stylistic investigation into plays attributed to Aphra Behn, a 17th-century author, as part of the AHRC-funded project 'Editing Aphra Behn in the Digital Age'. The presentation outlines the challenges of developing a method and interpretation that are both linguistically robust and translatable to non-linguistic scholars. I present two case studies of plays attributed to Behn that highlight the particular difficulties of attempting to differentiate idiolectal style from genre. I suggest that quantitative readings of authorial and genre signals should incorporate functional range as well as formal frequency, and that drawing on other, complementary linguistic frameworks, such as pragmatics, for analysis and interpretation may be a valuable addition to the computational stylistic approach.