Dr Catherine Smith

Department of Theology and Religion
Research Fellow and Technical Officer
Institute for Textual Scholarship and Electronic Editing

Contact details

Dept. of Theology and Religion
University of Birmingham
B15 2TT

I am a Research Fellow and Technical Officer in the Institute for Textual Scholarship and Electronic Editing (ITSEE). My research interests focus on the use of electronic tools to analyse texts: I have written software for this purpose and applied it to texts as diverse as the New Testament, the writings of Charles Dickens and email communications. At ITSEE, I have helped to develop the Virtual Shakespeare Archive and contributed to the Workspace for Collaborative Editing, including the design of a user interface for CollateX currently in use by the International Greek New Testament Project and the Institute for New Testament Textual Research in Münster. My present responsibilities on the ERC-funded COMPAUL project involve the design of databases and analytical interfaces to gather and compare different forms of biblical text in commentaries and other early Christian writings. I have a particular specialism in XML and related technologies and do most of my development in Python and JavaScript.

Qualifications


  • MSc in Computer Science (Birmingham)
  • PhD in Biblical Studies (Birmingham)
  • MA in Biblical and Theological Studies (Roehampton)
  • BA in Biblical and Theological Studies with Psychology (Roehampton)

Teaching


Dr Catherine Smith has worked on a variety of projects applying digital methods to textual analysis. Her doctoral work, “Casting out Demons and Sowing Seeds: A Fresh Approach to the Synoptic Data from the Perspective of Systemic-Functional Linguistics”, built on her earlier discourse analysis studies of Revelation and the Didache. She has been a research partner in the OpenText.org corpus of Hellenistic Greek since its initiation in 2001. Her postdoctoral positions have included work on metaphor detection at the University of Birmingham; software development for the Incunabula Short Title Catalogue, the Archives Hub and Cheshire3 for Corpus Linguistics at the University of Liverpool; and health communication research and corpus stylistic analysis of Dickens at the University of Nottingham. She has presented her work at national and international conferences, including regular appearances at the Society of Biblical Literature Annual Meeting in the USA and Corpus Linguistics in the UK. She has served as webmaster of the British New Testament Society and is on the editorial board of Biblical and Ancient Greek Linguistics.


  • The Electronic Book (MA module)
  • Python for Corpus Linguistics

Postgraduate supervision

  • Digital Humanities
  • Computer tools and the New Testament

Doctoral research

PhD title
Casting out Demons and Sowing Seeds: A Fresh Approach to the Synoptic Data from the Perspective of Systemic-Functional Linguistics

Research interests


  • XML encoding, transformation and analysis
  • Computer-assisted manuscript collation
  • Interfaces for editing complex textual traditions
  • Corpus linguistics

Other activities

Catherine Smith has contributed to the following projects:

Publications


  • Houghton, H.A.G., and C.J. Smith, “Digital Editing and the Greek New Testament” in Claire Clivaz, Paul Dilley and David Hamidović (eds.), The Ancient Worlds in a Digital Culture. Washington DC: Center for Hellenic Studies, 2015.
  • Houghton, H.A.G., M. Sievers and C.J. Smith, “The Workspace for Collaborative Editing.” Digital Humanities 2014 Conference Abstracts, EPFL–UNIL, Lausanne, Switzerland, 8–12 July 2014, 210–11. (Online at http://dharchive.org/paper/DH2014/Paper-224.xml).
  • Smith, C.J., S. Adolphs, K. Harvey and L. Mullany, ‘Spelling errors and keywords in born-digital data: a case study using the Teenage Health Freak Corpus.’ In Corpora 9.2 (2014) 137–154. DOI: 10.3366/cor.2014.0055
  • Mahlberg, M., C.J. Smith, and S. Preston, "Phrases in literary contexts. Patterns and distributions of suspensions in Dickens's novels." International Journal of Corpus Linguistics 18:1 (2013) 35–56. DOI: 10.1075/ijcl.18.1.05mah
  • Mahlberg, M., and C.J. Smith, “Dickens, the suspended quotations and the corpus.” Language and Literature 21.1 (2012) 51–65. DOI: 10.1177/0963947011432058
  • Mahlberg, M., and C.J. Smith, 'Corpus Approaches to Prose Fiction: Civility and Body Language in Pride and Prejudice' in D. McIntyre and B. Busse (eds.), Language and Style, 2010. (Basingstoke: Palgrave Macmillan).
  • O’Donnell, M.B., and C.J. Smith, ‘A Discourse Analysis of 3 John’ in S.E. Porter and M.B. O’Donnell (eds.). The Linguist as Pedagogue: Trends in the Teaching and Linguistic Analysis of the Greek New Testament, 2009. (Sheffield: Sheffield Phoenix Press).
  • Smith, C.J., T. Rumbell, J.A. Barnden, M.G. Lee, S.R. Glasbey, & A.M. Wallington, ‘Affect and metaphor in an ICA: Further developments.’ In Procs. 7th International Conference on Intelligent Virtual Agents. Paris, 17-19 September 2007, 405-6. DOI: 10.1007/978-3-540-74997-4_61
  • Rumbell, T., C.J. Smith, J.A. Barnden, M.G. Lee, S.R. Glasbey, & A.M. Wallington, ‘Metaphor and affect detection in an ICA.’ In A. Paiva, R. Prada and R.W. Picard (Eds), Affective Computing and Intelligent Interaction: Second International Conference, ACII 2007, Lecture Notes in Computer Science, Vol. 4738. Springer. pp.747-748. DOI: 10.1007/978-3-540-74889-2_80
  • Smith, C.J. & M.B. O'Donnell. 'Interactive Corpus Annotation of Anaphor Using NLP Algorithms'. Proceedings from Corpus Linguistics Conference 2007. http://www.corpus.bham.ac.uk/conference/proceedings.shtml
  • Smith, C.J., T. Rumbell, J.A. Barnden, R.J. Hendley, M.G. Lee, A.M. Wallington, & L. Zhang. ‘Don't worry about metaphor: affect detection for conversational agents.’ Demo paper in Procs. Forty-Fifth Annual Meeting of the Association for Computational Linguistics, Companion Volume: Book II (Interactive Poster and Demonstration Sessions), June 2007, pp.33-36.


Catherine Smith is also currently working on a monograph based on her doctoral dissertation, “Casting out Demons and Sowing Seeds: A Fresh Approach to the Synoptic Data from the Perspective of Systemic-Functional Linguistics”.