Kat Gupta’s research blog

    caution: may contain corpus linguistics, feminism, activism, LGB, queer and trans stuff, parrots, London


Very broadly, corpus linguistics involves collecting a large number of texts (known as a corpus), then using computer programs to look for patterns in them. These patterns might show that certain words prefer or avoid other words (collocation) or particular grammatical functions or categories (colligation), are associated with a particular semantic field (semantic preference), or have positive or negative connotations (semantic prosody).
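To make the idea of collocation a little more concrete, here is a minimal sketch in Python that counts which words turn up within a couple of tokens of a chosen word. Everything in it is an illustrative assumption: the `collocates` function, the two-word window, the crude tokeniser and the three-sentence toy corpus are all invented for this example. It only counts raw co-occurrences; corpus tools typically go further and rank collocates with a statistical association measure such as mutual information or log-likelihood.

```python
import re
from collections import Counter


def collocates(texts, node, window=2):
    """Count words found within `window` tokens either side of `node`.

    Hypothetical helper for illustration only: raw co-occurrence counts,
    with no statistical test of whether the association is meaningful.
    """
    counts = Counter()
    for text in texts:
        # Crude tokeniser (an assumption): lowercase runs of letters/apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token != node:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:  # skip the node word itself
                    counts[tokens[j]] += 1
    return counts


# Toy corpus, made up for this example; a real corpus would be far larger.
toy_corpus = [
    "Strong tea is served with milk.",
    "She drinks strong tea every morning.",
    "A powerful engine needs strong support.",
]

strong_collocates = collocates(toy_corpus, "strong")
print(strong_collocates.most_common(5))
```

In this toy corpus 'tea' appears next to 'strong' twice, while 'milk' falls outside the window and is not counted. With only three sentences that tells us nothing reliable, which is exactly why corpus linguists work with large collections of text.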

Corpus linguistics has implications for many fields of linguistics, including but not limited to grammar, language teaching, applied linguistics, translation, stylistics, historical linguistics and forensic linguistics. It is increasingly used to investigate social discourses; for example, newspaper discourses associated with human migration (Baker and McEnery 2005, Baker 2006, Gabrielatos and Baker 2008), Catholic discourses on ‘work’ and ‘property’ (Teubert 2007) and discourses of moral panic surrounding bad language (McEnery 2005).


  1. Hi. I was admitted to Nottingham’s distance MA in Applied Linguistics and ELT, and the program is set to start this September. For my MA thesis, I’d like to do a corpus-driven analysis of the ICE-Philippines corpora, and I’m keen on looking at how idioms are used in Philippine English, among other things. Do you know of anyone at Nottingham who has done research on idiomaticity? I was also admitted at Birmingham, and at this point my choice between Nottingham and Birmingham will depend heavily on the availability of experts in this topic. I’m also doing FutureLearn’s Corpus Linguistics MOOC, by the way. Thanks.
