Hi and welcome to my academic home page. I am a Security Lancaster Research Fellow in the School of Computing and Communications at Lancaster University. I have been at Lancaster since 2002, completing my undergraduate degree in 2006 and my Ph.D. in 2011. I was a Research Assistant at Lancaster from 2009, working on several projects, including Isis and Spatial Humanities. I started my research fellowship in November 2012, funded by the Faculty of Science and Technology, performing cyber security and NLP research in the Security Lancaster centre. I currently co-coordinate the UCREL Corpus Research Seminar and am a member of the CREME (Corpus Research in Early Modern English) interdisciplinary research group.
My primary research area is Natural Language Processing (NLP), with a particular focus on the difficulties arising from noisy, unstructured text. During my current research fellowship I will be applying NLP techniques in a cyber security setting, bringing cutting-edge NLP research into real-world security applications. I will focus primarily on developing solutions for problems associated with online communities (bulletin boards, forums), social networks (Facebook, Twitter) and instant messaging services (Skype, WhatsApp). This will involve, for example, developing deception and multiple personae detection techniques to assist in countering the use of fake profiles, e.g. adults masquerading as children for the purposes of grooming. The characteristics of online texts, e.g. the abundance of irregular language and their multilingual nature, pose significant barriers to many NLP methods, which mainly rely on standard written texts. A primary aim of my research is building robust NLP tools that are able to cope with, and take advantage of, these features.
The focus of my Ph.D. research (under the supervision of Paul Rayson) was the problems and characteristics of spelling variation in historical corpora (particularly Early Modern English). The research resulted in the production of an interactive piece of software named VARD 2. This tool uses methods from modern spell checkers, such as edit distance algorithms, phonetic matching techniques and letter replacement heuristics, to find candidate replacements for spelling variants found within historical texts. Another tool, DICER, was also developed to assist in the exploration of the characteristics and trends of spelling variation.
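To give a flavour of how such candidate finding can work, here is a minimal Python sketch. It is not VARD 2's actual implementation: the function names, the tiny rule list and the toy lexicon are all assumptions made for illustration, and phonetic matching is left out for brevity. It combines two of the ideas mentioned above, letter replacement rules and edit distance, to rank modern word forms for a historical variant.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical letter replacement rules, loosely inspired by common
# Early Modern English patterns (e.g. "u" printed where "v" is now used).
RULES = [("vv", "w"), ("u", "v"), ("v", "u"), ("ie", "y"), ("y", "i")]


def apply_rules(word):
    """Return every form produced by applying one rule at one position."""
    forms = set()
    for old, new in RULES:
        start = word.find(old)
        while start != -1:
            forms.add(word[:start] + new + word[start + len(old):])
            start = word.find(old, start + 1)
    return forms


def candidates(variant, lexicon, max_distance=2):
    """Rank lexicon words as possible modern equivalents of a variant.

    Words reached through a replacement rule score 0; all other words
    score their edit distance, and are kept if within max_distance.
    """
    variant = variant.lower()
    scored = {}
    for form in apply_rules(variant):
        if form in lexicon:
            scored[form] = 0
    for word in lexicon:
        distance = edit_distance(variant, word)
        if distance <= max_distance:
            scored.setdefault(word, distance)
    return sorted(scored, key=lambda w: (scored[w], w))


# Toy lexicon, assumed purely for the example.
lexicon = {"love", "live", "glove", "have", "very", "up"}
print(candidates("loue", lexicon))
```

With this toy lexicon, the variant "loue" is matched to "love" first because a replacement rule produces it directly, while "glove" and "live" follow as weaker edit-distance matches. A real tool would need a far larger lexicon and rule set, plus some way of weighting the different methods against each other.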
Lancaster Research Portal
My publications already include several papers and presentations as well as a co-authored book.
If you'd like to contact me about any matter, my details are in the contact section.