Corpus software and related tools
The main corpus tools and related resources developed by researchers at Lancaster are:
BNCweb is a web-based client program for searching and retrieving
lexical, grammatical and textual data from the British National Corpus
(BNC). It relies on the Corpus Query Processor (CQP) of the IMS Open
Corpus Workbench to provide a convenient interface between the user and
the rich variety of annotated text in the 100-million-word BNC in its
most recent incarnation, the XML version.
A web front end to David Lee's BNC Index spreadsheet.
For an introduction to BNC Index, please see
David Lee's web site.
A calculator that computes log-likelihood (LL) values from a 2x2 contingency table.
LL is a more reliable alternative to the standard Pearson's chi-squared test, particularly when expected frequencies are low, as is common with word counts (see Dunning 1993).
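
As an illustration of the statistic itself, here is a minimal sketch of the general log-likelihood (G2) calculation for a 2x2 contingency table. It is not taken from the Lancaster calculator, whose exact formula and input layout may differ; the function name log_likelihood_2x2 and the cell layout (a and b are the counts of the word and of all other words in corpus 1, c and d the same for corpus 2) are assumptions made for this example.

```python
import math


def log_likelihood_2x2(a, b, c, d):
    """Return the G2 statistic for the 2x2 table [[a, b], [c, d]].

    Assumed (hypothetical) layout for a corpus comparison:
        a = occurrences of the word in corpus 1
        b = all other tokens in corpus 1
        c = occurrences of the word in corpus 2
        d = all other tokens in corpus 2
    """
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    if n <= 0:
        raise ValueError("the table must contain at least one observation")

    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


if __name__ == "__main__":
    # A word seen 50 times in 1,000 tokens versus 10 times in 1,000 tokens.
    print(log_likelihood_2x2(50, 950, 10, 990))
```

With one degree of freedom, a value above 3.84 corresponds to significance at p < 0.05, the same critical value used for the chi-squared test on a 2x2 table.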
A corpus comparison and annotation tool incorporating CLAWS and USAS
in a web front end.
Part-of-speech tagging software for English.
Semantic tagger developed for English and extended to Finnish and Russian.