Corpus software and related tools


The main corpus tools and related resources developed by researchers at Lancaster are:

BNCweb

BNCweb is a web-based client program for searching and retrieving lexical, grammatical and textual data from the British National Corpus (BNC). It relies on the Corpus Query Processor (CQP) of the IMS Open Corpus Workbench to provide a convenient interface between the user and the richly annotated text of the 100-million-word BNC in its most recent incarnation, the XML version.

BNC Web Index

This is the web front end to David Lee's BNC Index spreadsheet. For an introduction to BNC Index, please see David's web site.

LL Calculator

This calculates log-likelihood (LL) values from a 2x2 contingency table. LL is a more reliable alternative to the standard Pearson's chi-squared test, particularly when expected frequencies are low; see Dunning (1993).
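
As an illustration of the kind of calculation described above, the following is a minimal Python sketch of the log-likelihood (G2) statistic for a 2x2 contingency table. It is not the calculator's own code. The function name log_likelihood and its parameters are hypothetical, and the sketch assumes the table is built from the frequency of one word in two corpora together with the two corpus sizes, summing over all four cells. The calculator itself may use a different or simplified formula.

```python
import math


def log_likelihood(freq1, size1, freq2, size2):
    """Return G2 for a word seen freq1 times in a corpus of size1 tokens
    and freq2 times in a corpus of size2 tokens (hypothetical helper).

    Assumes size1 + size2 > 0 and that each frequency does not exceed
    its corpus size.
    """
    # Observed 2x2 table: rows are corpora, columns are
    # (occurrences of the word, occurrences of all other words).
    observed = [
        [freq1, size1 - freq1],
        [freq2, size2 - freq2],
    ]
    row_totals = [size1, size2]
    col_totals = [freq1 + freq2, (size1 - freq1) + (size2 - freq2)]
    total = size1 + size2

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            # Expected count under the hypothesis that the word has the
            # same relative frequency in both corpora.
            e = row_totals[i] * col_totals[j] / total
            # A cell with an observed count of zero contributes nothing.
            if o > 0:
                g2 += o * math.log(o / e)
    return 2.0 * g2


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(log_likelihood(100, 1000, 50, 1000))
```

With one degree of freedom, a value above 3.84 corresponds to significance at the p < 0.05 level. A word with the same relative frequency in both corpora gives a value of zero.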

Wmatrix

A corpus comparison and annotation tool incorporating CLAWS and USAS in a web front end.

CLAWS

Part of speech tagging software for English.

Semantic tagger

Semantic tagger developed for English and extended to Finnish and Russian.
