The Lancaster Stemming Algorithm
Evaluation Techniques

Direct Assessment

The simplest way to assess the performance of a stemmer is to examine its behaviour when applied to samples of words, especially words that have already been arranged into 'conflation groups'. Specific errors can then be identified (e.g., failing to merge "maintained" with "maintenance", or wrongly merging "experiment" with "experience") and the rules adjusted accordingly. This approach is of very limited utility on its own, but it can complement other methods, such as the error-counting approach outlined later.
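As a rough illustration of direct assessment, the Python sketch below runs a stemmer over hand-built conflation groups and lists the two kinds of error described above. The `toy_stem` function, its suffix list, and the sample groups are hypothetical stand-ins chosen for illustration; they are not the Lancaster rules.

```python
from collections import defaultdict
from itertools import combinations


def toy_stem(word):
    """Hypothetical suffix stripper standing in for a real stemmer."""
    for suffix in ("ance", "ence", "ment", "ed", "s"):
        # Strip the first matching suffix, keeping at least three letters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def direct_assessment(stem, groups):
    """Return (missed_merges, wrong_merges) for a stemmer over conflation groups."""
    # Missed merges: two words from the same group receive different stems.
    missed = []
    for group in groups:
        for a, b in combinations(group, 2):
            if stem(a) != stem(b):
                missed.append((a, b))

    # Wrong merges: two words from different groups receive the same stem.
    by_stem = defaultdict(list)
    for index, group in enumerate(groups):
        for word in group:
            by_stem[stem(word)].append((index, word))
    wrong = []
    for members in by_stem.values():
        for (i, a), (j, b) in combinations(members, 2):
            if i != j:
                wrong.append((a, b))
    return missed, wrong


# Sample conflation groups, invented for this example.
groups = [
    ["maintained", "maintenance"],
    ["experiment"],
    ["experience"],
]

missed, wrong = direct_assessment(toy_stem, groups)
print("missed merges:", missed)
print("wrong merges:", wrong)
```

With these groups the toy stemmer fails to merge "maintained" with "maintenance" and wrongly merges "experiment" with "experience", which is the kind of finding that would prompt a rule adjustment.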

Information Retrieval

The most obvious way to compare the usefulness of stemmers for information retrieval (IR) is by their impact on IR performance, using a test system and a 'test collection' of documents, queries and relevance judgments. Different stemmers are substituted in turn to see which gives the best results on performance metrics such as precision and recall. However, this technique is problematic as a basis for choosing a stemmer for an IR system, because the results are frequently inconclusive: the 'best' stemmer may be different for different databases and different searches.
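The comparison described above can be sketched in Python as follows. Everything here is an assumption made for illustration: the three stemmers are hypothetical toys, the matching rule (a document is retrieved if it shares any stemmed term with the query) is deliberately crude, and the three-document collection with its relevance judgments is invented, so the figures say nothing about real stemmers.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one query, given sets of document ids."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def retrieve(stem, query, documents):
    """Return ids of documents sharing at least one stemmed term with the query."""
    query_stems = {stem(term) for term in query.lower().split()}
    return {
        doc_id
        for doc_id, text in documents.items()
        if query_stems & {stem(term) for term in text.lower().split()}
    }


def no_stem(word):
    """Baseline: no stemming at all."""
    return word


def strip_s(word):
    """Hypothetical light stemmer: removes a final 's' from longer words."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def truncate4(word):
    """Hypothetical heavy stemmer: keeps only the first four letters."""
    return word[:4]


# Toy test collection: documents, one query, and its relevance judgments.
documents = {
    1: "running experiments on stemmers",
    2: "work experience required",
    3: "an experiment with retrieval",
}
query = "experiment"
relevant = {1, 3}

results = {}
for name, stem in [("none", no_stem), ("strip_s", strip_s), ("truncate4", truncate4)]:
    results[name] = precision_recall(retrieve(stem, query, documents), relevant)
    print(name, results[name])
```

On this toy data the unstemmed run misses a relevant document (lower recall), while the heavy stemmer also retrieves the non-relevant "experience" document (lower precision). A different collection or query could easily reverse the ranking, which is the inconclusiveness noted above.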

 

 

 

 

 
