![]() |
The Lancaster Stemming Algorithm |
| Error Counting: | |||
| Home | It is possible to evaluate stemming by counting the numbers of two kinds of errors that occur during stemming, namely;
Using a sample file of grouped words, these errors are then counted. This evaluation method was described by Chris Paice in a paper entitled 'Method for Evaluation of Stemming Algorithms Based on Error Counting' JASIS 47(8), August 1996, 632-649. This method returns a value for an Under-Stemming (or Conflation) index; UI = Under-Stemming Index UI= 1 - CI Also a value for an Over-Stemming (or Distinctness) index; OI = Over-Stemming index OI= 1 - DI An Example The following examples use the so-called Truncate(n) stemmer, which simply retains the first n letters of the word, where n is a suitable integer, such as 4, 5 or 6. If the word has less than n letters to start with, it is returned unchanged. Consider the seven words shown in the first column of the table below, which fall into two distinct conceptual groups. We will look at the effect of applying Truncate(5) and Truncate(4) to these words. Truncate(N)
1 = A pair of groupable words with identical stems. This represents a successful stemming operation. In this case, the proportion of word pairs successfully merged is 10 out of 22, hence UI= 1 - (10/22) = 0.545.
1 = A pair of non-groupable words with non-identical stems. This represents a successful stemming operation. In this case, the proportion of word pairs which were correctly not merged to the same stem is 20 out of 20, hence OI = 1 - (20/20), OI = 0.
In this case, the proportion of word pairs correctly merged is 22 out of 22, hence UI= 1 - (22/22) = 0.
Here, the proportion of word pairs which were correctly not conflated is 0 out of 20, so that OI = 1 - (20/20) = 1. The purpose of this error-counting approach is that, although it is advantageous to have the index of terms compressed, this is only useful up to a point. This is because, as conflation becomes 'heavier', the merging of distinct concepts becomes increasingly frequent. At this point, small increases in Recall are gained at the expense of a major loss of Precision. One question mark over this approach concerns the validity of the grouped file against which the errors are assessed. In Paice (1996), these grouped files were constructed by human judgment, during scrutiny of sample word lists. |
||
| Introduction | |||
| Stemming Algorithms | |||
| Algorithm Implemenatations | |||
| Evaluation Techniques | |||
| Error Counting | |||
| Stemmer Strength | |||
| Inter-Stemmer Similarity | |||
| Evaluation Program | |||
| Resources | |||
| Bibliography | |||