Project report:
Rayson, P., Garside, R., and Sawyer, P. (1999).
Language engineering for the recovery of requirements from
legacy documents. REVERE project report, Lancaster University, May 1999.
Abstract:
Legacy documents, such as requirements documents or manuals of business
procedures, can sometimes offer an important resource for determining
which features of legacy software are redundant, need to be retained or
can be reused. This situation is particularly acute where business
change has resulted in the dissipation of human knowledge through staff
turnover or redeployment. Exploiting legacy documents poses formidable
problems, however, since they are often incomplete, poorly structured,
poorly maintained and voluminous. This report proposes that language
engineering, using tools that exploit probabilistic natural language
processing (NLP) techniques, offers the potential to ease these problems.
Such tools are available, mature and have been proven in other domains.
The document provides a review of NLP and a discussion of the
components of probabilistic NLP techniques and their potential for
requirements recovery from legacy documents. The report concludes with
a summary of the preliminary results of the adaptation and application
of these techniques in the REVERE project.
REFSQ'99 publication:
Rayson, P., Garside, R., and Sawyer, P. (1999).
Recovering Legacy Requirements.
In Proceedings of REFSQ'99.
Fifth International Workshop on Requirements Engineering:
Foundations of Software Quality, June 14-15 1999, Heidelberg, Germany.
Published by University of Namur, pp. 49-54.
Abstract:
It is common for organisations to introduce substantial changes to
their structure and operations in order to adapt to new business
environments. This often confers legacy status on their software
systems because they cannot adequately support the new business
processes. In this paper, we argue that it is necessary to recover the
requirements of in-service legacy software to ensure that its evolution
or replacement is properly informed by an understanding of what is
redundant, what must be retained and what can be reused. Much of this
information is often contained in documents. However, retrieval of the
information is often difficult due to problems of completeness, quality
and sheer volume. In the REVERE project we are integrating a number of
techniques to provide a set of tools to help requirements engineers
explore the documentation and reconstruct conceptual models of the
software and business processes. At the core of this work is the
exploitation of probabilistic NLP tools to provide a 'quick way in' to
large, complex and imperfectly structured documents, saving much
painstaking and error-prone manual effort.