REVERE - REVerse Engineering of REquirements to support business process change


Project description

The proposed work addresses the problem of understanding the requirements for legacy IT evolution in organisations where the pace of business process change outstrips that of their supporting IT. Dynamic change has become commonplace for many organisations. This sometimes takes the form of profound changes to business processes and organisational structures. Unfortunately, when planning process change, the cost of redesigning support systems, operating procedures and documentation to reflect the new business processes is frequently underestimated. To a large extent, this is an organisational memory issue, since much of the cost arises from the difficulty of recovering the motivating requirements for the existing processes and supporting systems. Without this information, the impact of proposed changes cannot be properly assessed. It is costly to elicit the information anew and, in any case, the key people who possess the knowledge may be unavailable. However, the information is often implicit in documents such as requirements specifications, operating manuals and data models. These typically provide definitions of functions, operating procedures and information structures but lack information about how the organisational structures and business processes motivated them. This information must be recovered to properly inform the process change and system redesign.

Principal objectives

The aim of the research proposed here is to improve the requirements analysis for legacy system evolution where the underlying business process has already changed. As far as we are aware, this is a unique focus which, despite addressing a real-world problem, has not been systematically addressed before. Our approach is to investigate the reverse engineering of requirements documents through the novel integration of techniques for the textual analysis of documentation, the modelling of business processes, and the modelling of the organisational structures serving those processes.

Lancaster Personnel:

In addition, Adelard will provide technical advice, documentary data, evaluation of the integrated method and piloting of the toolset resulting from the project.

Funding:

REVERE is funded within the SEBPC managed research programme of the Engineering and Physical Sciences Research Council (EPSRC). The project started in May 1998.

Project results and publications

  1. Rayson, P., Garside, R., and Sawyer, P. (1999). Recovering Legacy Requirements. In Proceedings of REFSQ'99. Fifth International Workshop on Requirements Engineering: Foundations of Software Quality, June 14-15 1999, Heidelberg, Germany. Published by University of Namur, pp. 49-54.
  2. Rayson, P., Garside, R., and Sawyer, P. (1999). Language engineering for the recovery of requirements from legacy documents. REVERE project report, Lancaster University, May 1999.
  3. Frequency profiles
  4. Rayson, P., Garside, R., and Sawyer, P. (2000). Assisting requirements engineering with semantic document analysis. In Proceedings of RIAO 2000 (Recherche d'Informations Assistée par Ordinateur, Computer-Assisted Information Retrieval) International Conference, Collège de France, Paris, France, April 12-14, 2000. C.I.D., Paris, pp. 1363 - 1371.
  5. Rayson, P., Garside, R., and Sawyer, P. (2000). Assisting Requirements Recovery from Legacy Documents. In Henderson, P. (ed.) Systems Engineering for Business Process Change: collected papers from the EPSRC research programme. Springer-Verlag, London, pp. 251 - 263.
  6. Rayson, P., Emmet, L., Garside, R., and Sawyer, P. (2000). The REVERE Project: Experiments with the application of probabilistic NLP to Systems Engineering. In proceedings of 5th International Conference on Applications of Natural Language to Information Systems (NLDB'2000). Versailles, France, June 28-30th, 2000.
  7. Rayson, P. and Garside, R. (2000). Comparing corpora using frequency profiling. In proceedings of the workshop on Comparing Corpora, held in conjunction with the 38th annual meeting of the Association for Computational Linguistics (ACL 2000). 1-8 October 2000, Hong Kong, pp. 1 - 6.
  8. Rayson, P., Emmet, L., Garside, R., and Sawyer, P. (2001). The REVERE Project: Experiments with the application of probabilistic NLP to Systems Engineering. In Bouzeghoub, M., Kedad, Z., and Métais, E. (eds.) Natural Language Processing and Information Systems. 5th International Conference on Applications of Natural Language to Information Systems (NLDB'2000). Versailles, France, June 2000. Revised papers. LNCS 1959. Springer-Verlag, Berlin Heidelberg, pp. 288 - 300. ISBN 3-540-41943-8.
  9. Sawyer, P., Rayson, P., and Garside, R. (2002) REVERE: support for requirements synthesis from documents. Information Systems Frontiers Journal. Volume 4, issue 3, Kluwer, Netherlands, pp. 343 - 353.
  10. Wmatrix software

ACL2000 publication:

Rayson, P. and Garside, R. (2000). Comparing corpora using frequency profiling. In proceedings of the workshop on Comparing Corpora, held in conjunction with the 38th annual meeting of the Association for Computational Linguistics (ACL 2000). 1-8 October 2000, Hong Kong, pp. 1 - 6.

The pdf version is available for download here:


NLDB2000 publication:

Rayson, P., Emmet, L., Garside, R., and Sawyer, P. (2000). The REVERE Project: Experiments with the application of probabilistic NLP to Systems Engineering. In proceedings of 5th International Conference on Applications of Natural Language to Information Systems (NLDB'2000). Versailles, France, June 28-30th, 2000.

Abstract:

Despite natural language's well-documented shortcomings as a medium for precise technical description, its use in software-intensive systems engineering remains inescapable. This poses many problems for engineers who must derive problem understanding and synthesise precise solution descriptions from free text. This is true both for the largely unstructured textual descriptions from which system requirements are derived, and for more formal documents, such as standards, which impose requirements on system development processes. This paper describes experiments that we have carried out in the REVERE project to investigate the use of probabilistic natural language processing techniques to provide systems engineering support.

The pdf version is available for download here:


SEBPC publication:

Rayson, P., Garside, R., and Sawyer, P. (2000). Assisting Requirements Recovery from Legacy Documents. In Henderson, P. (ed.) Systems Engineering for Business Process Change: collected papers from the EPSRC research programme. Springer-Verlag, London, pp. 251 - 263.

Abstract:

Business change is often accompanied by loss of continuity of experience. This has serious implications for the adaptation of an organisation's software since people with detailed knowledge of either the software or business processes may be unavailable to inform its adaptation. In many cases organisational memory will persist principally in the form of documents such as requirements specifications, operating procedures, regulatory standards, etc. These offer an important resource for informing what features of the software are redundant, need to be retained or can be reused. Exploiting this resource poses formidable problems, however, since it is often incomplete, poorly structured, poorly maintained and voluminous. This paper proposes that tools exploiting probabilistic natural language processing techniques offer the potential to ease these problems. Such tools are available, mature and have been proven in other domains.

The pdf version is available for download here:


RIAO 2000 publication:

Rayson, P., Garside, R., and Sawyer, P. (2000). Assisting requirements engineering with semantic document analysis. In Proceedings of RIAO 2000 (Recherche d'Informations Assistée par Ordinateur, Computer-Assisted Information Retrieval) International Conference, Collège de France, Paris, France, April 12-14, 2000. C.I.D., Paris, pp. 1363 - 1371.

Abstract:

Requirements engineering is the first stage in the software life-cycle and is concerned with discovering and managing a software system's services, constraints and goals. Requirements engineers frequently face the task of extracting domain knowledge and recovering requirements from large documents. This is needed to complement the often incomplete information elicited from the people who will use or otherwise have a stake in the system to be developed. The documents that have to be analysed may vary from structured documents, such as specifications of work processes, to unstructured, verbatim reports of interviews or workplace observations. This paper shows that tools exploiting natural language processing techniques, in particular semantic analysis, are able to assist in retrieval from these documents.

The draft pdf version is available for download here:


Frequency profiles:

To perform a statistical profiling analysis on the documents, we need a baseline corpus against which to compare them. The first part of the technical report normative corpus has been built. We selected 135 files from the pure and applied science section of the BNC which were related to Information Technology (IT). Collectively, the files form a corpus of 1.7 million words, of which about 60% are news stories relating to IT. Three profiles are available to download. The part-of-speech tags in these profiles are from the CLAWS C7 tagset.
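As an illustration of what comparing a document against a baseline profile can involve, the sketch below ranks the words of a document by how strongly their frequency departs from the baseline. It assumes the log-likelihood (G2) statistic, which is one common choice for this kind of comparison; this page does not state which statistic the project tools use. The function names and the token-list inputs are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 score for one word.

    a, b: frequency of the word in the document and in the baseline.
    c, d: total number of tokens in the document and in the baseline.
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def profile(doc_tokens, baseline_tokens):
    """Rank the document's words by G2 against the baseline, highest first.

    Each row is (word, doc_freq, baseline_freq, g2, overused), where
    overused is True when the word is relatively more frequent in the document.
    """
    doc, base = Counter(doc_tokens), Counter(baseline_tokens)
    c, d = len(doc_tokens), len(baseline_tokens)
    rows = []
    for word in doc:
        a, b = doc[word], base[word]
        rows.append((word, a, b, log_likelihood(a, b, c, d), a / c > b / d))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows
```

Words at the top of the ranking are those whose use in the document differs most from the baseline, which is what makes them candidates for closer inspection. The same comparison could be run over part-of-speech tags instead of words by passing tag sequences as the token lists.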

Project report:

Rayson, P., Garside, R., and Sawyer, P. (1999). Language engineering for the recovery of requirements from legacy documents. REVERE project report, Lancaster University, May 1999.

Abstract:

Legacy documents, such as requirements documents or manuals of business procedures, can sometimes offer an important resource for informing what features of legacy software are redundant, need to be retained or can be reused. This situation is particularly acute where business change has resulted in the dissipation of human knowledge through staff turnover or redeployment. Exploiting legacy documents poses formidable problems, however, since they are often incomplete, poorly structured, poorly maintained and voluminous. This report proposes that language engineering using tools that exploit probabilistic natural language processing (NLP) techniques offers the potential to ease these problems. Such tools are available, mature and have been proven in other domains. The document provides a review of NLP and a discussion of the components of probabilistic NLP techniques and their potential for requirements recovery from legacy documents. The report concludes with a summary of the preliminary results of the adaptation and application of these techniques in the REVERE project.

The pdf version is available for download here:


REFSQ'99 publication:

Rayson, P., Garside, R., and Sawyer, P. (1999). Recovering Legacy Requirements. In Proceedings of REFSQ'99. Fifth International Workshop on Requirements Engineering: Foundations of Software Quality, June 14-15 1999, Heidelberg, Germany. Published by University of Namur, pp. 49-54.

Abstract:

It is common for organisations to introduce substantial changes to their structure and operations in order to adapt to new business environments. This often confers legacy status on their software systems because they can't adequately support the new business processes. In this paper, we argue that it is necessary to recover the requirements of in-service legacy software to ensure that its evolution or replacement is properly informed by an understanding of what is redundant, what must be retained and what can be reused. Much of this information is often contained in documents. However, retrieval of the information is often difficult due to problems of completeness, quality and sheer volume. In the REVERE project we are integrating a number of techniques to provide a set of tools to help requirements engineers explore the documentation and reconstruct conceptual models of the software and business processes. At the core of this work is the exploitation of probabilistic NLP tools to provide a 'quick way in' to large, complex and imperfectly structured documents, saving much painstaking and error-prone manual effort.

The pdf version is available for download here:



Last revision: 13th July 2000
Comments welcome.
Paul Rayson