Evaluation studies in the CSCW literature

NB. Much of the text of this document was written for a paper by Plowman, Rogers and Ramage (1995), although only small parts appeared in the final paper. Susi Ross and I have also written a document on methods used in the literature.
The question of CSCW evaluation is a vexed and difficult one (Ross et al., 1995): what methods should be used? who should participate in the evaluation? where should the evaluation take place? One class of work studies undertaken within CSCW seeks to evaluate systems (that is, to determine their efficacy for their appointed tasks) by observing them in use within workplaces. Intuitively, this would seem a good approach, as it will enable a more realistic view of how the system is used than a laboratory study. However, workplace evaluation is liable to be time-consuming and potentially difficult to arrange (Twidale et al., 1994).

The papers discussed in this section are all concerned with studies of existing systems in use in realistic workplace situations. They have generally been undertaken with two intentions, which roughly correspond to the 'general design' and 'specific design' papers covered earlier: to redesign a particular system or to inform the design of further systems. We might broadly think of these purposes as equating to formative and summative evaluations (Scriven, 1967), although it must be stressed that this is only in relation to their purpose and not in relation to the kinds of methods typically used in such evaluations. None of these papers use the kind of evaluations for accountability purposes (to funding bodies and the like) typically associated with summative evaluation (Sommerlad, 1992). Also, some papers fit into the latter category without presenting any more formal an evaluation than a discussion of ethnographic studies performed on a technology in use (e.g. Blomberg, 1986). This categorisation is not carried any further here, as the interesting lessons for future design which come up from the various evaluations discussed here are typically on 'general' subjects rather than relating to a specific system.

An interesting point raised by the very existence of some of the studies considered here (rather than their overt messages) is what constitutes a technology for evaluation within CSCW. Two of the papers which seem clearly to be CSCW studies, and which seem clearly to be evaluative of technology, consider in part or in whole non-computer technology: a portable hardware device, the active badge (Harper, 1992) and a paper documentation system (Andersen, 1994). It seems fairly clear that these are part of CSCW research - in the first case, the paper was published at a CSCW conference, while the second paper is part of a basic research project in CSCW. Both also evaluate technology in use and its effects on organisations, yet they are by no means 'groupware'. The message here perhaps is that CSCW technology need not necessarily be Computer Supported.

The socio-technical dialectic

A common theme of many of the papers here is that there is a dialectical relationship between the social ordering of the workplace and the effect of new technology: the technology alters the culture and work practices of the organisation, but is in turn altered by that culture and those work practices. Given that CSCW has historically involved the entry of social scientists into an area previously mainly the province of computer scientists and engineers, the main thrust here has necessarily been one of sociologists trying to make it clear that systems cannot be evaluated effectively without consideration of the social ordering of the organisation.

Harper et al. (1991) is a good example of this approach. The first of several papers by these authors and others concerning the London Air Traffic Control Centre (LATCC), it is a study of the use (or rather the lack of use) of an automated air traffic control system. They observe that the controllers made little use of the system at busy times and, when they did use it, did not trust it. They conclude that this is because the designers of the system failed to take into account the way controllers worked at LATCC. They comment that "insofar as system designers wish to take cooperative working practices seriously, then they will need to know a great deal more about the social organisation of work" (p.232).

Another early paper considering such matters is that by Blomberg (1986) who argues that technology "must be understood in terms of the social environment into which it is introduced" (p.35). Studying the introduction of a photocopier interface design system (Trillium), she notes that its usage patterns are different according to whether the users were working at the company before the introduction of the system or joined after it was introduced: those who were already there frequently used the services of a human 'mediator' between them and the technology, while those who joined post-Trillium seemed to expect that using the system was a normal part of their job. However, she comments that the organisational culture affected the technology also: it was redesigned to allow interface code to be produced directly from Trillium rather than the results being passed to a programmer for coding. This seems to have occurred not so much because of the existing relationships between designers (human factors psychologists) and software engineers, but rather with the intention of changing that balance.

A more recent air-traffic control study than the one discussed above is that by Twidale et al. (1994). They discuss their finding that the acceptability of technology is bound up with the acceptability of work redesign: the introduction of new technology often goes hand-in-hand with the explicit changing of work practices, especially given the current popularity among managers of Business Process Reengineering (Hammer and Champy, 1993). In a not dissimilar comment to the earlier paper, they say that "a 'situated' evaluation would need to address not only the capacity of the 'system' but the flow of work around it" (p.449).

In all these papers (and in the next set discussed), we see an interweaving of the social ordering of an organisation and the technology introduced into it. We have above used the term "socio-technical" to describe this dialectic, explicitly intending to refer to the Tavistock Institute's socio-technical approach, which stressed a similar interweaving. The interesting difference is the direction of travel: the Tavistock researchers, being mostly psychodynamic psychologists, took the significance of the social order as a given and sought to bring the perspective of the technical into the debate, as Trist (1992), the founder of that approach, makes clear in an introduction to a volume of papers on the subject. CSCW, on the other hand, is a dialogue which has involved the introduction of concepts of social ordering into what was previously a technical discipline.

A multiplicity of goodnesses: the importance of roles

A common theme in several of the evaluations studied is the question of by whose criteria a system can be said to be 'good': that is, efficacious for its desired purpose. In particular, there is often a clash of cultures between the designers of a system, especially if they are responsible for user support, and its users.

Star and Ruhleder (1994) illustrate this point well in studying the use of a system to support communication across the Internet by biology researchers studying the genetics of a particular type of worm. They observe that often a user support person will say something like "just throw up X Windows and ftp the file down" to a user, who may not have any idea what is meant by ftp, or how to start up X-Windows. However, they also observe this kind of clash between different types of user with different knowledge not only of computer systems but also of types of biology - those within the "worm community" and those outside it.

Similar conclusions are reached by Orlikowski and Gash (1994), who studied the use of Lotus Notes in a large financial services company. They found that the IT department's main aim was simply to get the system running and to keep it running, thinking that the users would work out for themselves how to use it. Users, perhaps unsurprisingly, had somewhat different perceptions!

However, the difference between technologists and ordinary folk is not the only one that affects acceptability of technology. Harper (1992), in a study of active badge use across two research laboratories, found that there was a significant difference in its use according to the roles of individuals within the organisation: those whose job it was to keep tabs on people (especially receptionists) found it very helpful, while researchers' perceptions were that it was more of an intrusion into their working patterns. This kind of observer-observee dichotomy was also found by Ramage (1994) in a study of the use of a workflow system within a financial services company. Here, the difference was between workers and managers: the former felt that the system, while useful for scheduling work and providing summary information, had a 'big brother' nature to it in that managers could observe how much work they'd done in some detail. The manager, of course, found this useful as a way of making sure her team were meeting their targets.

Harper also notes the importance of organisational culture in acceptance of technology: in one lab he studied, the management issued a decree that the active badge would be used by all staff for a fortnight, and that settled the matter (at least for that period). In the other lab, with a more 'democratic' culture, persuasion was needed to get the whole lab using the badge. Of course, diktat is not a way to guarantee introduction of a system: if it doesn't perform as necessary, it will simply be worked around, even in organisations that have imposed the use of the technology.

Some studies observe that technology went unused for purely technical reasons. This was the case in both the studies by Goodman and Abel (1986) and by Tang et al. (1994), where the overheads for starting up a communication in the system were quite large: in the first study because it was necessary to shout to make someone come to the video wall (in a common room), in the second because the system took around 45 seconds to open up a simple 'glance' window on another user's screen. In the Tang et al. study, another technical reason for non-use was that the system was unreliable and often unavailable, so that it was not built into the standard practices of its potential users.

Finally, Ackerman (1994) comments that even in a perfectly useful system, the amount of usage may be fairly low simply because of the nature of the system. His evaluation was of an organisational memory system designed to assist X-Windows users with questions - in many cases it may be quite appropriate, he comments, for users to only use the system once a month but still to find it useful and necessary to have to hand (in the same way as a dictionary might be used only infrequently but when it is used is important).

Conclusions

We have seen here that the main messages of evaluation studies of CSCW systems concern the relationship between the social organisation and the technical system being implemented in that organisation. If systems are to be designed and implemented effectively, it is important that the interdependence of work practice and technology be remembered. Reasons for acceptance of technology - for the system being regarded as efficacious - will also vary from individual to individual depending on a number of different factors. Many of these factors are brought together in a paper by Andersen (1994), who comments that the different factors in articulation work in his study of a low-tech documentation system include the actors, their responsibilities and tasks; but also issues such as information resources, the external work environment, organisational structure and issues of space and time in the distribution of cooperative work.

Of course, evaluations of systems (of all sorts) also have messages specific to the particular system which may well have implications for future design: Lotus may use what Orlikowski and Gash say about Notes to alter parts of that system but so may their competitors; Tang et al.'s study may be applied to other video-conferencing systems as well as their own; Ackerman's study of a particular organisational memory system has much to inform the design of all such systems as well as its general implications for CSCW. In this sense, as Ross et al. (1995) argue, all evaluation is ultimately formative.


References


This document forms part of the first-year report for Magnus Ramage's PhD.


Cooperative Systems Engineering Group | Computing Department | Lancaster University
Magnus Ramage 5 January 1996