Evaluation methods in the CSCW literature
Much of the text of this document was originally written for a paper by Ross, Ramage and Rogers (in press). In that paper we concluded that no single method was adequate to evaluate the richness of the CSCW situation, and so argued for a multiplicity of methods, theories and perspectives.
There is also a document on evaluation studies in the CSCW literature.
The diverse influences feeding into CSCW provide a wide range of existing evaluation methods for CSCW practitioners to use and adapt. We describe below those methods which have actually been used, to the best of our knowledge, and discuss briefly their suitability in a collaborative context. Some of the methods have come to CSCW directly from their 'home' discipline; others have come via HCI. The contributing discipline is indicated in brackets.
- Heuristic evaluation. (HCI) Heuristic Evaluation (Nielsen, 1993) relies on an
evaluator's immediate reactions, intuitions and predictions, categorised under a set of
Design Principles and Usability Attributes. These define the desirable properties of a
usable interface, and typically include: consistency; feedback; user control; user's
model; clarifying metaphors (Principles); learnability; memorability; error recovery;
efficiency; and subjective satisfaction (Attributes) (Nielsen, 1993). These can be used as
an intrinsic part of a heuristic evaluation, or as a useful framework for categorising
interface characteristics after any evaluative method. For CSCW applications, additional
issues such as awareness of other users, focus, coordination, ownership and
communication must be considered - although results become increasingly sketchy
given the complex group interactions of collaborative work. To an extent, heuristic
evaluation is an inevitable part of any system design process, as designers do
something and then try to figure out if they like it. It is seldom mentioned explicitly in
the literature, but can be seen in trials of systems by their designers, for example in Haake and Wilson (1992).
- User testing. (HCI) Much advocated in HCI (Tognazzini, 1992), user testing
generally takes the form of studies conducted by system designers with real users in a
semi-realistic use context. The aim is to see how the system is used and what usability
or functionality issues arise - typically qualitative data are collected, to feed back into the
design process. Resnick (1992) describes an example of user testing of a phone-based
system, which used "field trials" of several prototypes (such as event guides) in
academic and public settings to determine how the system was used. User testing is a
useful part of the iterative design process, but care needs to be given to questions of context, as argued by Bannon (1991).
- Lab experiments. (Cognitive/Social Psychology) Laboratory experiments are quite
widely used to evaluate CSCW systems (e.g. Ishii et al., 1993; Wan and Johnson,
1994; Olson and Olson, 1991). These are used to collect quantitative data about a single
specific factor, attempting to screen out other influences. However, as with user
testing, there are significant problems with the decontextualised and artificial nature of
these experiments. They tend to assume the kind of disembodied intelligences criticised
by Suchman (1987) and others, an assumption that is unhelpful given the situated nature of CSCW
systems. However, they are useful when applied to very precisely-defined questions as
part of a wider study. Twidale et al. (1994) discuss their applicability in the early stages
of an evaluation, suggesting they can be used with easily available users (students) to
throw up the most glaring usability problems, allowing the more complicated issues to
be dealt with by the real users (air traffic controllers), who may only be available for short periods.
- Interviews & Questionnaires, Focus Groups and Customer Feedback. (Social Psychology) Methods that elicit direct user reactions can be used to obtain qualitative data about users' experiences with systems (either immediately or a little while after use). They have been used particularly as a way to capture data prior to further analysis (Beck & Bellotti, 1993) and to improve a commercial product by collecting customer feedback (Abbott & Sarin, 1994). Their subjectivity (in that direct user opinions are being collected) makes them useful but also limited, although this limitation can be guarded against by using a large group of people and by wording questions so that they contain various 'consistency checks'.
- Longitudinal trials and Semi-realistic ethnography. (Sociology) These methods - referred to by Bellotti as "in the zoo" (quoted by Olson & Olson, 1991) - lie somewhere between the unsituated lab experiment and the messy, real-world ethnographic study. They often involve having one's colleagues (or a similar accessible, controllable group) use a system for a prolonged period of time, before it is tried out on real users (e.g. Bellotti & Sellen, 1993; Goldberg et al., 1992). Such studies can suffer from being rather inward-looking, in that they end up focussing on their own research teams, and as Harper (1992:36) comments, research labs are "peculiar fish bowls" due to "the forms of working relationships one finds therein". However, such methods are often highly instructive in practice, given some degree of care as to their wider applicability.
- Ethnography. (Sociology) The most realistic way of evaluating a system is to go into the place of work and watch real users using it over a prolonged period. Data collected include audio and video-tapes of work practices, field notes as to the most significant practices carried out by the participants, descriptions and diagrams of the work setting, and samples of various artefacts (such as documents) which illustrate the nature of work in the organisation. This approach has been used on its own to inform systems design (Bentley et al., 1992) or as a way of providing data for further analysis using distributed cognition (Rogers, 1994), activity theory (Kuutti & Arvonen, 1992), social psychology (Star & Ruhleder, 1994) and other methods. Traditionally, ethnography requires a long period of immersion - months or even years - in the study setting before the ethnographer can perform an informed analysis, which is not often practical in a systems design project. However, as Hughes et al. (1994) discuss, methods such as "quick and dirty ethnography" (a brief study, typically a few days, with specific questions in mind as to the nature of the work) can still provide useful amounts of data in a shorter time.
- Conversation Analysis and Interaction Analysis. (Ethnomethodology) These methods study real group interactions as revealed by their (directly recorded) conversation and actions (Wooffitt, 1991; Suchman & Trigg, 1991; Luff et al., 1992). The aim is that of ethnomethodology: to study the users' categories directly, rather than imposing a theoretical framework. They focus on the detailed features of interaction (at various levels), either on conversations alone or on interactions between people and between people and technology. However, the undoubted usefulness of such methods in CSCW evaluation is offset by their very level of detail, which results in masses of transcript and/or video-tape to be analysed.
- Breakdown Analysis. (Computer Science/Philosophy) A breakdown is defined as any incident where the user has cause to focus on the system rather than the task (Winograd & Flores, 1986). Breakdown analysis studies group interactions and conversation transcripts to highlight such breakdowns. This is a useful method for identifying key problems associated with user-system (or user-user) communication (Urquijo et al., 1993). However, the focus is necessarily restricted, disregarding many other interesting aspects of collaborative work, such as the distribution of roles and power amongst the group members. Like many of the other methods above, it might usefully be combined with others.
References
This document forms part of the first-year report for Magnus Ramage's PhD.
Magnus Ramage and Susi Ross 5 January 1996