Computer Based Assessment in Computing Degrees
Interim Report for TQEF funded proposal Nov 01 to July 02
Angie Chandler and Lynne Blair
{angie or lb}@comp.lancs.ac.uk
Computing Department
Lancaster University
Abstract
In light of the steady increase in student numbers and the absence of equivalent staff growth, the question of staff time spent with students versus grading their work has become ever more pertinent. In order to maintain an acceptable level of student contact time, lecturers must reallocate time which may previously have been available for marking of assessments, either decreasing the amount of assessment administered or performing the assessment through other means, such as peer marking or computer based assessment.
Introduction
Computer based assessment, and with it the means to deal more efficiently with ever increasing student numbers, is now becoming one of the key areas for research in the higher education community. With student numbers rising and technology advanced to a point where it is ready to take on the burden, the initial investment of time into computer based assessment is now becoming worthwhile. However, before investigating computer based assessment, it is first advantageous to study self and peer assessment of students, an avenue frequently facilitated by computer based assessment when the computer is unable to grade responses effectively.
Student self and peer assessment share many of the same positive and negative aspects [Topping 98]. Each of them largely supports a formative approach to learning, rather than providing summative evidence of a student’s progress, giving the student an inbuilt excuse to spend more time on a subject without forcing the issue with rigorous exams or additional coursework, both of which would overburden the teacher [Platt 00] [Zariski 96] [Postle 00]. Instead, the students can take the opportunity to look at what they have learnt and perhaps assess whether they have learnt the topic as thoroughly as they first expected or, in reviewing others’ work, learn new approaches to a problem whilst reflecting on their own solutions [Thorley 94]. In turn, although students may initially be unhappy about self assessment in particular [Hinett 99], it allows the students to help themselves and each other to a greater understanding of the subject and of its context in the wider scheme of their learning.
To return to computer based assessment [CAA 02], this growing area has existed in some form since the early seventies, when computers were introduced as an accessible means of performing repetitive calculations, although teaching as a whole has been slower to adopt technology than other sectors [Underwood 94] [Hooper 75]. It is currently both aiding the expansion of distance learning programs [CVU 95] [Looms 02] and assisting teachers in more traditional learning environments. Naturally, to begin with computers were largely concerned with simple tasks, in particular the marking of multi-choice questions, and even now they continue to be primarily used in that capacity [Dalziel 98]. However, computers have come a long way in recent years, making even the lowly multi-choice exam more interesting and exciting than it might once have been, and adding a great deal of potential functionality to any assessment designed for the computer [Thelwall 00] [Proctor 96] [Hopkins 98].
Of course, even with the progression of computers to what is available today, there are still certain tasks for which the computer is simply unsuitable. Whilst some free response answers can be analysed by a computer, largely the type of logically based answers expected in science subjects [MathWise 02], there remains a great deal of scope for the students to be required to produce answers which the computer is unable to mark reliably. It is at this level that the advantages of self and peer assessment can be put to use. With the computer as intermediary, peer marking can easily be randomised and made anonymous [Ward 00], eliminating any danger of students marking others down in retaliation, and in turn ensuring that the students feel free to mark as they feel is right, without fear of that same retaliation. Simultaneously, with web based assessments the teacher may also be online and able to edit the example answers based on unexpected answers seen within students’ responses as they happen, possibly aided by a selection of keyword searches on the computer. From this perspective, the arrival of computer based assessment as an accepted means of examining students may revitalise the self and peer assessment process, which has been in use for as much as 200 years.
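The randomised, anonymous allocation of peer markers described above could be sketched roughly as follows. This is a minimal illustration only, not the method used by [Ward 00]: the class name PeerAllocator, the method allocate and the student identifiers are all hypothetical, and the sketch assumes that identifiers are unique.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper: pairs each marker with another student's script. */
class PeerAllocator {

    /**
     * Shuffles the class list, then gives each student the script of the next
     * student in the shuffled order (wrapping round at the end). With two or
     * more distinct students, nobody receives their own script.
     */
    static Map<String, String> allocate(List<String> studentIds, Random rng) {
        if (studentIds.size() < 2) {
            throw new IllegalArgumentException("need at least two students");
        }
        List<String> order = new ArrayList<>(studentIds);
        Collections.shuffle(order, rng);

        Map<String, String> markerToAuthor = new LinkedHashMap<>();
        for (int i = 0; i < order.size(); i++) {
            String marker = order.get(i);
            String author = order.get((i + 1) % order.size());
            markerToAuthor.put(marker, author);
        }
        return markerToAuthor;
    }

    public static void main(String[] args) {
        Map<String, String> pairs =
                allocate(List.of("s01", "s02", "s03", "s04"), new Random());
        // Printed here for illustration; a real system would show the marker
        // only the script, never the author's identity.
        pairs.forEach((marker, author) ->
                System.out.println(marker + " marks the script of " + author));
    }
}
```

Rotating a shuffled list is a deliberately simple choice: it guarantees that every script is marked exactly once and that no student marks their own, at the cost of always producing one large cycle rather than every possible pairing.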
Summarising the possibilities currently available to anyone wishing to implement a form of computer based assessment, there are three major routes to be taken. The first is the common multi-choice questionnaire, which has been widely regarded as a task for computer assessment for a number of years. This is followed by logical free responses (e.g. mathematical calculations), which are only applicable to scientific subjects, and finally by computer based peer assessment. The only major drawbacks to any of these methods are the initial implementation outlay and any undue disadvantage to students who are uncomfortable with computers, a serious problem in subjects where computers are not frequently in use. Each of these will be discussed in more detail below.
Assessment through multi-choice questions can be helpful in certain circumstances, but has a tendency to benefit students who merely learn information rather than understand it. For that reason, particularly for use in computer based assessment, a number of extensions have been made to the regular multi-choice format. These include multi-choice questions requiring the student to choose more than one answer (multiple response) and, most interestingly, a form of multi-choice which is as close to a free response as possible. The latter option, a form of assessment designed and tested in [Guerts 93], requires there to be a huge range of possible responses, perhaps as many as 40, but will also grant a limited number of marks for answers which are close to the correct one (as defined by the author), in essence very similar to the usual free response questions found in science subjects. These answers will later be converted into an overall score, with the number of guesses as opposed to perfectly correct answers taken into account.
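As a rough illustration of the near-miss idea, the sketch below awards author-defined marks per option. It is not the scheme of [Guerts 93]: the class GradedChoiceQuestion, the option labels and the mark values are all hypothetical, and the later conversion into an overall score (which takes guessing into account) is left out because its details are not given here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical question: full marks for one option, partial marks for near misses. */
class GradedChoiceQuestion {
    private final Map<String, Integer> marksByOption;

    GradedChoiceQuestion(Map<String, Integer> marksByOption) {
        this.marksByOption = new HashMap<>(marksByOption);
    }

    /** Returns the author-defined marks for the chosen option, or 0 if it earns none. */
    int mark(String chosenOption) {
        return marksByOption.getOrDefault(chosenOption, 0);
    }

    public static void main(String[] args) {
        // Illustrative values only: "17" is the correct option, "16" and "18" count as close.
        GradedChoiceQuestion q =
                new GradedChoiceQuestion(Map.of("17", 2, "16", 1, "18", 1));
        System.out.println(q.mark("17")); // 2
        System.out.println(q.mark("18")); // 1
        System.out.println(q.mark("03")); // 0
    }
}
```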
It is at this point, with such a huge range of multi-choice options available, that multi-choice questions begin to mirror regular free response logical questions. This type of question can still be automatically marked by a computer, given certain logical steps for the computer to follow, although some more complex operations, for instance algebraic ones, will obviously be more difficult to grade than responses which require a numerical answer. Examples of such tests can be found in mathematics, physics and computer science, where the response can only vary by a small amount and still remain accurate [Ulirchs 98] [Thelwall 00].
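Marking a numerical response that may "vary by a small amount and still remain accurate" can be sketched as a tolerance check. The class NumericMarker, the method isCorrect and the choice of an absolute (rather than relative) tolerance are assumptions made for illustration, not details taken from the systems cited.

```java
/** Hypothetical marker for numeric free-response answers. */
class NumericMarker {

    /**
     * Accepts the response if it parses as a number and lies within
     * the given absolute tolerance of the expected value.
     */
    static boolean isCorrect(String response, double expected, double tolerance) {
        try {
            double value = Double.parseDouble(response.trim());
            return Math.abs(value - expected) <= tolerance;
        } catch (NumberFormatException e) {
            return false; // not a number, so no automatic credit
        }
    }

    public static void main(String[] args) {
        System.out.println(isCorrect(" 9.8 ", 9.81, 0.05));     // true
        System.out.println(isCorrect("10.2", 9.81, 0.05));      // false
        System.out.println(isCorrect("about ten", 9.81, 0.05)); // false
    }
}
```

Responses that fail to parse are simply marked wrong here; a fuller system might instead pass them on for self or peer marking.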
The final avenue of self and peer assessment through the computer, as discussed before, is largely the only option open to the less scientific subjects, where right answers cannot be categorically defined, and require a degree of judgement. However, with the continuing advancement of computer technology and levels of artificial intelligence, it remains possible that computers will one day be capable of assessing the results of fully free response answers. Even with today’s technology, free response answers are still a worthwhile approach to take to assessment, and particularly in combination with other approaches where the basic abilities of the students could be graded prior to the selection of a peer reviewer and reviewee for a given paper [Ward 00].
The intention of this discussion of computer based assessment strategies is to reflect on and share our findings as we research the possibility of setting up our own computer based assessment program. Initially our intention is to aim the assessment program at first year Java students, a course which is already very highly subscribed (169 students) and stretches its team of teaching staff to the limit. As this is a scientifically based subject, we will have access to the full range of possible uses of computer based technology, and will be in a position to easily implement and trial them as our work progresses. Simultaneously, as a computer science department we are in the fortunate position of not needing to take the students’ comfort with computers into consideration. To all intents and purposes, the anxiety which some students may experience when confronted with a computer need not be addressed, as the use of computers is an accepted element of the course itself [Alderson 02].
Following on from our review of existing work in the field, our initial ideas primarily extend the use of multi-choice questions as an initial ranking of a student, and peer reviewed free response questions [Ward 00], with the lecturer involved able to access and monitor students during the assessment session in order to detect any anomalies which may be inaccurately marked and add them to the sample answers. This basic structure will then be expanded to include simple logical answers, and pieces of Java code, which we hope to be able to introduce and test with the use of a debugger as part of the computer analysis.
However, these basic ideas will ultimately be expanded through further research into the field, not only in the general area of computer based assessment, but also in the form of a survey of the department, both their current practices and their view of an ideal assessment system. With a prototype system created based on these ideas, we may then supplement this with the students’ own views on a computer based assessment method. Only with the final results of this survey data will we be able to fully appreciate what is necessary to implement a computer based assessment for our first year Java students.
Work Undertaken
A survey of the computing department revealed that the majority of assessment was done through either written coursework, similar to that of a short answer paper (Appendix 1), or programming exercises with additional reports, with only relatively few courses requiring full essays from the students. With the overall aim of this project being the reduction of staff hours spent on marking, and improvement of feedback to students on completed work, the natural course of action was to design and implement a prototype online system based on the short answer style work. This would then be applicable to both coursework and exams, making up approximately half of the marking generated by first year computer science students.
With the short answer questions in mind, the prototype was initially intended to be formative only, taking the form of past exam papers in order to encourage the students to make use of the system for revision when there was no other incentive. The online question paper was written to appear similar in style to the normal presentation of a coursework answer sheet (Figure 1). This is comparable to the question shown in Appendix 1 in the more usual paper format.

Figure 1 A prototype question
This paper can be accessed as you would access any other web site, with the paper coded in the form of a Java applet. We chose this approach, as opposed to one in JavaScript, as JavaScript would leave the answers more easily visible to the students when viewing the web page source (it is trivial to view a web page source), without the need for them to go through the process of working through the paper itself as intended. This shielding of the answers will also be helpful later as the system progresses into the field of autonomously marked papers when it begins to fully assist members of staff.
As with any Java applet, this program is accessed through a web page interface, which can currently be found at [Chandler 02] (see Figure 2). The user must select the paper they require; the applet then becomes accessible from the linked web page.

Figure 2 Past Papers Online
The applet is not made available from this initial web page so that any instructions specific to that year’s paper can be added separately. The same applet can then be started on any web page, using the location of the web page itself to determine which paper to run within the applet (Figure 3). It would also be possible to run the applet from the initial web page, and based on feedback from students this may later be chosen to be a better approach.
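One way of using the location of the web page to select the paper is sketched below. The naming scheme (a page called paper2001.htm selecting a paper with the identifier paper2001), the class PaperLocator and the example address are all hypothetical; within an applet the page address would typically be obtained from Applet.getDocumentBase(), but the sketch works on a plain URI so that it can be run on its own.

```java
import java.net.URI;

/** Hypothetical helper that derives a paper identifier from the hosting page's address. */
class PaperLocator {

    /** Returns the page's file name without its extension. */
    static String paperIdFor(URI pageLocation) {
        String path = pageLocation.getPath();
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    public static void main(String[] args) {
        // Example address only, not the real location of the papers.
        URI page = URI.create("http://example.org/pastpapers/paper2001.htm");
        System.out.println(paperIdFor(page)); // paper2001
    }
}
```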

Figure 3 2001 Paper Online
To run the applet the student must press the “Go” button seen towards the bottom of the screen in Figure 4. A list of each question in the exam will then appear to the left hand side of the screen, showing the marks acquired for each question and a running total at the bottom.
As can be seen in Figure 4, each of the questions is initially signified by a small green button, with a marks field placed next to it showing the user’s current score. The buttons will only remain green while the question is unopened, in order to provide a discernible trail should the user take the exam in non-numerical order, as would be possible in a standard paper assessment. For any question which is currently open, the button associated with that question is coloured yellow. Should the question then be closed again, the button will return to a green colour, but be darker than before to signify that it has been looked at. If the question has been answered, then the button will turn red. As the user proceeds to complete questions, they will also input scores (or the scores will automatically be input) within the questions themselves, whilst both their answers and the actual answers are visible. In essence they must assess themselves based on the given answer (Figure 7). These scores will be automatically added to the question bar in Figure 4 and the totals updated. The user may not alter any scores from within this question bar. This is largely to discourage the student from cheating themselves into a few extra marks, “Well, that one was nearly right,” when their overall score turns out lower than they might have hoped.
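The colour scheme above amounts to a small state model, sketched here for illustration. The class and method names are hypothetical and this is not the applet's own code; in particular, the sketch assumes that an answered question shows red whether or not its window is still open, which the description does not settle.

```java
/** Hypothetical model of the question-bar button colours. */
class QuestionButton {
    enum Colour { GREEN, YELLOW, DARK_GREEN, RED }

    private boolean open;
    private boolean everOpened;
    private boolean answered;

    void openQuestion()  { open = true; everOpened = true; }
    void closeQuestion() { open = false; }
    void recordAnswer()  { answered = true; }

    Colour colour() {
        if (answered)   return Colour.RED;        // assumption: answered takes precedence
        if (open)       return Colour.YELLOW;     // currently open
        if (everOpened) return Colour.DARK_GREEN; // looked at, then closed
        return Colour.GREEN;                      // never opened
    }

    public static void main(String[] args) {
        QuestionButton button = new QuestionButton();
        System.out.println(button.colour()); // GREEN
        button.openQuestion();
        System.out.println(button.colour()); // YELLOW
        button.closeQuestion();
        System.out.println(button.colour()); // DARK_GREEN
        button.openQuestion();
        button.recordAnswer();
        System.out.println(button.colour()); // RED
    }
}
```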

Figure 4 Questions
When the user has selected a question, the window for that question will open alongside the question bar, always leaving the bar clear so that they can remain up to date on their progress, and are easily able to open additional questions as they wish to (Figure 5).

Figure 5 The first question
This first question will also include instructions on filling in the marks fields and using the “See Answer” button (Figure 5). The marks can only be edited when the question has been attempted and the user has the answer visible on the screen. These instructions will appear in every window until a question is attempted, then automatically be deleted as the user begins to type in the answer window.
When the user has completed (or given up on) the question, they may then close the question. If they wish to proceed numerically, then the “Next Question” button will automatically close the current question and open the following one. Otherwise they may close the current question using the usual window close control. It is, of course, not essential that the current question is closed before progressing, as the button on the question bar associated with the question will keep the user up to date as to the status of that question, whether it is closed but unanswered, complete, or still open.

Figure 6 Several questions open at once
Figure 6 shows the applet with four questions currently open. Following the colour scheme mentioned previously, it can be determined from the left hand question bar that question 1 has been opened but not attempted, question 3 has been completed, scoring 2 marks, and questions 4, 5, 6 and 7 are all open. It is worth noting that the “Next Question” button at the bottom of each question is also coloured according to the status of the next question, and can be updated whilst the current question is open. In Figure 7 it can be seen that the following question has already been answered.

Figure 7 An answered question
As can be seen in Figure 7 above, this question has been attempted, and on inspection of the answer (seen below the user’s answer box) the user has decided to give themselves 1 mark out of 2. This mark will subsequently be added to the totals on the question bar.
At any time in the proceedings the user may press the red “Exit” button at the base of the question bar. This produces a score out of approximately 100, regardless of the questions attempted in the session, which is then processed to give an approximate grading.

Figure 8 Exiting
Initial Feedback and Response
The initial reception of this prototype has been highly favourable, with students on the course at which it is aimed being keen to use the tool and largely finding it useful, as can be seen in the following summary of the feedback received.
Reasons for intranet: printable, faster than the online paper
Reasons for online (the java applet): easy access, scoring, answers available, less boring, interactive.
Was there anything difficult or misleading?:
From those who found it easy: marking scheme, check button didn’t work (now removed), hints misleading as wouldn’t be in exam, some answers missing.
From those who found it hard: problems with layout (regarding this we plan to consult with one of the user-interface experts in the department for further advice/guidelines).
Suggestions for improvement: make program downloadable to use in own time and print with answers, include all answers (work in progress), marking guide, links to lecture slides (done, as references to books), allow user to choose number of questions (done), more papers (work in progress).
(this represents the students’ views on summative assessment based on their experiences with this formative assessment prototype)
Reasons from online: more fun than writing, Java code easier but some security worries.
Reasons from not sure: system needs improving.
Reasons from paper: online assessments difficult to follow, paper lets you write own notes, not comfortable typing a timed essay online – keyboard, paper is more flexible, easier to convey meaning with diagrams, more reliable and easier to flick through, students could cheat by flipping between windows and chat panels.
Would prefer a standard past paper with solutions, would like program to be downloadable. “Took the time to take a mock exam for once – that must say something!”
As the students are now well into their exams, we don’t expect to see a great many more questionnaires completed at this stage, despite only receiving 13 results out of a year of 169 (1 in 13). However, the questionnaires received have been of notable value, with a number of them providing practical suggestions for improvement of the online exam paper.
One of these suggestions, to enable the students to undertake a portion of the exam instead of the entire exam at one time, has subsequently been implemented, as an entire paper would be expected to take several hours, and feedback suggested that this was unreasonable. Later implementations of our prototype allowed the user to select a subset of the total questions on which to be marked, giving them a choice of start question, number of marks to be available within attempted questions, questions by a certain author, or alternatively marking only being performed on questions which have been opened (Figure 9).

Figure 9 Initial choices

Figure 10 Example options
This greater flexibility will allow students to use the program for revision and obtain a rough grading of their attempts without the need for them to allocate as many successive hours to their revision. However, for students who may wish to sit the exam in a more realistic format, taking time taken and overall marks into account, this option is still available. Within the main execution of the program, there is very little difference between the original program structure and its replacement, with the major difference being concentrated in the final results given (Figure 11).

Figure 11 Modified exit window
Here, only questions which the user (at some time) opened were marked as part of the overall result, leaving them with a score of 10 out of 14 rather than 10 out of 99.
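The effect of marking only opened questions can be sketched as below. The class SessionScore and the individual question values are hypothetical; they have been chosen only so that the totals match the example above (10 out of 14, against 99 marks for the whole paper).

```java
import java.util.List;

/** Hypothetical scoring of a session in which only opened questions count. */
class SessionScore {

    /** One question's state at exit; all values used below are illustrative. */
    static final class QuestionResult {
        final int maxMarks;
        final int awarded;
        final boolean opened;

        QuestionResult(int maxMarks, int awarded, boolean opened) {
            this.maxMarks = maxMarks;
            this.awarded = awarded;
            this.opened = opened;
        }
    }

    /** Returns {marks awarded, marks available}, counting opened questions only. */
    static int[] totalsForOpened(List<QuestionResult> results) {
        int awarded = 0;
        int available = 0;
        for (QuestionResult r : results) {
            if (r.opened) {
                awarded += r.awarded;
                available += r.maxMarks;
            }
        }
        return new int[] { awarded, available };
    }

    public static void main(String[] args) {
        List<QuestionResult> session = List.of(
                new QuestionResult(2, 2, true),
                new QuestionResult(4, 3, true),
                new QuestionResult(8, 5, true),
                new QuestionResult(85, 0, false)); // never opened, so excluded
        int[] totals = totalsForOpened(session);
        System.out.println(totals[0] + " out of " + totals[1]); // 10 out of 14
    }
}
```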
Another point which was raised by the feedback given was the value of having the answers in terms of marking their efforts. This is a harder point to address, as simply obtaining question answers in themselves (from the lecturers concerned) was more difficult than had been anticipated. It could also be argued that students could regard decisions about marking as a learning experience.
It is worth noting at this stage that the majority of students would appear to prefer not to do summative assessed work online. Whilst the revision type application, and other formative online assessment exercises might be well received, this is a fact worth taking note of, particularly as it comes from computing students, who are almost certainly the least likely of any group of students to have problems with the use of computers.
As the project entered the evaluation stage, it was necessary to not only encourage feedback from our own students as they make use of the applet, but also to compare and evaluate the work in relation to other programs available and in the terms which other departments within the university might regard it, beyond our own particular needs.
From the outset, the intention of this project was to study the various possibilities currently available to computer based assessment and the self and peer assessment used in conjunction with it, and to create a prototype system capable of extending certain elements of assessment available, primarily for the purpose of allowing students to trial the system and provide feedback. This focussed particularly on ensuring that the online paper maintained the feel of the original paper, rather than being altered beyond recognition, despite the obvious difficulties with automated marking that this naturally posed, and creating a framework from which program based answers could potentially be automatically assessed, thus negating the need for purely multi choice and numerical answer questions.
However, in this final stage, we returned to observe one of the professionally implemented systems in order to compare the results of the Java applet with those of a tried and tested product. Also currently in use at Lancaster University is the Question Mark Perception tool (version 2) [QuestionMark 02], kindly demonstrated to us by Alan Shirras of Biological Sciences, which provides a large range of question types, all of which can be automatically marked.
Comparing Question Mark with the Java applet, there are of course marked similarities. In particular a list of questions and their current status can be clearly seen on Question Mark, although it is displayed to the top right of the screen rather than the left, and gives no marks during the assessment as the marking is done exclusively after the completion of the test. The major advantages of Question Mark remain in the automatic marking and in the authoring software, which allows the author to easily choose between types of question and add them to the database, although this is still apparently something staff in general are reluctant to attempt. It also features an extremely helpful feedback system, which, given the appropriate staff input, will provide different feedback depending on the student’s response, e.g. explaining why their answer was wrong.
Despite its advantages, however, there are areas which Question Mark still fails to address, in particular the general look and feel of a question paper created by the program. The structure of Question Mark is such that the implementation is clearly intended to make a different type of question paper to that which the students would receive as a standard paper assessment, and acknowledges that fact. It is on this area, despite the difficulties it will inevitably cause, that the Java applet is focused. Multi choice, single word, numeric and even hotspot questions are difficult to aim at the same type of understanding as freer, more comprehensive answers, and this must remain an area of interest.
In particular the checking of programming (and potentially mathematical) answers, which sounds highly plausible, and the use of peer marking, must be considered. A later version of Question Mark, recently released, appears to have also taken this into account, with a “Java type” question. The exact details of this type remain to be seen, but may be investigated at a later stage, more so given that Question Mark is regarded as the superior computer based assessment system by users currently within the university after some early trials with CASTLE in economics.
Further Work
For our own purposes, given the hugely positive response from staff and students within the computing department, it is also expected that a continuation of this project will be maintained, progressing towards a simple autonomous marking system, which we hope will be in place by week 5 of the winter term for use with the next wave of first year students as the class sizes continue to grow.
The type of autonomous marking we are aiming for is primarily Java code based, and is expected to make use of a compiler and interpretation of compiler results in order to gauge marks. In other words, as a means of assessing the accuracy of code which the students have written, we plan to insert a fragment of their code into a carefully constructed wrapper, and measure the degree of their success by the type of errors the compiler would find if it were to attempt to turn the complete program into executable code. This will potentially test not only code creation but also the student’s ability to understand compiler errors and debug code (a source of much concern). The system will initially be largely formative, to be used to assist the students in measuring and self-evaluating their progress at this early stage of first year in order to stream tutorial groups on a voluntary basis.
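The wrapper idea can be sketched with the standard javax.tools compiler interface, as below. This is an illustration under stated assumptions rather than the planned implementation: the class FragmentChecker, the shape of the wrapper (a single method returning int) and the decision to simply count errors are all hypothetical, and how errors would translate into marks is left open. The sketch compiles the code but never runs it, and it needs a full JDK, since a plain runtime environment does not provide a system compiler.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Hypothetical checker: wraps a student's fragment in a class and compiles it. */
class FragmentChecker {

    /** Holds the wrapped source in memory so it need not be written to disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns the number of compiler errors found in the wrapped fragment. */
    static long countErrors(String fragment) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system compiler: run on a JDK");
        }
        // Illustrative wrapper: the fragment becomes the body of a method returning int.
        String source = "class Wrapper {\n  static int answer() {\n"
                + fragment + "\n  }\n}\n";
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        try {
            // Send any generated class files to a temporary directory.
            String outDir = Files.createTempDirectory("fragment").toString();
            compiler.getTask(null, null, diagnostics, List.of("-d", outDir), null,
                    List.of(new StringSource("Wrapper", source))).call();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return diagnostics.getDiagnostics().stream()
                .filter(d -> d.getKind() == Diagnostic.Kind.ERROR)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(countErrors("return 1 + 2;"));             // compiles cleanly
        System.out.println(countErrors("int x = \"two\"; return x;")); // type error
    }
}
```

Each collected diagnostic also carries a message and a line number, which is what a marking scheme based on the type of error would need to inspect. A fragment containing unbalanced braces could escape the wrapper, so a real system would have to validate its input more carefully than this sketch does.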
Further to our initial code marking aims, we also plan to implement a degree of peer marking, as seen in the OASYS project, further enabling a decrease in staff time spent marking whilst avoiding the necessity to alter the method of questioning students.
Regarding question types, early in our project we had discussions with Charles Alderson and Graeme Hughes, Linguistics, regarding their previous TQEF project and the DIALANG project [Alderson 02]. A particularly interesting outcome from this was a demo CD that had been produced by a summer student preceding the DIALANG project as a “proof of concept”. This contained several interesting alternative question types, which we will revisit with a view to adding them to our system.
In addition the system must also be extended to allow staff to add new questions, and later to keep a record of students’ marks as they complete the tests. New questions will be added through a web page and updated using PHP (a web based technology which enables web pages to react to the decisions of the user) once an acceptable format for the form has been established based on the question types we wish to have available to us, something which we must discuss at the very least within our own department.
Useful References
[Alderson 02] Ling131 and DIALANG, Charles Alderson, Linguistics, Lancaster University. 2002. http://www.ling.lancs.ac.uk/staff/charles/charles.htm
[CAA 02] CAA (Computer Assisted Assessment) Centre, 2002. http://www.caacentre.ac.uk/
[Chandler 02] The CSC110 Online Assessment, Angie Chandler and Lynne Blair, Computing Department, Lancaster University. 2002.
http://www.comp.lancs.ac.uk/computing/users/angie/pastpapers/pastpapers.htm
[CVU 95] Clyde Virtual University (CVU) Assessment Engine. 1995. CVU
[Dalziel 98] Using WebMCQ. James Dalziel, Department of Psychology, University of Sydney. 1998. http://www.usyd.edu.au/su/ctl/Synergy/Synergy9/jdalziel.htm
[Guerts 93] An Alternative Way of Using Multiple Choice Questions. Frans Guerts, Agrotechnology and Food Sciences, Wageningen University. 1993. http://www.lboro.ac.uk/service/ltd/flicaa/conf2001/pdfs/o2.pdf
[Hinett 99] Staff Guide to Self and Peer Assessment. Karen Hinett and Judith Thomas (Eds), 1999. Oxford Centre for Staff and Learning Development.
[Hooper 75] Computer Assisted Learning in the UK. Ed Richard Hooper and Ingrid Toye. Council for Educational Technology. 1975.
[Hopkins 98] Web + Qmark + Humanities = ? A Case Study. Chris Hopkins. English, School of Cultural Studies, Sheffield Hallam University. 1998. http://info.ox.ac.uk/ctitext/publish/comtxt/ct16-17/hopkins.html
[Looms 02] Survey of Course and Test Delivery/Management Systems for Distance Learning. Thelma Looms. 2002.
[MathWise 02] The UK Mathematics Courseware Consortium: MathWise. 2002. http://www.bham.ac.uk/mathwise/
[Platt 00] Assessment Strategies and Standards in Sociology: Self and Peer Assessment for Sociology? Jennifer Platt, University of Sussex. 2000.
[Postle 00] Self and Peer Assessment Process for Existing Practitioners. Denis Postle, “Leonard Piper” Group, Independent Practitioners Network (IPN). 2000.
[Proctor 96] Computer Based Assessment: A Case Study in Geography. Amanda Proctor and Daniel Donoghue, Department of Geography, University of Durham. 1996.
[QuestionMark 02] QuestionMark. 2002. http://www.questionmark.com/uk/home.htm
[Questionnaires 02] Student questionnaire responses to [Chandler 02].
http://htmlgear.lycos.com/guest/control.guest?u=mainssocket19&i=10&a=view
[Thelwall 00] Wolverhampton University Computer Based Assessment Project. M. Thelwall. 2000. http://cba.scit.wlv.ac.uk/home.htm
[Thorley 94] Using Group Based Learning in Higher Education. Ed Lin Thorley and Roy Gregory. 1994.
[Topping 98] Peer-Assisted Learning. Keith Topping and Stewart Ehly (Eds), 1998. Lawrence Erlbaum Associates.
[Ulirchs 98] Computer Based Assessment Using CLASS in Intermediate Physics. Juris Ulirchs and Ramila Amirikas, School of Physics, University of Sydney. 1998. http://www.physics.usyd.edu.au/uniserve/ulrichs.html
[Underwood 94] Computer Based Learning: Potential with Practice. Jean D.M. Underwood. 1994.
[Ward 00] OASYS profile. Ashley Ward, Abhir Bhalerao, Department of Computer Science, University of Warwick. 2000. http://www.dcs.warwick.ac.uk/~ashley/Research/OASYS/
[Zariski 96] Self and Peer Assessment as a Means of Teaching Professional Responsibility. Archie Zariski, School of Law, Murdoch University. E Law – Murdoch University Electronic Journal of Law and presented at “Are there innovative ways to teach professional responsibility?” Sydney, 1996.
Appendix 1 The questions as they would appear on paper (this is in pdf format).
