Double blind marking
Offprint from Teaching Political Philosophy, a journal for lecturers in universities and institutes of higher education.
Published by Blackhell Publishers, Oxford © 2000
2000 vol. 34 no. 1 - Jan. 2000 ISSN 0361-1853
Marking papers is an inevitable but time-consuming part of the academic political philosopher's workload. Over the past three years, a research group at the University of Nijmegen has developed an innovative method for marking papers, particularly fit for groups of one hundred and more papers. Double blind marking (henceforth: DBM) promises to be an important breakthrough in the field.
In the first year programme at the Faculty of Policy Sciences of the University of Nijmegen, a general introduction to philosophy is taught for 300 to 500 students. As part of this course, students are required to write five papers, four of which should be the product of teamwork. The fifth is an individually produced paper of approximately five pages, in which the student applies the methods and ideas of an important political philosopher or philosophical school to a current social problem or issue. The grades for papers can theoretically vary from 0 to 10 (excellent); in practice, only 4-9 occur. The passing mark is 6. Experience has shown that grades below 4 discourage students and lead to a decrease in student numbers. A 10 is traditionally reserved for the hypothetical perfect student. After the results have been announced, students get an opportunity to discuss their paper with the professor and/or corrector, following which they have two weeks to file a complaint if they are not satisfied. The professor or corrector is then obliged to re-examine the paper and if necessary adjust the grade.
In 1998, the course was evaluated and several problems relating to the individual paper were identified. It was discovered, first of all, that students did not expect their papers to be corrected by one person only (as happens to be the case). On the basis of this expectation, an average of 10 per cent estimated the chances of fraud being detected as low enough to take the risk - a number that does not include students who were not caught. Secondly, correcting these papers within the two weeks required by the exam protocol put an unhealthy stress on the corrector. In addition, the stress involved and the risk of loss of concentration on the part of the corrector endangered the fairness of the correction procedure as well as that of the resulting marks. A 1998 Faculty policy aimed at stimulating the use of ICT facilities by students added a fourth problem. Since a large number of students now sent in their papers by e-mail, a disproportionate amount of time was lost on printing them so as to be able to compare them to papers that were handed in the traditional way.
In preparation for measures aimed at detecting and discouraging fraud, an informal analysis was made of the methods used by students to reduce their effective workload. The analysis indicated that traditional methods were increasingly giving way to advanced and far less easily detectable modern methods. Traditional methods included straightforward copying of parts of, or even complete, papers written by other students, complete copying of papers written by students in the previous year, plagiarism using course material, and plagiarism with the help of other easily available material (library books, encyclopaedias, newspapers and journals). A recent but still basically traditional method consisted in copying different sections from different students' papers and adding a brief personal conclusion. Modern fraud methods all make use of the internet, e.g. exchange of papers between different universities, unacknowledged quoting from (mostly English) academic lectures and other course material, and plagiarism using non-academic literature on the Web.
The traditional protocol for the detection of fraud demanded that all papers have a similar structure. The corrector then first collected the papers, ordered them by philosopher and subject, and then quickly scanned the texts one by one for signs of fraud. The new methods used by students, as well as the recent obligation to accept papers sent in by e-mail, made the old protocol obsolete. Any ordinary revision of the protocol would, however, imply that still more time and resources would have to be spent on fraud detection. The stress due to time pressure as well as the increased need for continuous concentration would, moreover, threaten the fairness of the procedure.
It was therefore decided to uncouple fraud detection and marking, and to start an experiment in fraud detection. In 2000 and 2001, teaching assistants will be hired to take care of fraud detection and to describe and analyse fraud methods. A report based on their experiences will be prepared and evaluated in 2001, when further decisions on the improvement of the fraud detection protocol will be taken. To stimulate their activities, the teaching assistants will be paid on a profit basis, that is, they will be rewarded for each documented and legally sound case of fraud they detect. To ensure their safety, their anonymity will be guaranteed.
Even disregarding the time spent on fraud detection, marking 300 to 500 individual papers takes up precious time and puts too much pressure on the corrector to warrant fairness in marking. To deal with these problems, a first version of DBM was introduced in 1999. The DBM procedure consists of two steps: creating a basis for comparison, and grading.
1. Creating a basis for comparison
To create a stable basis of expectation both for students and teacher, we first determined the distribution of marks given to individual papers in the preceding two years (1997-8) (see Table 1 below).
Next, we distributed marks in proportion to their frequency randomly in 66 columns and 124 rows over an A3-size paper (cf. Table 2), thus creating the DBM Grading Table.
2. Grading
Grading is performed for each individual paper by taking a pencil in hand, closing both eyes (hence: double blind marking) and selecting a cell from the DBM Grading Table. The paper is then marked accordingly, and the procedure is repeated for the next paper until the last has been marked.
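The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure as it was actually carried out: the shares in `HYPOTHETICAL_SHARES` are invented placeholders (they are not the figures of Table 1), the function names are our own, and a pseudo-random number generator stands in for the pencil and the closed eyes.

```python
import random

COLUMNS, ROWS = 66, 124  # dimensions of the DBM Grading Table

# Hypothetical placeholder shares per mark; NOT the Table 1 figures.
HYPOTHETICAL_SHARES = {4: 0.05, 5: 0.20, 6: 0.35, 7: 0.25, 8: 0.10, 9: 0.05}


def build_grading_table(shares, rng):
    """Step 1: fill ROWS x COLUMNS cells with marks in proportion to shares."""
    total_cells = COLUMNS * ROWS
    cells = []
    for mark, share in shares.items():
        cells.extend([mark] * round(share * total_cells))
    # Rounding can leave the list slightly short or long: pad with the
    # most frequent mark, or cut off the excess, so the table is full.
    most_common = max(shares, key=shares.get)
    while len(cells) < total_cells:
        cells.append(most_common)
    del cells[total_cells:]
    rng.shuffle(cells)  # random placement over the sheet
    return [cells[r * COLUMNS:(r + 1) * COLUMNS] for r in range(ROWS)]


def mark_paper(table, rng):
    """Step 2: select one cell blindly and return the mark found there."""
    row = rng.randrange(ROWS)
    col = rng.randrange(COLUMNS)
    return table[row][col]


rng = random.Random(1999)  # fixed seed only to make the sketch repeatable
table = build_grading_table(HYPOTHETICAL_SHARES, rng)
marks = [mark_paper(table, rng) for _ in range(300)]  # e.g. 300 papers
```

Because every cell is equally likely to be selected, the marks handed out follow the proportions of the table in the long run, whatever the content of the papers.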
DBM turns out to have several advantages. For one, it improves the quality of teaching. When first implementing DBM, we expected the number of complaints from students about their marks to remain the same as in the preceding years: the average mark and standard deviation remained unchanged, hence as many students were benefited and disadvantaged as before. This expectation was confirmed by the 1999 experience. However, since some complaints had to be recognised as justified, the average mark increased slightly as compared to 1997-8. Since neither the course material, nor the course itself, nor the average intelligence of students had changed, this clearly indicated that the quality of teaching itself had increased.
Table 1: Grading distribution in 1997-8
| Grade | % of students with this grade |
Table 2: Representative cut-out of the DBM Grading Table
A second advantage of DBM is its procedural fairness (cf. Rawls 1973). It not only offers a stable basis of expectation for students in the form of a fixed Grading Table and a savings principle-like element of beneficence to future generations of students, but also and foremost guarantees fair equality of opportunity. In accordance with Rawls' theory of justice and with currently prevailing academic theories on teaching and education, DBM eliminates two major forms of unfairness often hampering grading systems. For one, it does away with the influence of differences between individuals in natural (in)capacities like intelligence and inborn lethargy, differences that Rawls deemed 'morally arbitrary'. In addition, DBM compensates for the social inequalities that are often attached to or acquired because of these natural inequalities, such as a lack of interest in intellectual affairs, the risk of being illiterate, or different priority schemes as regards study, work, sexual education and carousing.
The 1999 evaluation of the first trial of DBM showed that the system still had some shortcomings. For one, marking was still performed by hand. Particularly after 30 minutes, the corrector's hand (the corrector being right-handed) began to show an undesirable inclination to single out marks from the right centre area of the Grading Table. Although marks are distributed fairly, i.e. randomly, over the table as a whole, the same cannot be guaranteed for every part of it. Hence, the fairness we aimed to secure could not be guaranteed to an acceptable degree. Moreover, the corrector's hand grew tired, an ergonomically undesirable side effect of DBM. Finally, next to the marking process itself, the number of complaints was now recognised as a distinct source of stress and time loss.
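The point about parts of the table can be made concrete with a small comparison. The sketch below is an illustration only: the shares are invented placeholders (not the figures of Table 1), each cell is drawn independently as a simplification, and the position and size of the "right centre" patch are our own assumptions. It prints, for each mark, its share in the whole table next to its share in the patch; the two will generally differ.

```python
import random
from collections import Counter

COLUMNS, ROWS = 66, 124  # dimensions of the DBM Grading Table

# Hypothetical placeholder shares per mark; NOT the Table 1 figures.
HYPOTHETICAL_SHARES = {4: 0.05, 5: 0.20, 6: 0.35, 7: 0.25, 8: 0.10, 9: 0.05}

rng = random.Random(1999)  # fixed seed only to make the sketch repeatable
mark_values = list(HYPOTHETICAL_SHARES)
weights = list(HYPOTHETICAL_SHARES.values())
# Simplification: every cell is drawn independently with these weights.
table = [rng.choices(mark_values, weights=weights, k=COLUMNS)
         for _ in range(ROWS)]


def shares_in(table, rows, cols):
    """Relative frequency of each mark within the given rows and columns."""
    counts = Counter(table[r][c] for r in rows for c in cols)
    total = sum(counts.values())
    return {mark: counts[mark] / total for mark in sorted(counts)}


whole = shares_in(table, range(ROWS), range(COLUMNS))
# Assumed "right centre" patch: 10 x 10 cells to the right of the middle.
patch = shares_in(table, range(57, 67), range(45, 55))

for mark in whole:
    print(mark, round(whole[mark], 3), round(patch.get(mark, 0.0), 3))
```

A hand that drifts towards one such patch therefore hands out marks according to the patch's proportions rather than those of the table as a whole.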
To deal with the latter problem, the complaints protocol has been revised. The change was inspired by Immanuel Kant's critique of Roman political ethics, in particular the si fecisti, nega rule (Kant 1919: 42; cf. Lock and De Vree 1992). The corrector, confronted with a complaining student, now immediately looks at the original paper and determines that a second corrector corrected it. In reality, the assistance of another corrector has become superfluous.
Rather than patching up DBM by e.g. changing hands, adding a second corrector or reducing the proportion of 5s in the table (since these give rise to most complaints), we decided to call in the help of colleagues in the Methodology Department. In co-operation with prof. dr. Bert Felling and dr. Theo van der Weegen, we developed a computer program (operational under Windows 98 and higher) called DBM-2. The program will be used for the first time during the 2000 round of corrections. Since patent applications for DBM-2 are still pending, we cannot discuss details of the program here. Suffice it to say that the input for the program consists of student registration numbers, that is, only the numbers of those students who sent or handed in their papers on time. The program then automatically assigns a mark to each student registration number. The marks are still chosen randomly, but now in proportion to last year's results which, as stated before, are on average a fraction higher than those for 1997-8. In brief, complete procedural fairness in marking is now ensured, and we expect that the average mark will rise annually by a fraction, showing a continuous improvement in teaching quality.
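Without touching on the undisclosed details of DBM-2, the input and output described above can be illustrated as follows. This is a sketch of the general idea only, not the DBM-2 program: the function name `assign_marks`, the registration numbers and the shares standing in for last year's results are all hypothetical.

```python
import random


def assign_marks(registration_numbers, last_year_shares, rng):
    """Assign each registration number a mark drawn at random,
    in proportion to the given shares of last year's marks."""
    mark_values = list(last_year_shares)
    weights = [last_year_shares[mark] for mark in mark_values]
    return {
        number: rng.choices(mark_values, weights=weights, k=1)[0]
        for number in registration_numbers
    }


# Hypothetical stand-in for last year's results; not actual figures.
hypothetical_last_year = {4: 0.05, 5: 0.19, 6: 0.35, 7: 0.26, 8: 0.10, 9: 0.05}

# Hypothetical registration numbers of students who handed in on time.
on_time = ["s0000001", "s0000002", "s0000003", "s0000004"]

rng = random.Random(2000)  # fixed seed only to make the sketch repeatable
results = assign_marks(on_time, hypothetical_last_year, rng)

for number, mark in results.items():
    print(number, mark)
```

Note that the papers themselves do not appear anywhere in the input, which is consistent with the description above: only the registration numbers are needed.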
Experiences with DBM-2 and the new complaints protocol will be collected and analysed in the summer of 2000. The results will be reported afterwards in this journal. The program DBM-2 will at that time be updated and if necessary improved; DBM-2.1 is expected to be available for purchase by universities and other institutes of higher learning by early 2001.
In marking individual philosophy papers for groups larger than 100 students, traditional procedures tend to fail to adequately discourage and detect fraud. More importantly, the traditional procedure is time-consuming, it causes stress, and fair marking cannot be guaranteed. Given the constraints of a grading system that excludes grades below 4 and reserves 10 for God (the University in question being a Catholic university) or the professor (most staff being non-Catholics), DBM and privatised fraud detection together seem to offer a propitious perspective for the future. We predict that DBM-2 will save an unprecedented amount of time and meet the highest standards of fairness in both philosophy and the educational sciences.
- Rawls, J. (1973) - A Theory of Justice. Oxford: Oxford University Press.
- Kant, I. (1919) - Zum Ewigen Frieden. Leipzig: Felix Meiner.
- Lock, G.E., and G.K. de Vree (1992) - 'De Bolheid van het Niets', Acta Politica, vol. 27, pp. 35-52.
- Brian Harry Teaching with gust
- Grahame Lock Double blind marking of political philosophy papers for large groups
- Robert H. Lieshout Popper for Dummies: A guide for the perplexed
- Kees van Kersbergen The long and winding road: Teaching political philosophy in political science courses
- Jos de Beus How to confuse students and still impress them
- Phony Giddens Why only sociology matters
- Herman van den Bosch Teaching in native tongue better then English
Available from Blackhell Publishers, Oxford