
Winter 2005

The Assessment agenda

The grading master: a simpler way

Where do outcomes statements end and assessment criteria begin? What is the real purpose of your assessment task? Where does the line fall between formative and summative assessment? GABRIELLE MATTERS comes up with answers to some big questions en route to proposing an alternative basis for criteria/standards-based assessment.

TEACHERS ARE ASSESSING every day of the week. They are making decisions about what and how to teach. They are making judgements about the nature of student learning and the quality of student work. It has always been thus. But now there is an emphasis on teacher-devised tasks and authentic assessment.

There is also increasing acceptance of the view that outcomes statements in KLA syllabuses are not of themselves assessment criteria. Ainley (2004) notes, ‘in much of Australia there is little assessment below the senior years that provides teachers with information about standards that they might expect from their students’. Parallel to teacher-devised authentic assessment, then, is the need for standards-based assessment.

It does not follow, however, that assessment per se is complicated.

A teacher’s assessment life must be complicated (not)

What teachers need is a simple structure for expressing assessment criteria and performance standards. This article concludes with a description of a grading master, a variant of the traditional criteria/standards matrix, as a model for grading student performance. It has been applied in a system that has teachers judging the standard attained by students on multifaceted tasks, but it could also be applied to more structured assessments in place of conventional marking schemes or rubrics.

This approach does not value any one purpose of assessment over another.

Purpose is everything (not)

You will hear people comparing and contrasting formative and summative assessment until you become so paralysed with fear about the real purpose of your assessment task that you don’t even begin the process of designing one, much less working out how to grade it.

Formative assessment occurs when assessment, whether formal (eg testing), or informal (eg classroom questioning), is primarily intended for, and instrumental in, helping a student attain a higher level of performance. Formative assessment occurs prior to summative assessment; its purpose is partly to guide future learning for the student.

Summative assessment occurs when assessment is designed to indicate the achievement status or level of performance attained by a student at the end of a course of study or period of time. It is geared towards reporting or certification.

There is no necessary distinction between formative and summative assessment in their content or conditions. Assessment results may be used for a variety of purposes, the most productive being the promotion of further learning. It does not necessarily follow, however, that we should turn our backs on accountability.

Effective classroom assessment improves teaching and learning. Standardised assessment tasks improve the comparability of results reported to parents, and the system, about the quality of the learning that has occurred, and under what conditions the evidence of the learning was gathered. Effective classroom assessment can deliver comparability if appropriate standards assurance processes are in place; and standardised assessment can improve teaching and learning if teachers and administrators take advantage of the information that the data have to offer.

Assessment is a lever (so what?)

The hallmark of effective assessment is that it illuminates learning that is (or has been) occurring, and promotes further learning. Assessment alone may be unable to enrich the learning experience; nonetheless, if carefully constructed and applied, it may well support and reinforce enrichment in powerful and positive ways.

Assessment always has backwash effects on the curriculum; some would even say that assessment drives the curriculum. The discussion to be had is about the nature and extent of the curriculum backwash effects, not about their existence. Recent research at the Queensland Department of Education underlines the extent to which assessment against common standards, applied in the same way across schools, ensures that students are actually exposed to the intended curriculum, and that teachers and students receive powerful messages about what really counts.

Because we simply do not have the option of denying the centrality of assessment, we need to ensure that it is well thought-out, comprehensive, comprehensible, reliable and valid, all of which add up to ‘good’. Assessment may be good or bad, but it can never be absent.

A sample grading master

Grading masters allow assessors to describe their judgements as a vicinity, or sub-range, of the continuum, which can be positioned between standards descriptors or even across grade demarcations (which in grading masters themselves no longer need be sharp). This means that assessors are not forced to consign a piece of work to a cell when their judgement is that no cell descriptor adequately matches the work. Similarly, the standards descriptors in grading masters are themselves presented as vicinities, which gives due recognition to the fact that even the best-written generalised standards, especially of high-quality performance, can never be sharply defined or communicated with absolute precision. This allows assessors, once they have understood the broad intents of the descriptors, to focus more on the quality of the work they are assessing than on the precise meaning of the standards descriptors to which student work must be matched.

[Figure 1: A sample grading master]

There is only one format for a criteria/standards approach (not)

Nearly 20 years after his seminal paper on specifying and promulgating achievement standards, Sadler (2003) recently pointed out particular dangers in this modern era of standards writing—atomism, matrices and arm-chairing. Instances of these dangers are, respectively, fine-grained outcomes statements; criteria/standards schema with lots of cells containing superfluous words; and wise people in leather chairs sitting back and deciding what should matter—that is, be rewarded in student work, either at the task level or at the overarching level for reporting results.

The new grading master model contains none of the above. It does, however, contain features necessary to support the nature of complex, multifaceted tasks that assess multiple knowledges, understandings, skills and dispositions. Other assessment models, such as impression marking, or models using detailed rubrics, analytic marking and mechanical combination rules, would be counterproductive in that their application would tend to reduce the multidimensionality of complex tasks.

Criteria/standards schema

The traditional criteria/standards matrix serves to fulfil the broad intents of a criteria/standards approach (provided the distinctions drawn are not just pass/fail or competent/not competent); for example, these matrices help ensure that assessment is criteria-based rather than norm-based, and that standards for the various grades are explicit and transparent. In the absence of viable alternative devices, the use of the criteria/standards matrix has become virtually synonymous with the taking of a criteria-based approach.

In designing and using traditional criteria/standards matrices, assessors have, however, had to grapple with the often untoward implications of certain covert assumptions built into the matrix format itself (or fostered when assessors apply that format), but which are not foundational to a criteria/standards approach. Two examples follow.

One: The format of the traditional matrix requires that the number of significant and discernible differences used in judging quality be the same for all criteria. This can result in assessors expending effort on ‘manufacturing’ distinctions in quality where real distinctions do not exist, thus obfuscating standards, biasing grades and making discussion of standards more difficult.

Two: Traditional formats require that the ‘quantum’ of achievement between adjacent standards descriptors is also the same, or thereabouts. Not only must assessors write standards descriptors for the required number of distinctions, but they also risk biasing results if their standards descriptors do not have this quantum property.

In summary, the simplicity of the matrix format can disguise real difficulties and complexities in its design and use.

Grading master

An alternative device, known as a grading master, uses a format that at first appears more complex than a matrix but is much simpler to design and use. Grading masters aim to remove the unnecessary, and often counterproductive, assumptions of the matrix format, while still maintaining the roles matrices play in fulfilling the broad intents of a criteria/standards approach. By positioning standards descriptors, and then assessors’ judgements, along a continuum (‘pole’), rather than having them allocated to discrete cells, grading masters allow the number of standards, and their relative placements, to vary from criterion to criterion.

Teacher–assessors plot a student’s performance on each of the criteria on the corresponding pole before making an on-balance judgement to arrive at the overall grade. It is possible, also, to give feedback to students and report on performance on each pole separately.
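For readers who keep assessment records electronically, the structure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the grading master as used in Queensland: the 0-to-1 scale for each pole, the names Vicinity, Pole and profile, and the sample criteria and descriptors are all assumptions made for the example. The sketch records each pole with its own number of descriptors, stores both descriptors and judgements as sub-ranges, and reports which descriptors a judgement falls near. It deliberately does not compute an overall grade, since that remains the assessor's on-balance judgement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vicinity:
    """A sub-range of a pole, on an assumed 0.0 (lowest) to 1.0 (highest) scale."""
    low: float
    high: float

    def overlaps(self, other: "Vicinity") -> bool:
        # Two sub-ranges overlap when neither lies wholly above the other.
        return self.low <= other.high and other.low <= self.high


@dataclass
class Pole:
    """One criterion, with any number of standards descriptors placed along it."""
    criterion: str
    descriptors: dict[str, Vicinity]  # descriptor text -> where it sits on the pole

    def nearby_descriptors(self, judgement: Vicinity) -> list[str]:
        # Descriptors whose vicinity overlaps the assessor's judgement.
        return [text for text, v in self.descriptors.items() if v.overlaps(judgement)]


# Hypothetical criteria and descriptors; note the poles differ in
# how many descriptors they carry and where those descriptors sit.
poles = [
    Pole("Depth of understanding", {
        "recalls isolated facts": Vicinity(0.0, 0.3),
        "links ideas coherently": Vicinity(0.4, 0.7),
        "evaluates and extends ideas": Vicinity(0.8, 1.0),
    }),
    Pole("Communication", {
        "meaning is often unclear": Vicinity(0.0, 0.4),
        "meaning is clear and well organised": Vicinity(0.6, 1.0),
    }),
]

# Hypothetical judgements for one student, one sub-range per pole.
judgements = {
    "Depth of understanding": Vicinity(0.65, 0.85),  # spans two descriptors
    "Communication": Vicinity(0.45, 0.55),           # sits between descriptors
}


def profile(poles: list[Pole], judgements: dict[str, Vicinity]) -> dict[str, list[str]]:
    """Per-pole feedback only; the overall grade is left to on-balance judgement."""
    return {p.criterion: p.nearby_descriptors(judgements[p.criterion]) for p in poles}


for criterion, nearby in profile(poles, judgements).items():
    print(criterion, "->", nearby or "between descriptors")
```

In this example the first judgement straddles two descriptors and the second falls in the gap between descriptors, neither of which a cell-based matrix could record without forcing the work into a single cell.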

References

Ainley, J (2004). Evaluation of the New Basics Research Program, ACER, Melbourne.

Sadler, D R (2003). Address to Education Queensland’s Assessment and Reporting Framework Implementation Committee, Brisbane.

Gabrielle Matters is director, Assessment & New Basics, Qld Dept of Education and the Arts.
