
Winter 2005

The Assessment agenda

Modelling super assessment

The diagnostic information obtained from common-scale assessments is enhanced by using criterion-referenced marking. JANELLE HO analyses one school’s results from the 2004 Australasian Schools Writing Competition to demonstrate the power and flexibility of this assessment model.

In most schools, teachers set their own assessments and marking schemes/scales for their classes. These assessments also tend to be marked by norm-referencing. This means that a student’s achievement is assessed in terms of their performance relative to others in their class, year level or age group.

While teacher/class-based assessment allows teachers to cater to the specific needs of their classes, the results cannot be used for comparative analysis with students who did not sit that particular assessment. However, using a common task and a common marking scale allows the results of two students, year levels or schools to be compared directly. Valid diagnostic information can therefore be obtained to inform whole school teaching and learning programs.

The quality of information obtained from common-scale assessment is further enhanced when criterion-referenced marking is used. This means that each piece of work is marked according to each and every criterion listed and described. Every student, regardless of age and ability, is marked according to these criteria.

The Australasian Schools Writing Competition is a common-scale assessment which uses criterion-referenced marking. The year 7 to year 10 results of one school which participated in the 2004 Australasian Schools Writing Competition are used to exemplify the power of this assessment model.

Advantages of a common marking scale

The 2004 Australasian Schools Writing Competition assessed recount in the form of a newspaper article. The mean scores in each criterion for each year level are shown in Table 1.

A closer look at the information provided in Table 1 is enlightening. In 12 criteria, the year 8 students did not perform as well as the year 7s. This raises several questions: Is this dip in performance evident across other KLAs/subject areas? Would interdisciplinary work, study strategies or home partnerships help? Is the dip confined to this specific text type? Had insufficient time been set aside to teach that text type? How can resources be reallocated to help this year level improve?

All is not doom and gloom though. The results also indicate a significant improvement from year 8 to year 10. This is particularly obvious in the criteria related to text type, such as genre, introduction, affective language and descriptive language. The questions asked previously can therefore be reversed to understand the improvements. Can the improvement be attributed to better programming, better teaching and learning strategies or increased familiarity with the text type? How can the school’s curriculum programming be modified to incorporate these positive elements?
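The kind of comparison described above, mean score per criterion per year level followed by a look at where scores dip or improve, can be sketched in a few lines of Python. This is a minimal illustration only: the records, the function names and the scores are invented for the example and are not the school’s results or the competition’s actual analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (year level, criterion, score). Invented for illustration.
scores = [
    (7, "genre", 3), (7, "genre", 2),
    (8, "genre", 2), (8, "genre", 1),
    (9, "genre", 3), (9, "genre", 3),
    (7, "mode", 1), (7, "mode", 1),
    (8, "mode", 1), (8, "mode", 1),
    (9, "mode", 1), (9, "mode", 0),
]


def mean_by_year_and_criterion(records):
    """Return {criterion: {year level: mean score}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for year, criterion, score in records:
        grouped[criterion][year].append(score)
    return {
        criterion: {year: mean(vals) for year, vals in sorted(by_year.items())}
        for criterion, by_year in grouped.items()
    }


def year_to_year_changes(means):
    """Return {criterion: [(from_year, to_year, change in mean), ...]}."""
    changes = {}
    for criterion, by_year in means.items():
        years = sorted(by_year)
        changes[criterion] = [
            (a, b, by_year[b] - by_year[a]) for a, b in zip(years, years[1:])
        ]
    return changes


means = mean_by_year_and_criterion(scores)
changes = year_to_year_changes(means)

# Criteria where a year level scored lower on average than the one before it.
dips = [(c, a, b) for c, steps in changes.items() for a, b, d in steps if d < 0]
print(dips)  # [('genre', 7, 8), ('mode', 8, 9)]
```

A list of dips like this only points to where questions should be asked; it does not say why a year level scored lower.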

[Table 1: Mean scores in each criterion for each year level]

Focusing on specific criteria

Focusing on specific criteria highlights another interesting trend—there is little improvement across year levels in some criteria such as mode, clause pattern, prepositions and articles/plurals. It is probably not desirable for more than 5% of students to score 0 in mode, which looks at students’ ability to maintain language and features appropriate to a written text. Are students using inappropriate features, such as text messages or spoken-like constructions, in their writing?
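A check of this kind, the share of students scoring 0 in a criterion compared against the 5% level mentioned above, might look like the following sketch. The score lists and names are hypothetical and invented purely for illustration.

```python
def zero_score_rate(scores):
    """Fraction of scores equal to 0 (0.0 for an empty list)."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s == 0) / len(scores)


# Hypothetical 'mode' scores by year level, invented for illustration.
mode_scores = {
    7: [1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    8: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    9: [0, 1, 1, 0, 1, 1, 1, 1, 1, 1],
}

THRESHOLD = 0.05  # the 5% level mentioned in the text

# Year levels where more than 5% of students scored 0.
flagged = {
    year: zero_score_rate(s)
    for year, s in mode_scores.items()
    if zero_score_rate(s) > THRESHOLD
}
print(flagged)  # {7: 0.1, 9: 0.2}
```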

In clause pattern, the slight decline in scores in the upper year levels may indicate that students are attempting more truncated sentences or more complex constructions but are not succeeding. It should be noted that excessive use of truncated sentences in a newspaper article is inappropriate. Have students done so?

A high percentage of ESL students can affect the score for prepositions and articles/plurals. The way prepositions are used is changing rapidly and ESL students find this grammatical element especially difficult. Similarly, the distinction between the two indefinite articles and between the indefinite and definite articles may be unclear. The results seem to indicate that these grammatical elements require specific teaching across all year levels.

Looking at individual student scores

The common marking scale also enables any two students’ results to be compared. Using the sample results in Table 2, if a year 8 student achieved a score of 35, it is possible to say that not only is his performance better than average for his year level, he has also performed better than the average year 9 and 10 student. Diagnostically, this information may be used to better group students for more focused teaching and learning activities.
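Because every student is marked on the same scale, one student’s total can be set against the mean of any year level. The sketch below shows the idea; the year-level means are invented for illustration and are not the values in Table 2, and the function name is an assumption.

```python
# Hypothetical mean total scores by year level, invented for illustration.
# These are NOT the values reported in Table 2.
year_means = {7: 30.0, 8: 29.0, 9: 32.5, 10: 34.0}


def year_levels_outperformed(score, means):
    """Return the year levels whose mean total score is below the given score."""
    return sorted(year for year, m in means.items() if score > m)


# A year 8 student with a total of 35 on the common scale.
print(year_levels_outperformed(35, year_means))  # [7, 8, 9, 10]
```

The comparison is only meaningful because the task and the marking scale are common to all the year levels involved.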

[Table 2: Sample results]

Constructing a common task

Now that we have seen the power of a common task marked on a common scale, we can turn our attention to how to construct common tasks and common scales.

In constructing a common task, it is important to bear in mind the differing interests of various age groups. Obviously, what interests a year 4 student is unlikely to interest a year 8 student. To reduce the number of tasks that have to be set, identical tasks may be set across some year levels or within a key stage. What teachers need to decide on is a genre/text type that would be suitable as an assessment task (as opposed to a teaching task). While the stimulus material may be different due to differing student interests, each task has to enable students to demonstrate the qualities of the genre/text type that will be assessed. Instructions and task wording should be kept as similar as possible, although vocabulary and complexity of sentence structure may be adjusted to suit the age group. Such differences should be minimal.

Constructing a common marking scale

The first step in constructing a common marking scale is for teachers to read examples of differing quality of the text type to be assessed, ranging from student samples to works by professional writers. These examples will give information about the text structure and the type of descriptive, literary and/or technical language that is used. For example, would affective language, technical language or figurative language need to be assessed separately? What about persuasive or rhetorical devices? Should formatting/layout or register/mode be assessed? The examples will also guide teachers on the grammatical and syntactical features to assess. Teachers might wish to assess certain features no matter what the text type is; for example, sentence structure and clause pattern. But if the text type usually leads to work written in the present tense, then tense may not need to be assessed. Similarly, if the text is usually written in the past tense, then agreement may not need to be assessed.

After deciding on the criteria, the range of scores in each criterion needs to be fixed. Although it may seem like a good idea to have as many scores as possible, issues such as marker fatigue, marker ability to distinguish between scores and marking time need to be considered. What is important is to ensure that even the weakest students can ‘get on the scale’ and that the best students are sufficiently recognised and rewarded.

Marker training

Marker training is important in order to achieve consistency. If the school is small enough, it may be possible for the team of teachers who constructed the task to also mark, thus ensuring a high level of reliability.

Before ‘live marking’ begins, markers should have some scripts for practice. These scripts may be examples used in test construction or photocopies of actual student tests. Markers discuss the scores they have given and fine-tune their judgements. Once live marking is under way, periodic collaborative marking or check-marking should confirm that markers are maintaining consistency.
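One simple way to look at consistency during practice or check-marking is to have two markers score the same scripts and count how often they agree. The sketch below is an illustration under stated assumptions rather than a prescribed procedure: the scores are invented, and the choice of exact agreement plus agreement within one score point is an assumption, not something the competition specifies.

```python
def agreement_rates(marker_a, marker_b, tolerance=1):
    """Return (exact agreement rate, rate of agreement within `tolerance` points).

    Both lists hold scores for the same scripts, in the same order.
    """
    if len(marker_a) != len(marker_b):
        raise ValueError("Both markers must score the same number of scripts")
    if not marker_a:
        raise ValueError("At least one script is required")
    diffs = [abs(a - b) for a, b in zip(marker_a, marker_b)]
    n = len(diffs)
    exact = sum(d == 0 for d in diffs) / n
    within = sum(d <= tolerance for d in diffs) / n
    return exact, within


# Hypothetical scores given by two markers to the same eight practice scripts.
marker_a = [3, 2, 4, 1, 0, 3, 2, 2]
marker_b = [3, 2, 3, 1, 2, 3, 2, 1]

print(agreement_rates(marker_a, marker_b))  # (0.625, 0.875)
```

Scripts on which the markers differ by more than the tolerance are natural candidates for discussion.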

Conclusion

Setting a common assessment task with a common marking scale can be daunting initially; however, practice makes perfect and eventually a bank of resources is created for future use. Most importantly, this form of assessment, when used in combination with what teachers already do in their classrooms, can inform curriculum programming for the whole school.

Janelle Ho is the assessment officer for Writing and Spelling in Educational Assessment Australia, UNSW.
