Archived Information

Assessment of Student Performance April 1997

CHAPTER 4

Part 1

CROSS-CASE ANALYSIS 1:
CHARACTERISTICS OF PERFORMANCE ASSESSMENTS

Introduction to the Cross-Case Analyses

This chapter and the three following it look across the sample of assessments and schools included in this study to address our three overarching research questions:

Our 16 case studies generated rich data that allow us not only to provide preliminary answers to the research questions but also to formulate informed hypotheses about the practicality and usefulness of performance assessments. These hypotheses, in turn, may provide a framework for making policy decisions and for testing the usefulness and practicality of assessment reforms in a variety of educational contexts. In other words, our approach is inductive: through the analyses, we are able to draw some generalizations about the nature and outcomes of assessment reform, but these generalizations themselves must be tested in future research.

The cross-case analysis is divided into four chapters: Characteristics of Performance Assessments (Chapter 4); Facilitators and Barriers in Assessment Reform (Chapter 5); Teacher Appropriation of Performance Assessments (Chapter 6); and Impact of Performance Assessments on Teaching and Learning (Chapter 7). Before we embark upon the analysis, however, we must: (1) summarize our conceptualization of the relationships among performance assessments and assessment systems and identify the elements within the assessment system upon which the analyses focus; and (2) define some terms that are employed throughout the cross-case analyses and the concluding chapter of the report.

A Simple Schema of Assessments

Throughout the cross-case analyses, we employ several terms associated with assessments: assessment tasks, scoring methods, performance assessments, and performance assessment systems. Exhibit 4-1 illustrates the simple scheme through which we conceptualize performance assessments and assessment systems and organize the analyses. As is illustrated in the exhibit, we conceptualize performance assessments as composed of assessment tasks and scoring methods. Multiple performance assessments may be linked systematically to create a performance assessment system. A performance assessment system, however, may be only one component of a comprehensive assessment system, which would include both a performance assessment system and a variety of non-performance-based assessments, including, for instance, standardized, norm-referenced multiple-choice tests.

Therefore, Exhibit 4-1 identifies the two major assessment units considered in our analysis:

The distinctions between performance assessments and performance assessment systems are explained in more detail below, in the section entitled "Format of Performance Assessments and Performance Assessment Systems." For the most part, however, we do not examine performance assessment systems within the context of comprehensive assessment systems.

Conventions of Terminology in the Cross-Case Analyses

Throughout the four chapters of cross-case analysis, we will employ the terms "assessment" and "performance assessment" interchangeably, except in cases where use of the word "assessment" is clearly intended to encompass both performance and non-performance-based assessment techniques. "Assessment system" and "performance assessment system" are used interchangeably in a similar fashion. Finally, in some of the 16 cases included in this analysis, the assessment is, in essence, synonymous with the assessment system, and in those cases, use of the term "assessment system" refers to the assessment itself.

Organization of the Chapter

This chapter:

To accomplish these purposes, we begin by summarizing the features of the 16 performance assessment systems included in this study. We then group and analyze the characteristics of performance assessments according to three broad categories:

Although the purposes, formats, and technical characteristics of performance assessments are considered separately, it is important to recognize that they are interrelated and that the relationships among the three features determine the usefulness and value of assessments and of assessment systems within different educational contexts.

Summary of Performance Assessment System Characteristics

Exhibit 4-2 contains a summary of each of the performance assessment systems included in this study. The exhibit presents an overview of the factual information pertaining to the assessments illustrated in the case study summaries presented in Chapter 3. It highlights the following features of the performance assessments and performance assessment systems:

Because the characteristics of assessments in practice are greatly influenced by the level of educational authority at which they were initiated (state department of education, district office, or school), we structure the presentation of the key characteristics of performance assessments according to the level of initiation of the assessments.

Purposes of Performance Assessment Systems and Subject Areas Assessed

The sampled performance assessments and performance assessment systems are used for multiple purposes and in several subject areas. The purposes, to a large extent, depend upon the performance assessment system's level of initiation, the degree to which it is integrated into the educational system, and, in some cases, its stage of development.

Exhibit 4-3 tabulates the stated purposes of the assessments and assessment systems included in this study according to the level of initiation of the assessment or system. The purposes were articulated by the educators involved in the development and implementation of the assessments or assessment systems.

Note that multiple purposes are stated for some of the assessments and assessment systems. For example, the purposes of Kentucky's performance assessment system include informing and influencing instruction, monitoring student progress, and school accountability. These stated purposes are not mutually exclusive. For example, accountability assumes that student achievement can be adequately judged and monitored through the use of the performance assessment system. Furthermore, in practice, influencing instruction and aligning instructional practices with curriculum are not completely distinct functions, as both entail integrating assessment and instruction within the same pedagogical framework.

Most educational systems in our sample view informing and influencing instruction and curriculum as one of the most important purposes of performance assessments. In most cases, the initiators and supporters of performance assessments claim that the assessments influence and inform instruction, as they:

Such explanations of the value of performance assessments as a tool to influence instruction and curriculum resonate with the assumptions that underlie the assessment reform movement.

Because, as has been noted, the purposes of assessments and assessment systems often vary according to the level of initiation of the assessment, we now turn to discuss the purposes underlying assessment reform at each level of initiation, as experienced by educators and assessment reformers involved with the 16 performance assessment systems included in this study.

Purposes of National Level Assessments

The three national-level efforts to develop (or to help develop) performance assessments included in this study are the Coalition of Essential Schools (CES), the New Standards Project (NSP), and the College Board's Pacesetter Mathematics Program (Pacesetter). The purposes of the first two networks are to induce changes in curriculum and instruction by helping participant schools develop their own performance assessment systems. Hence, based on the assessment philosophy and guidelines of these two organizations, participant schools are developing their own performance assessments and performance assessment systems. By contrast, the purpose of the College Board's Pacesetter Mathematics Program is more specific: Pacesetter Math is designed to provide a rigorous course in high school mathematics for students who have completed coursework in Algebra 2. The College Board calls the math program a capstone course in high school mathematics. Also, though Pacesetter programs may be appropriate for use with a wide range of students, the College Board emphasizes using the programs to enroll minority and disadvantaged students in academically challenging courses. Districts and schools participating in the program implement the Pacesetter mathematics course, which integrates curriculum, instruction, and assessment.

The methods the three organizations employ to leverage changes in assessment systems and the support they provide to their member schools (and member districts and states as well) are quite different. CES provides general guidelines, research information, and many examples of the kinds of assessments schools may use to bring about desired changes in educational structures and pedagogy. On the other hand, NSP and Pacesetter provide specific guidelines for designing and using scoring rubrics and assessment tasks, and they organize assessment scoring activities and professional development sessions for their participants. In the case of Pacesetter, these activities clearly affect teachers' assessment practices, but the emphasis on assessment is subordinate to the overall program. NSP, however, has broader goals in terms of influencing assessment reform than either CES or Pacesetter. Because one of NSP's stated objectives is to develop a standard, national assessment system, NSP also is placing a premium on the technical properties, such as reliability and validity, of assessments and on systematic professional development.

Reflective of the differences in the three organizations' approaches to assessment development, the four schools in our study that participate in the national reform efforts develop and use different types of assessments. Cooper Middle School (the CES participant) has a unique assessment system, tailored completely to its own environment. Cooper teachers devise their own assessment tasks and scoring rubrics, with little, if any, standardization. In contrast, the two NSP member schools have participated in pilots of the NSP assessments and have used some of the NSP scoring rubrics and content area guidelines to develop their own assessments. Teachers at the Pacesetter participant school have tailored their in-class assessments to complement the Pacesetter Program, both using the program's units and developing their own assessments.

Purposes of State Level Assessments

Six of the 16 performance assessment systems included in our study are implemented at the state level. The stated purposes and uses of state-initiated performance assessments reflect, at least in part, the stage of development of the assessment systems.

Maryland's and Kentucky's performance assessment systems, the two most advanced state-initiated systems, are the most ambitious in their stated purposes. These two performance assessment systems are intended to monitor student progress toward the attainment of state-articulated outcomes and to serve system accountability purposes. Similarly, Arizona's performance assessment system was introduced several years ago and was being used to monitor student performance. The Arizona Department of Education had intended to use the system for both accountability and certification purposes; however, the system was suspended in 1995 due to technical difficulties.

Although Vermont is fully implementing its portfolio system, school accountability is not one of the system's stated purposes. Nonetheless, schools are encouraged to report results of the assessments to the public, which, in effect, functions as a public accountability mechanism, even though no rewards or sanctions are associated with the assessment results.

The other two state-initiated systems, New York's and Oregon's, are still in the research and development phase. In many respects, both systems also can be classified as local-level efforts, as much of the development work is progressing at individual schools throughout the states. New York has granted waivers to local high schools to develop performance assessments as partial fulfillment of the Regents Examination. New York's initiative is a small-scale development project that receives little support from the state level, and not all New York high schools are required to participate in developing the system. It is, at the same time, high stakes, as students' performance on Regents waiver course performance assessments is combined with their scores on the standard Regents Examinations. (Students' scores on Regents Examinations are used for awarding the Regents Diploma; they also are used by New York colleges to make admissions decisions.)

The Oregon Department of Education, in contrast, is supporting the development of a performance assessment system through grants to school districts across the state.1 The assessment system, in its developmental stage, is low stakes. In the long run, though, Oregon plans to use the fully-articulated assessment system for student certification and system accountability purposes. However, the specification of the assessment system (how performance assessments will be combined with standardized, multiple-choice tests) remains under debate, making adherence to the original timeline for the institutionalization of the assessment system tenuous.

It is important to note that these six state-initiated or state-supported performance assessment systems are at radically different stages of development and implementation, and no system is fully formulated. The purposes of these assessments are evolving as the assessment systems develop and change, and as data concerning their usefulness and effectiveness become available.

In all, though, these six state-level sample cases support Herman's (1992) assertion that states continue to put their faith in assessments as the model of accountability. States find traditional multiple-choice assessments problematic, not the general model of accountability.

Purposes of District Level Assessments

Three district-initiated performance assessment systems are included in our study. The stated purposes of these three systems illustrate the reasons districts may have for introducing performance assessments. As is illustrated above in Exhibit 4-3, educators at these three districts identified four of the five purposes of assessment reform. Harrison School District 2's Performance-Based Literacy Assessments are intended to monitor student performance and to hold schools accountable for student achievement. South Brunswick's Sixth Grade Research Performance Assessment is intended to help align instruction with curriculum. Finally, the purposes of Prince William County's Applications Assessments are to monitor student progress and to influence instruction. The district plans to use the assessments for accountability purposes. As with the sampled state assessment systems, district-level systems defy comparison because of their differential stages of development.

Purposes of School Level Assessments

In our sample, school-level systems are the most closely connected to the idea of using assessments to inform instructional practices and to diagnose student strengths and weaknesses. In all cases, teachers tailor the homegrown assessments to the educational needs of their particular students in their particular schools. Thus, school-level systems are typically not static, stand-alone aspects of education, but are integrated into the school's pedagogical practices. Accordingly, they constantly evolve and change, based upon teachers' evaluations of the assessments' pedagogical utility.

However, it must be noted that four out of the seven school-level assessments included in this study are being developed and implemented at the elementary school level.2 One high school implementing a performance assessment system has obtained waivers from district-wide testing requirements. The one middle school implementing its own performance assessments has not received waivers. As a result, the middle school is experiencing difficulties in maintaining a balance between traditional pedagogical methods and performance-based methods. The students at this school are required to take the Iowa Test of Basic Skills (ITBS), and, thus, teachers feel obliged to prepare them for this test using traditional teaching strategies.

The pattern in our data suggests the possibility that elementary schools may experience the least difficulty in instituting performance assessment systems, as they are less likely to operate in environments that require teachers to teach and assess in traditional ways. It is possible that middle schools and high schools experience more external testing pressures, limiting those schools' capacity to develop and implement innovative assessment systems, even for pedagogical purposes.

Subject Areas Assessed

The subject areas most frequently targeted for assessment are language arts, including reading and writing, and mathematics. Social studies and science also are included in some performance assessment systems included in our study.

Exhibit 4-4 shows the frequency of subject areas assessed by assessments and performance assessment systems initiated at different levels of educational authority.

In language arts, the content area most targeted for assessment, the assessment emphasis is typically on different genres of writing. While previous definitions of "good" writing did not assume purpose or audience (Mitchell, 1992), advocates of reform in the teaching and assessment of writing believe that "good writing" is not good under all circumstances. That is, there is not one template for good writing, but many, as the ability to write does not automatically transfer from one genre to another; different circumstances require different styles and forms of writing.

Following this logic, assessment reform in writing is based upon the ideas that proficiencies in different types of writing skills must be developed through teaching and assessing different genres of writing and that writing must be tailored to particular purposes and particular audiences. These genres can include, but are not limited to: reporting information; reviewing, evaluating or critiquing books, plays, and events; writing autobiographical incidents; explaining or developing solutions to a problem; speculating about cause and effect; business writing; reflective writing; and creative writing (after Mitchell, 1992).

An example of one performance assessment that focuses upon several genres of writing is Kentucky's language arts portfolios. For their language arts portfolios, Kentucky's 8th-grade students are required to write:

Harrison School District 2 provides another example. The district's performance-based writing assessments are keyed to the significant student outcome, "Students will communicate in writing to multiple audiences for the purposes of informing, persuading, organizing, and providing enjoyment."3

Assessment reform in mathematics is based upon the assumption that the evaluation of mathematical knowledge must go beyond the assessment of memorized algorithms and mathematical facts to include the evaluation of students': (1) understanding of mathematical concepts and ability to apply such concepts to complex problems; (2) proficiency in mathematical communication; (3) acceptance of different solutions to the same mathematical problem; (4) acceptance of different methods of arriving at the same conclusion; and (5) knowledge of and the ability to explain the processes used to arrive at mathematical solutions. (The National Council of Teachers of Mathematics guidelines, which reflect these principles, are used widely to inform curriculum and assessment reform in mathematics.) Assessment reform in mathematics is attempting to frame mathematics as a tool for solving problems, not as an abstract subject reserved for academically advanced students. For example, for their mathematics portfolios, Vermont's 8th graders must include three categories of mathematics problems (puzzles, investigations, and applications), and they must describe the decision-making processes they used in solving the problems.

Finally, the focus of assessment reform in other subject areas similarly aims to shift the emphasis from an exclusive evaluation of students' knowledge of facts to include an assessment of knowledge and understanding of procedures and methods.


1 The Oregon assessment plan was substantially changed in the summer of 1995. The information presented in this chapter covers academic years 1993-1994 and 1994-1995.

2 Note that the four schools participating in national-level assessment reform efforts are included in some of this discussion as well (as, indeed, they are through much of this report). The seven schools are generally discussed together because the national reform efforts typically require extensive local-level effort and do not dictate the form performance assessments take at the school level. However, the assessments developed and implemented by Sommerville High teachers are so closely aligned with the Pacesetter syllabus and guidelines that, in many cases, we do not discuss Sommerville's assessments separately from those of Pacesetter. National reform participant schools are separated from the other school-level efforts only when there is a clear distinction to be made between those school-level efforts that are guided by national efforts and those that are not (for example, the professional development opportunities provided teachers vary according to national reform effort participation or non-participation).

3 Harrison School District 2: Significant Student Outcomes.

