Assessment of Student Performance April 1997
Where the previous chapter focused primarily on facilitators and barriers in assessment reform outside the classroom, this chapter looks at facilitators and barriers in assessment reform at the classroom level. The ultimate objective of any educational reform is to improve student outcomes. Based upon the findings and inferences we can draw from our data, we suggest that: (1) for learning to improve, instruction must change; (2) for meaningful instructional change to result from assessment reform, teachers must "appropriate" assessment technologies; and (3) teacher appropriation of assessment technologies is fostered or impeded by a variety of factors, including:
Exhibit 6-1 illustrates the posited relationship among student learning, pedagogical change, teacher appropriation of assessment technologies, and facilitators and barriers in that appropriation.
As will be clear from the analysis below, each of these factors can serve either as a facilitator of or a barrier to reform, depending upon its presence or absence and its manifestation. Furthermore, as variation in these factors often parallels the level of initiation of an assessment, the level of initiation can be used as an organizing factor in analyzing the effects of these facilitators and barriers. For example, teachers' involvement in developing and implementing state-initiated performance assessments tends to be more limited than it is in district- and school-initiated assessments. In addition, the extent to which assessments are tightly or loosely prescribed also tends to follow the level of initiation, with states introducing more tightly prescribed systems than districts and schools. Where assessment system characteristics tend to vary according to the level of initiation, the discussion will be organized in terms of state-, district-, and school-level assessments.
Finally, it should be reiterated that this discussion of teachers' appropriation of assessments is based upon the information collected at the 16 schools participating in this study. Findings concerning teachers' appropriation of state- and district-initiated assessments cannot be generalized to teachers outside the participant schools. However, we have identified patterns of teachers' experiences with state-, district-, and school-initiated assessments across the sites.
Teacher Appropriation Defined
In this chapter, we investigate and discuss the impact of several characteristics of performance assessments on the extent to which teachers appropriate an assessment technology in their classrooms. By appropriation we mean a host of uses teachers may make of and attitudes they may hold toward performance assessment technologies. To summarize, these uses and attitudes include:
Exhibit 6-2 provides definitions and examples of what, for the purposes of this paper, we will designate high, moderate, and low levels of teacher appropriation of performance assessment technologies.
Level of Initiation and Mediating Factors (Facilitators and Barriers)
Manifestations of facilitators and barriers in teacher appropriation of assessment technologies can, to a great extent, be predicted by the level of initiation of an assessment. In other words, the level of initiation of an assessment is closely associated with the extent to which teachers appropriate assessments. However, this relationship is more satisfactorily explicated by examining the effects of the five factors under consideration, which themselves tend to vary with the level of initiation of an assessment as well. Exhibit 6-3 illustrates the relationship between teacher appropriation and the level of initiation of the assessment. In the balance of this chapter, we explore each of these five factors as they facilitate or impede teacher appropriation at our 16 sites.
The first factor that influences teacher appropriation of an assessment is the extent to which teachers are involved in the design and implementation of the assessment system. Teacher involvement can be present or absent at many stages of assessment development and implementation, including: (1) the design of assessments and assessment tasks; (2) the development of scoring rubrics and the establishment of standards of performance; and (3) participation in scoring the assessments. Exhibit 6-4 operationally defines high, moderate, and low levels of involvement in assessment design, standard setting, and scoring and summarizes the extent of teacher involvement in each of these three areas as experienced at each of the 16 participating schools. Exhibit 6-5 summarizes the coincidence of the levels of teacher involvement and appropriation. Examples of how the extent of teacher involvement in assessment design and implementation can affect teacher appropriation follow.
Design of Assessment Tasks
The level of initiation of an assessment to a great degree determines the extent to which teachers are involved in the development of assessment tasks. School-initiated assessments, almost by definition, involve teachers in the development of assessment tasks. What is more, all or most of a school's teachers, not just a few, will be involved in the development process. Teacher appropriation of assessment tasks they develop, use, and score is clearly quite high. For example, teachers at Niños Bonitos, Cooper, and Thoreau all clearly demonstrated their appropriation of their respective assessments. These teachers repeatedly stressed, in particular, their belief in the high instructional and content validity of the assessments they have developed for use with their students. Furthermore, teachers who have developed assessments for use at their schools testify to the value the development process has for teachers. Teachers at Noakes, Niños Bonitos, and Cooper, for example, all stated that the development process has helped them reflect upon and articulate the types of skills they want to foster in their students and to identify and understand the evidence necessary to demonstrate students' acquisition of those skills.
Involving teachers in the development of assessment tasks is a more difficult endeavor for state departments of education designing new assessment systems. Even in states that have included teachers in the development process, only a small fraction of teachers are involved. Thus, teachers are less familiar from the beginning with assessments initiated at the state, and even district, level. Occasionally, this isolation from the development process can induce a markedly negative response from teachers. For example, one teacher commented about the Arizona Student Assessment Program, "I don't understand why they went to the so-called 'experts.' We're the experts. They should have come to the teachers and asked us to develop it."
In contrast, there are some examples of state assessment programs that foster teacher development of assessment tasks. In New York, for instance, the state has allowed schools to petition to replace portions of the Board of Regents examinations with teacher-developed assessments. Teachers at Hudson High School have developed alternative performance assessments for use in English, social studies, earth science, and biology courses. Students who choose to enroll in these classes take these assessments as a waiver for up to 35 percent of the Regents examination in the given subject area. Teacher appropriation of the assessments in this instance is quite high, as evidenced by teachers' voluntary contribution of many hours of their own time to what one of them has called their "labor of love."
In between these two cases of high and low teacher involvement in the design of state-level assessments are the experiences of teachers at Breckenridge in Kentucky and Maple Leaf in Vermont. Teachers at these two schools were not involved in either the decision to introduce performance assessments statewide or the process of developing the structure of the assessment systems. (Though both state departments of education did include some teachers in the development process, no teachers at Breckenridge or Maple Leaf were involved.) However, within the two states' portfolio assessment systems as developed, teachers are required to develop assessment tasks that adhere to specified criteria. This involvement in specifying assessment tasks has encouraged teachers to work actively with the states' assessments, resulting in a fairly high level of appropriation of the portfolios required by the systems.
Development of Scoring Rubrics and Establishment of Performance Standards
Closely related to the development of assessment tasks is the establishment of scoring criteria for those tasks (e.g., scoring rubrics) and standards for student performance on assessments. Teacher involvement in setting those criteria and standards can be an important facilitator of appropriation of the assessment. Teachers who are involved in developing scoring rubrics and setting performance standards are more likely, in general, to have confidence that the skills being assessed and the standards to which students are being held are developmentally appropriate. This confidence, in turn, fosters appropriation of the assessment technology.
Teacher involvement in developing scoring rubrics and performance standards at the 16 participant schools falls into three categories. Within the first two of the three categories, the criteria and standards vary in terms of their uniformity (that is, whether the criteria and standards are applied uniformly to all students' efforts or are allowed to vary across students). The three categories are:
In addition, in the latter two cases (when teacher involvement is only moderate or low), the clarity of the criteria or standards is sometimes low from the teachers' perspective, thereby inhibiting teacher appropriation. The effects on teacher appropriation of (1) teacher involvement in the establishment of scoring criteria and performance standards, (2) the uniformity or flexibility of application of criteria and standards across students, and (3) the clarity of criteria and standards are explored below.
Teachers are Highly Involved in the Development of Scoring Criteria and Performance Standards
In cases in which teachers are involved in the development of scoring criteria or performance standards, teachers have demonstrated fairly high levels of appropriation of the assessments. This finding holds for both cases in which criteria and standards are left to teacher judgment and cases in which criteria and standards are applied uniformly to all students' efforts.
Teachers who work with assessments without uniform scoring criteria or performance standards are, by definition, involved in the setting of standards for student performance. For example, at Thoreau, all teachers are involved in the senior-year portfolio and performance event process. Teachers not only have the freedom to design the details of the assessment as they desire, but, what is more, they openly acknowledge (and students know this as well) that their standards of performance are not uniform across students: teachers expect better performances from their better students, and they hold them to the standards they believe the students are capable of achieving.
In cases, too, where teachers have developed a set of scoring criteria or performance standards to be applied to all students, appropriation of the assessment technologies remains high. For example, at Niños Bonitos Elementary, scoring rubrics developed by some of the school's teachers are used to assess students' English literacy growth; every child within a homogeneously grouped language arts class is assessed according to the same rubric. Teachers at Niños Bonitos are confident of the effectiveness of the approach to teaching and assessing English language literacy they have developed, and all teachers in the school use the system. Similarly, all 6th-grade teachers in the South Brunswick Unified School District meet multiple times during each school year to review the district's Sixth Grade Research Performance Assessment scoring rubric, which sets both scoring criteria and standards for the assessment. Though teachers do not score their students' performances themselves, they are centrally involved in developing the standards of performance. These teachers also give testimony to their appropriation of the district's assessment technology, noting that they have introduced their students to many similar research and demonstration assessments over the course of the school year.
Teachers are Moderately Involved in Establishing Scoring Criteria and Performance Standards
Teachers at several schools (Noakes, Ann Chester, Westgate, Sommerville, and Crandall) are, in our terms, "moderately" involved in developing scoring criteria and performance standards. In terms of the level of involvement in developing these features of assessment systems, these teachers have typically been involved in developing some part, but not all, of the scoring criteria and performance standards, or have been involved in setting criteria or standards for some assessment tasks but not others within the assessment system. Though teacher appropriation of performance assessments runs from moderate to high at four of these five schools (the exception being Westgate), being further removed from the process of setting scoring criteria and performance standards has served as a barrier to appropriation in at least two of these cases.
Pacesetter teachers at Sommerville are responsible for developing their own criteria and standards for their in-class use of the Pacesetter program, but they are not involved in developing either aspect of the College Board's uniform end-of-the-year performance assessment, administered to all Pacesetter students. These teachers expressed reservations about the appropriateness of some of the standards and the applicability of some of the task-specific rubrics used to score the assessments.
Teachers at Crandall High School, as part of the development work they agreed to undertake for Oregon's performance assessment system, also reviewed and refined the generic scoring rubric that was to be applicable for all math and science tasks developed by teachers throughout the state. Teachers found this task confusing, primarily because the rubric was divorced from actual tasks and, more importantly, from content standards and curriculum frameworks, which remained to be developed by the state. This incompatibility of criteria and standards made it hard for teachers to work with the assessment system as a whole and, consequently, to appropriate it.
For teachers at Noakes and Ann Chester Elementary Schools who used New Standards Project assessment tasks and rubrics with their students (among other assessments), the lack of involvement in development of scoring criteria did not serve as a barrier to appropriation of assessments. This finding is probably attributable to the fact that teachers at those schools chose voluntarily to become involved with NSP. Furthermore, interpretation of the NSP rubrics remained at the discretion of the individual teacher at both schools, fostering appropriation of the assessment (both schools, however, aim to achieve common understandings across teachers of what evidence establishes accomplishment of what performance level, but teachers are actively involved in developing that common understanding).
Finally, some teachers at Westgate Middle School were involved in the establishment of performance standards for the district's Applications Assessments, though not in the development of scoring criteria for assessment tasks. However, this involvement did not remedy teachers' perceptions that some of the assessment tasks and scoring criteria expected levels of performance from students that are developmentally inappropriate. For example, a 7th-grade math teacher said that she planned to teach her students how to work with percents earlier in the year because the topic had been emphasized on the previous year's 7th-grade mathematics assessment. She did not, however, believe it was appropriate to teach her students these new concepts before she had laid a foundation for them, and yet she felt compelled to do so against her better judgment.
Teachers Are Not Involved in Establishing Scoring Criteria and Performance Standards
In several schools working with state-initiated performance assessments, teachers have not been involved in the development of scoring criteria or the establishment of performance standards, and this lack of involvement tends to serve as a barrier to teacher appropriation of performance assessments.
Teachers at Breckenridge Middle School in Kentucky believe some of the performance standards to be developmentally inappropriate. As with Westgate teachers who felt the same way about their district's assessment, this skepticism serves as a barrier to teachers' appropriation of the assessment.
In one instance, teachers at Manzanita High School in Arizona were unaware of whether performance standards exist for the assessment. (Though teachers are familiar with the types of scoring rubrics used, they assert that no overall standards for performance on the ASAP's performance assessment exist.) In this case, teachers expressed skepticism about the value of the assessment and the assessment reform effort. For example, one teacher commented about ASAP, "How can we take this assessment seriously if we haven't even decided yet what a passing score is?" Quite clearly, teachers who find themselves unable to take an assessment seriously cannot possibly be said to have appropriated it.
Participation in Scoring of Assessment Tasks
In general, teachers who are regularly involved in scoring assessments have more opportunity, and perhaps propensity (because in some cases participation in scoring is voluntary), to appropriate assessment mechanisms than do teachers who are not. For example, at Maple Leaf in Vermont and Breckenridge in Kentucky, all teachers of students in the grade levels assessed are involved in scoring portfolios (and at Breckenridge, other teachers help score portfolios as well; e.g., 6th- and 7th-grade math teachers help the 8th-grade math teachers score math portfolios). Regardless of teachers' positive and negative feelings toward other aspects of the assessment systems in those states, teachers comment that participating in scoring not only helps them to understand the assessment system better, but also helps them understand better how to adapt what they do in their classrooms to support and reflect the assessment system.
Similarly, the College Board's Pacesetter mathematics culminating assessment is scored during a single scoring session attended by all teachers participating in the program. Pacesetter teachers at Sommerville High School asserted that their participation in these reading sessions gave them a much better sense of the objectives of the assessment and of how the rubrics were applied to assessment tasks.
In other instances, however, teachers are less involved in the scoring process, and have typically not appropriated the assessment technology. For example, only a few teachers at Walters in Maryland and Manzanita in Arizona have been involved in scoring assessments. Similarly, in Prince William County, Virginia, all assessments are scored outside of the district by a private corporation. Teachers involved with these three assessments feel removed from the scoring process and from the appropriation of the assessment technology that participation might foster.
Summary
To some extent the level of teacher involvement in developing and implementing assessment systems is little more than a reflection of the level of initiation of the assessment. Necessarily, states and districts cannot include all teachers in these endeavors. However, insofar as a goal of assessment reform is to effect instructional change, it is clear that a low level of involvement in the development and implementation processes does impede the appropriation that will ultimately lead to changes in teaching practice. From comments of teachers at several schools working with state- and district-initiated assessments (e.g., teachers at Breckenridge, Maple Leaf, and Windermere), a hypothesis can be supported that the converse is also true: teachers who are involved in some part of the development (Windermere) or implementation (Maple Leaf and Breckenridge) process gain knowledge about the assessment, grapple with the nuances of learning involved, and, subsequently, begin to appropriate the new technology.
Differences across performance assessments in the level of prescription (that is, how loosely or tightly prescribed assessment systems are) can have a dramatic impact on teacher appropriation of assessments. In general, the more tightly prescribed an assessment system, the less likely teachers are to appropriate it, and vice versa. To summarize the discussion in Chapter 4 of characteristics of high, moderate, and low levels of prescription:
In general, data collected from the 16 schools participating in this study suggest that the more tightly prescribed the procedures, the less ability teachers have to appropriate the assessment for use in their classrooms in ways that make sense to them pedagogically. Conversely, assessments that are more loosely prescribed allow teachers the room to adapt assessment tools for use in their classrooms and to integrate the assessments into their regular teaching practices.
Descriptions and examples of tightly, moderately, and loosely prescribed assessments appear in Chapter 4. In this section we revisit those examples to examine the effects of level of prescription of assessments upon teacher appropriation of the assessment. In general, we find that assessments developed at the state level tend to be more tightly prescribed and have lower levels of teacher appropriation of assessment technologies, while assessments developed at the school level are more loosely prescribed and have higher levels of teacher appropriation (however, several states have developed, or, in some cases, fostered the development of, what we call "moderately prescribed" assessments). Exhibit 6-6 summarizes the data presented in Exhibit 4-11, showing where the 16 sites fall along the level of prescription continuum, and Exhibit 6-7 illustrates the relationships between levels of prescription and appropriation at each of the 16 participant schools.
Effects of a Tightly Prescribed Assessment on Teacher Appropriation
In Chapter 4, the Maryland School Performance Assessment Program was described as an example of an assessment with tightly prescribed assessment format and implementation procedures. Teachers in the assessment grades have very little control over assessment tasks, implementation, or scoring procedures; in addition, assessment administration occurs during a single, discrete time period determined outside the classroom. What is more, assessment results are made public so long after administration as to be only marginally useful to teachers as they review their own teaching over the long run.
This high level of task and procedural prescription (combined with a low level of teacher involvement in designing the assessment system) has contributed to limited teacher appropriation of the assessment among teachers at Walters Middle School. Because they do not develop tasks, score student responses, or even see results in a timely manner, they do not utilize the assessment technology in any regular way. Teachers' most frequently shared comment about the state's assessment was their complaint about the excessive amount of time required to plan and set up the assessment's experiment component; furthermore, these teachers suggest that the assessment's impact on their classrooms is, at most, marginal.
Effects of a Moderately Prescribed Assessment on Teacher Appropriation
As described in Chapter 4, Vermont's portfolio assessment provides a contrast to Maryland's, serving as an example of a moderately prescribed assessment. In the Vermont system, a structure exists, but teachers have the flexibility to design assessment tasks for use in their classrooms, and they score the portfolio assessments by applying state-established rubrics to their students' work. Also, in general, the timelines for completing assessment tasks and scoring the assessment are determined by teachers.
Teacher appropriation of the assessment technology among teachers at Maple Leaf Middle School is more far-reaching than in the case of Walters, illustrated above. Although teachers have criticisms of the portfolio system (e.g., it places burdens disproportionately on language arts teachers and, to a lesser extent, math teachers; and the system's validity and reliability are uncertain), they also profess to have learned from the system. One Maple Leaf math teacher said he had gained insights into the learning processes of one child with limited math skills. A language arts teacher said that, in response to what she has learned from the portfolio technique, she now places greater emphasis on the thinking and editing aspects of the writing process than on mere mechanical reporting. She also mentioned that by addressing the concept of voice in writing she has gained insights into her students' learning; in her words, "It [the portfolio system] has made me a better teacher." However, perhaps the highest testimony to teacher appropriation of the state's portfolio assessment is that teachers not required to work with it are doing so voluntarily. At Maple Leaf, the 7th-grade math teacher had begun to use a portfolio assessment after the state began to require an 8th-grade mathematics portfolio, and other teachers at the school are "experimenting" with this assessment technique as well.
Effects of a Loosely Prescribed Assessment on Teacher Appropriation
Teachers at Park Elementary who use the Primary Learning Record (PLeR) have appropriated the assessment technique to a high degree. As was described in Chapter 4, these teachers use the PLeR voluntarily, and each teacher decides for himself or herself how to use it most beneficially with students. In a sense, teacher appropriation of this loosely prescribed assessment is absolute: teachers choose voluntarily to use it, and they determine themselves how they will use it.