Archived Information

Assessment of Student Performance, April 1997

EXECUTIVE SUMMARY

The use of performance assessments (assessments that are non-multiple-choice) is not an entirely new strategy in American education. Essays, oral presentations, research projects, and other kinds of open-ended assessments always have been features of successful classrooms. What is new in the current assessment reform movement, however, is the emphasis on the use of performance assessments to support systematic, state-, district-, or school-wide purposes such as guiding changes in instruction and curriculum, monitoring student achievement toward desired outcomes, holding schools accountable for student achievement, and certifying student capabilities. Undergirding these purposes of assessment reform is the basic assumption that the use of performance assessments will result in better teaching and learning processes and in enhanced student achievement.

The purpose of this project, Studies of Education Reform: Assessment of Student Performance, sponsored by the U.S. Department of Education's Office of Educational Research and Improvement, was to elucidate the nature and effects of the assessment reform movement taking place across the country.

Study Objectives and Design

The three specific objectives of this study were to:

To meet these objectives, we employed a qualitative, case-study methodology. We visited 16 schools that were, in some way, involved in developing and implementing performance assessments. At each school, we interviewed school personnel, teachers, students, parents, and school board members. Site visits were conducted during the spring of 1994 and spring of 1995. We visited 7 of our 16 schools twice.

Performance assessments at the 16 school sites included assessments initiated at all levels of educational authority (state, district, and school) as well as those supported by the work of national level education reform efforts. We attempted to obtain a sample that would be broadly representative of: the level of initiation of the assessment reform effort; the type of assessment task (on-demand tasks, extended projects, demonstrations); subject areas assessed; and status of implementation (e.g., developmental, pilot, full-scale implementation). We also attempted to include in our sample schools reflecting different levels of schooling (e.g., elementary, middle, or high school) and different geographical locations. Exhibit I identifies the schools and assessments included in the study.

Characteristics of Performance Assessments

Findings concerning the characteristics of performance assessments fall into three broad categories:

The Purposes of Performance Assessments

Assessment reformers identify five purposes that performance assessment systems are intended to serve, and many assessment systems included in our study are intended to serve multiple purposes simultaneously. These five purposes include:

These purposes are not mutually exclusive, and any one performance assessment system may be intended to target several purposes at once. Indeed, many assessment systems included in this study were intended to serve multiple purposes. However, our study indicates that various purposes are not necessarily compatible with each other (at least not in all combinations), and emphasis upon one purpose can sometimes result in the neglect, or abandonment, of another, at least in the short run.

The Format of Performance Assessments

Performance assessments vary tremendously in the forms they take. Indeed, the only characteristics shared by the performance assessments included in this study are the pedagogical assumptions upon which they are based and the fact that they require students, in some fashion or another, to construct responses to tasks.

The format of performance assessments can have important ramifications for the power of an assessment system to bring about meaningful changes in teaching and learning. Drawing upon our sample of assessments, we built a taxonomic scheme that delineates the relationships among the components of performance assessments and illuminates their characteristics:

Exhibit II presents this conceptual overview of performance assessments.

Assessment Tasks

The first component of a performance assessment is the task that the student must attempt to complete. Five types of tasks emerge from our sample:

This set of assessment tasks demonstrates that "performance assessment" means different things to different people. The one feature these different types of assessment tasks have in common is the requirement that students actively construct responses to problems or prompts.

Scoring Methods

A performance assessment comprises both an assessment task and a scoring method used to judge the quality of student performance on the task. These scoring methods, scoring rubrics as they are sometimes called, are a pivotal feature of assessment reform. They both specify the knowledge and competencies for which student work is to be evaluated and delineate the criteria for determining the quality of student work. Through the combination of tasks and scoring methods, states, districts, and schools have attempted to articulate and communicate the skills and competencies that are important to teach and to assess.

Four broad types of scoring or evaluation methods are evidenced in the performance assessments included in this study:

The influencing of instructional practices to date has been served most powerfully by generic rubrics. In requiring teachers to design and use assessment tasks that elicit the skills and competency outcomes articulated in the generic scoring rubrics, states, including Kentucky and Vermont, have communicated and promoted the educational outcomes valued by the state.

Performance Assessments

Together, the assessment task and the scoring method make up the performance assessment. Performance assessments in our sample vary in terms of two dimensions that have implications for their pedagogical usefulness:

Performance Assessment Systems

Taken together, the combination of a task and a scoring method forms a performance assessment. A performance assessment system, in turn, consists of several (though in some cases only one) performance assessments that are assembled and administered to serve one or more specific, system-wide educational purposes. Associated with the performance assessment system is a set of administration and scoring procedures. Together with their administration and scoring procedures, performance assessment systems can be classified along two major dimensions:

Our data suggest that performance assessment systems initiated at the state level for the purpose of system accountability tend to be quite tightly prescribed. In contrast, assessments developed at the school level for pedagogical purposes tend to be more loosely prescribed (though several states have developed, or, in some cases, fostered the development of, what can be considered "moderately prescribed" performance assessment systems).

Our data support a strong hypothesis: if performance assessment systems are moderately prescribed (that is, they provide a structure for implementing the system within a coherent educational framework and involve teachers in developing and implementing the assessments), the purpose of informing and influencing instruction is more likely to be achieved, at least in the short run. Our findings further indicate that performance assessment systems that cast wide pedagogical nets (that is, systems that involve teachers and students on an on-going basis) also are more likely to achieve the purpose of informing and influencing instruction.

Technical Properties of Performance Assessments

Experts in the field of testing are actively debating the nature of the different technical features performance assessments must possess to fulfill their purposes. The technical features enumerated for performance assessments include not only the traditional criteria of content validity and reliability, but also consequential validity, equity, and generalizability. Furthermore, the notion of content validity criteria has been expanded to include content quality of assessment tasks and meaningfulness of the tasks to students.

It is beyond the scope of this study to comment on the technical robustness of the performance assessment systems that comprise our sample. However, procedures to determine and ensure the technical robustness of the assessment systems in our sample have been established in some cases and not in others, depending upon the status of implementation (e.g., development, pilot, implementation), the level of initiation, and the purposes of the performance assessment system.

Facilitators and Barriers in Assessment Reform

Many factors can serve as facilitators or barriers in the assessment reform process. An analysis of facilitators and barriers, however, is complicated because of the diverse, and occasionally incompatible, purposes of assessment reform. Factors that facilitate the achievement of one purpose may serve as a barrier to the achievement of a second, equally desirable purpose. For example, a high degree of standardization and technical perfection may facilitate the gathering of reliable student data for monitoring student progress, but the rigidity of the system may serve as a barrier to adapting the system to inform and guide everyday instruction. Moreover, the potency of each factor is affected by the potency of other factors.

Nonetheless, facilitators and barriers in assessment reform with respect to the various purposes of the reform can be identified by the level of initiation of a performance assessment system: state, district, and school.

Facilitators and Barriers: State-Initiated Assessments

Six factors emerge from our findings as either facilitators (in their presence) or barriers (in their absence) in assessment reform at the state level.

Utilization of Outside Sources of Information

Outside sources of information and expert help have facilitated the development and implementation of performance assessment systems. These outside sources include work conducted by the National Council of Teachers of Mathematics, the American Association for the Advancement of Science, and private testing and measurement corporations and consultants.

Perceived or Actual Soundness of the Assessment System

Several state-initiated performance assessment systems have been hampered because of perceived (and actual) technical problems with the assessments. Indeed, at the time of this writing, one of the six state-initiated performance assessment systems reviewed in this study, the performance assessment component of the Arizona Student Assessment Program, has been suspended indefinitely, and another, Oregon's system, is due to be modified significantly from its originally conceived form and purposes.

Coordination with Associated Reforms

Ensuring the compatibility of assessment reform with other related reforms (in particular, the revisions of curriculum frameworks and the establishment of content and performance standards) can serve as a potential facilitator in assessment reform. At this point in time, coordinated efforts to introduce performance assessment systems and curriculum guidelines have been most successful in those states in which the efforts clearly reinforce each other and are visible at the local level.

Public Perceptions of the Fairness of the Assessment

The fairness of the assessment system, actual or perceived, plays out at two levels: the school level and the student level. The fairness of accountability mechanisms in high stakes assessment systems is crucial to the successful introduction of those mechanisms. The fairness of the assessment and its consequences to a wide range of students also can affect responses of teachers, parents, students, and other stakeholders to the assessment.

The Adequacy of the Timeline and the Politicization of the Reform

The amount of time allowed for developing, piloting, introducing, and institutionalizing a performance assessment system can have a significant impact upon a state's ability to sustain its reform efforts and to meet its various objectives. The pressure to produce results is typically intensified when the introduction of the assessment system takes place in the political realm (i.e., through an act of the legislature). Furthermore, changing political climates can threaten the reform process.

Professional Development Opportunities for Teachers

Professional development clearly is a critical component of a state's efforts to introduce assessment reform. Our findings indicate that teachers' understanding of the assessment (its purposes, format, pedagogical underpinnings, scoring procedures, and consequences) and their ability to work with the assessment are crucial to progress toward attaining the state's purposes for the assessment system.

Facilitators and Barriers: School-Initiated Assessments [1]

Three factors can serve as facilitators or barriers in a school's ability to achieve its purposes for assessment reform.

Waivers from District Testing or Reporting Requirements

Waivers can serve as a facilitator of assessment reform by freeing schools from external mandates that are either incompatible with the reform or that compete too much with the reform for limited teacher time.

Availability of Information and Resources

Schools stand to benefit from information and resources available through their participation in national-level education reform efforts (i.e., the New Standards Project [NSP], the Coalition of Essential Schools, and the College Board's Pacesetter program). For example, teachers at four schools in our sample have participated in the conferences, symposia, and institutes offered by the NSP and Pacesetter and consequently have developed a deeper understanding of performance assessments.

Availability of Time and Existence of Supporting Organizational Structures

Finding time out of busy school schedules to design and implement performance assessments remains a sizable barrier to assessment reform at the school level. Schools have attempted to surmount this barrier in a variety of ways, legislating time into the school day or week to support assessment and other reforms.

Facilitators and Barriers: District-Initiated Assessments

Our sample of district-level assessment initiatives indicates that the kinds of facilitators and barriers districts experience can be similar to those that states face, though on a smaller scale, or similar to those that schools face, but on a larger scale. In large part, the existence or absence of various facilitators and barriers is reflective of the district's purposes in introducing assessment reform.

Teacher Appropriation of Performance Assessments

The mere introduction of an assessment itself is insufficient to drive changes in pedagogy and instructional practices. Our findings indicate that, if changes in teaching and learning are to take place, teachers must use an assessment, adapting it as necessary for use in the classroom and, when appropriate, integrating it with other teaching techniques. Teachers also must value the information the assessment generates about their students' performance and about the effectiveness of their instructional strategies. In short, we find that teachers must appropriate the performance assessment if meaningful changes in teaching and learning are to occur. Five factors can exert a significant influence upon the extent to which teachers appropriate performance assessments for use in their classrooms.

Teachers' Involvement in Developing and Implementing the Assessment System

The extent to which teachers are involved in developing and implementing the assessment system influences their appropriation of the assessment. The process of developing assessment tasks, scoring rubrics, and performance standards requires the developers to think carefully about the types of skills to be assessed, the types of tasks best suited to assessing children's attainment of those skills, the elements that distinguish one level of performance from another, and the standards of performance for those skills to which children should be held. Thus, teachers who are involved in developing these aspects of assessment systems are more likely to appropriate the resulting assessment systems than are teachers who do not take part in the development process. Similarly, teachers who have an active role in implementing an assessment also have an opportunity to think carefully about the skills being assessed and the elements that demonstrate different levels of performance.

The Level of Prescription of an Assessment System

A factor related to teacher involvement is the level of prescription of the assessment system. In general, the more tightly prescribed an assessment system, the less likely teachers will appropriate it, and the more loosely prescribed an assessment system, the more likely they will appropriate it. Teachers who are expected to exercise their discretion over task specification and implementation and scoring procedures necessarily acquire experience in constructing and using performance assessments, and, consequently, they are less likely to find themselves at odds with the assessment tasks, scoring procedures, and performance standards used with their students.

The Professional Development Opportunities Available to Teachers

The type and extent of professional development opportunities provided to teachers affect their ability to work with assessments. Three models of professional development were employed at the 16 sites included in this study: train-the-trainer, conferences and institutes, and collaboration with outside experts. Regardless of the model used, professional development opportunities that focus upon teachers' capacity to develop and work with performance assessments in general, not just with a specific assessment system, have the most positive impact upon teachers' likelihood of appropriating performance assessment systems.

Interactions Between Assessment Reform, Other Elements of Education Reform, and Other Assessment Requirements

The interaction between assessment reform and other elements of education reform or other assessment requirements can serve as a facilitator or barrier in teacher appropriation of new assessment techniques. Notably, when teachers observe a disconnect between two elements of reform (such as assessment reform and revisions to curriculum frameworks) or between performance assessments and other assessment requirements, they typically will adopt a "wait and see" stance before they invest time and effort in working to appropriate the new assessment system.

School Organizational Factors

School-level factors, most particularly the introduction of regular time for teachers to work with assessment methods and strong individual or group leadership, can serve to support teachers' appropriation of performance assessments. The provision of regularly scheduled time for teachers to develop assessment tasks and to share experiences with colleagues both allows teachers time to conduct these activities and communicates to teachers that the time they spend outside the classroom improving their knowledge of pedagogy is valuable. Similarly, strong leadership from either the school or the district and from either an individual or a group can provide teachers with a vision of both where assessment reform may be headed and how it may be achieved.

Impact on Teaching and Learning

The notion of "teacher appropriation" summarized above is not, in and of itself, a desired outcome of implementing performance assessment systems. Rather, we see teacher appropriation of performance assessments as a necessary prerequisite to obtaining the meaningful changes in pedagogy and instructional strategies sought by education reform. These changes, in turn, are prerequisites of the ultimate objective of any education reform: improved student learning. Our findings reveal some preliminary evidence of the impact performance assessments are having on teaching and learning.

Impact on Teaching

Some changes in teaching practice seem to be resulting from the introduction of performance assessment systems. In particular, the use of portfolios and extended performance tasks is contributing to the following changes in curriculum and instruction. However, it is difficult to assess the quality of these changes.

Changes in Curriculum

Teachers implementing portfolio assessment systems, in particular, suggest that they are teaching subject matter in more depth than they did in the past; at the same time, however, these teachers suggest that they have had to curtail the coverage of some content areas. In some instances, teachers also are involved in more interdisciplinary, thematic teaching of subject matter.

Changes in Instruction

Teachers implementing portfolios, extended performance tasks, or both, almost universally assert that they are increasingly emphasizing the following in their classrooms: research and performance-based project work; writing skills; and group work. Teachers working with performance assessment systems that comprise on-demand tasks have not made the same changes in their instructional practices. Teachers who are making these shifts in their instructional strategies also say that they are sharing scoring criteria (in the form of rubrics) with their students.

In several schools implementing performance assessment systems comprising portfolios, long-term research projects, or exhibitions of student work, teachers say they are asking students to write more and to conduct more research-based assignments than they did in the past. Such an instructional shift is driven by the requirements of the assessments themselves: teachers must design and assign tasks that enable students to demonstrate their writing capabilities or research and presentation skills.

Quality of Change

Two related findings underscore the difficulty in judging the quality of the pedagogical shifts observed. The first is that, because teachers are still learning how to incorporate performance assessments into their classrooms, they themselves find it difficult to evaluate any relationship between the pedagogical change and students' learning. The second is that standards for performance are unclear, unarticulated, or variable. In the cases of several district- and state-level assessment systems, the content and performance standards associated with the systems are not clear at the local level; therefore, teachers are making a pedagogical shift, but they are uncertain to what end. In contrast, in the cases of many school-level assessment systems or schools participating in national systems, teachers frequently individualize performance requirements for their students, making it similarly difficult to evaluate the extent to which the performance assessment system is challenging all students to meet equally high standards.

Impact on Learning

Teachers, students, and parents alike commented about the effects performance assessments, and changes in teaching practices that accompany them, are having on student learning. In particular, they noted that students' motivation to learn and their writing and critical-thinking skills have been affected for the better. However, evidence concerning the effects of performance assessments on student learning remains largely anecdotal.

Motivation to Learn

Students exhibit a greater motivation to learn and a greater amount of engagement with performance tasks and portfolio assignments than with other types of assessments. According to both students and teachers, this effect is due to the sustained attention and effort students must invest in these tasks, as they simultaneously define the parameters of their work and determine its quality.

Writing and Critical-Thinking Skills

Teachers say that students are improving their writing skills and habits as a function of the writing assignments they complete for various assessment tasks. Additionally, in schools where assessments are geared toward evaluating research and problem-solving skills, students and teachers report that students have acquired good research and analytical skills in the process of completing assessment tasks. Students are better able to use resource materials for projects and also have developed project presentation skills, such as the ability to summarize their work for an audience.

Implications for Policy and Future Research

Several implications for policy and future research emerge from the study's analysis of performance assessment systems, both with respect to the stated purposes of assessment (i.e., monitoring student progress; alignment of curriculum, instruction, and assessment; and accountability) and with respect to the larger purpose of improving teaching and learning.

Policy Implications

Several implications of our study findings should guide policy makers as they move to develop and implement performance assessment systems. These policy implications are organized into three categories: general policy implications, policy implications if the purpose of assessment reform is to improve and inform instruction and curriculum, and policy implications if the purpose of assessment reform is to hold schools or districts accountable for student achievement.

General Policy Implications

  1. Clearly state the primary purpose of the assessment system.
  2. Match the format of the assessment system with the purpose of the assessment system.
  3. Coordinate assessment reform with other elements of education reform and with other testing requirements.
  4. Articulate in clear and simple terms the content and performance standards the assessment system is intended to measure.
  5. Institute procedures to ensure the technical quality and fairness of the assessment system.
  6. In order to obtain a comprehensive picture of student learning, design a performance assessment system that contains a mix of different types of performance assessment tasks and scoring procedures.
  7. Design an assessment system composed of assessments that reinforce each other and are based upon the same learning outcomes.
  8. Tap existing resources when developing performance-based assessments and coordinated reforms.
  9. Plan the timeline of reform keeping in mind the length of time required to institutionalize the change.
  10. Communicate to the public the purposes of and the theory underlying the assessment.
  11. Provide and encourage professional development activities that help teachers expand their capacity to work with performance-based assessment techniques.
  12. Involve teachers in the design and implementation of the system, and make the system as loosely prescribed as possible within the context of the purposes of the assessment.
  13. Encourage schools to provide teachers with time to develop assessments and to discuss experiences with assessments with colleagues.
  14. Provide waivers from testing and reporting requirements to schools experimenting with innovative assessment techniques.

Improve and Inform Instruction and Curriculum

  1. Design a moderately to loosely prescribed assessment system to enable teachers to actually work with the system in their classrooms. Also design assessments that can be easily integrated into different subject areas and into the school day.
  2. Provide clearly written content frameworks, performance standards, and assessment guidelines to teachers and administrators.
  3. Pay more attention to issues of content quality and curriculum and assessment coordination than to attaining interrater reliability.
  4. Provide on-going professional development in the design, use, and scoring of the new assessments and also in how to use new pedagogical strategies that align with the new assessments, how to choose new curricular materials, and how to use new pedagogies with all students.
  5. Since developing and using new assessments and associated curricula takes time, consider changing the structure of the school day. Devise new schedules through block-scheduling, team-teaching approaches, extended school days, and other methods that allow teachers more time to learn and to teach their peers. Teachers must have the time to use, reflect upon, and discuss new reforms.

School or District Accountability

  1. In order to ensure the rigor of the assessment system in measuring whether or not standards have been met, design a tightly to moderately prescribed assessment system.
  2. Institutionalize rigorous quality assurance procedures to ensure the content validity, intertask-reliability, fairness, and interrater reliability of the assessments.
  3. Public information is an important component of the accountability mechanism. The "public" includes parents, school board officers, and legislators, and information must be tailored to fit each group's information needs.
  4. If the assessment system is to be used for high stakes accountability, also collect information on known correlates of student performance on assessments.

Implications for Future Research

The following eight topics deserve the attention of future researchers:

  1. Continued research into how the technical properties and fairness of performance assessment systems can be improved.
  2. Research into the most effective combinations of instructional models and assessments (including multiple choice tests) that result in improved student learning.
  3. Longitudinal research of facilitators and barriers in assessment reform.
  4. Research into how different types of performance assessments are or are not appropriate for assessing the progress of children with disabilities.
  5. Research into the types of professional development and support activities that best enable teachers to understand and implement different types of performance assessments.
  6. Research into the impact of the use of performance assessments and related teaching strategies on student learning.
  7. Research into how opportunity-to-learn factors affect disadvantaged students' performance on different types of performance assessments.
  8. Research into the long-term benefits of the use of performance assessments as compared with the long-term costs of developing and implementing performance assessments.


[1] Schools working with national-level reform efforts are subsumed under this subsample of sites because the schools participating in the reform efforts typically have their own purposes for undertaking assessment reform and use their participation in the national-level effort as a point of departure for assessment reform, not as an end in itself.

