Assessment: Measure What Matters
Goal: Our education system at all levels will leverage the power of technology to measure what matters and use assessment data for continuous improvement.
Most of the assessment done in schools today is after the fact and designed to indicate only whether students have learned. Little is done to assess students' thinking during learning so we can help them learn better. Nor do we collect and aggregate student-learning data in ways that make the information valuable to and accessible by educators, schools, districts, states, and the nation to support continuous improvement and innovation. We are not using the full flexibility and power of technology to design, develop, and validate new assessment materials and processes for both formative and summative uses.
Just as learning sciences and technology play an essential role in helping us create more effective learning experiences, when combined with assessment theory they also can provide a foundation for much-needed improvements in assessment (Pellegrino, Chudowsky, and Glaser 2001; Tucker 2009). These improvements include finding new and better ways to assess what matters, doing assessment in the course of learning when there is still time to improve student performance, and involving multiple stakeholders in the process of designing, conducting, and using assessment.
Equally important, we now are acutely aware of the need to make data-driven decisions at every level of our education system on the basis of what is best for each and every student—decisions that in aggregate will lead to better performance and greater efficiency across the entire system.
What We Should Be Assessing
"I'm calling on our nation's governors and state education chiefs to develop standards and assessments that don't simply measure whether students can fill in a bubble on a test, but whether they possess 21st century skills like problem-solving and critical thinking and entrepreneurship and creativity."
—President Barack Obama,
Address to the Hispanic Chamber of Commerce, March 10, 2009
President Obama issued this challenge to change our thinking about what we should be assessing. Measuring these complex skills requires designing and developing assessments that address the full range of expertise and competencies implied by the standards. Cognitive research and theory provide rich models and representations of how students understand and think about key concepts in the curriculum and how the knowledge structures we want students to have by the time they reach college develop over time. An illustration of the power of combining research and theory with technology is provided by the work of Jim Minstrell, a former high school physics teacher who developed an approach to teaching and assessment that carefully considers learners' thinking.
Minstrell's work began with a compilation of student ideas about force and motion based on both the research literature and the observations of educators. Some of these student ideas, or "facets" in Minstrell's terminology, are considered scientifically correct to the degree one would expect at the stage of introductory physics. Others are partially incorrect, and still others are seriously flawed. Using these facets as a foundation, Minstrell designed a Web-based assessment program with sets of questions that can be used to inform learning about force and motion rather than simply test how much students have learned (Minstrell and Kraus 2005). Minstrell's facet assessments and instructional materials are available on the Web (http://www.diagnoser.com).
Technology Supports Assessing Complex Competencies
As Minstrell's and others' work shows, through multimedia, interactivity, and connectivity it is possible to assess competencies that we believe are important and that are aspects of thinking highlighted in cognitive research. It also is possible to directly assess problem-solving skills, make visible sequences of actions taken by learners in simulated environments, model complex reasoning tasks, and do it all within the contexts of relevant societal issues and problems that people care about in everyday life (Vendlinski and Stevens 2002).
Other technologies enable us to assess how well students communicate for a variety of purposes and in a variety of ways, including in virtual environments. An example of this is River City, a decade-long effort at Harvard University funded by the National Science Foundation (NSF). River City is a multiuser virtual environment designed by researchers to study how students learn through using it (Dede 2009). This virtual environment was built as a context in which middle school students could acquire concepts in biology, ecology, and epidemiology while planning and implementing scientific investigations in a virtual world.
River City takes students into an industrial city at the time in the 19th century when scientists were just beginning to discover bacteria. Each student is represented as an avatar and communicates with other student avatars through chat and gestures. Students work in teams of three, moving through River City to collect data and run tests in response to the mayor's challenge to find out why River City residents are falling ill. The student teams form and test hypotheses within the virtual city, analyze data, and write up their research in a report they deliver to the mayor.
Student performance in River City can be assessed by analyzing the reports that are the culmination of their experiences and also by looking at the kinds of information each student and each student team chose to examine and their moment-to-moment movements, actions, and utterances. On the basis of student actions in River City, researchers developed measures of students' science inquiry skills, sense of efficacy as a scientist, and science concept knowledge (Dede 2009; Dieterle 2009). Materials and other resources have been developed to support educators in implementing River City in their classrooms.
As the River City example illustrates, just as technology has changed the nature of inquiry among professionals, it can change how the corresponding academic subjects can be taught and tested. Technology allows representation of domains, systems, models, and data and their manipulation in ways that previously were not possible. Technology enables the use of dynamic models of systems, such as an energy-efficient car, a recycling program, or a molecular structure. Technology makes it possible to assess students by asking them to design products or experiments, to manipulate parameters, run tests, record data, and graph and describe their results.
Another advantage of technology-based assessments is that we can use them to assess what students learn outside school walls and hours as well as inside. Assuming that we have standards for the competencies students must have and valid, reliable techniques for measuring these competencies, technology can help us assess (and reward) learning regardless of when and where it takes place.
The National Assessment of Educational Progress (NAEP) has designed and fielded several technology-based assessments involving complex tasks and problem situations (Bennett et al. 2007). (See sidebar on technology-based assessment using a hot-air balloon simulation.)
Technology-based Assessment Using a Hot-air Balloon Simulation
The National Assessment of Educational Progress (NAEP) has been exploring the use of more complex assessment tasks enabled by technology. In one technology-based simulation task, for example, eighth-graders are asked to use a hot-air balloon simulation to design and conduct an experiment to determine the relationship between payload mass and balloon altitude. After completing the tutorial about the simulation tool interface, students select values for the independent variable payload mass. They can observe the balloon rise in the flight box and note changes in the values of the dependent variables of altitude, balloon volume, and time to final altitude.
In another problem, the amount of helium, another independent variable, is held constant to reduce the task's difficulty. Students can construct tables and graphs and draw conclusions by clicking on the buttons below the heading Interpret Results. As they work with the simulation, students can get help if they need it: A glossary of science terms, science help about the substance of the problem, and computer help about the buttons and functions of the simulation interface are built into the technology environment. The simulation task takes 60 minutes to complete, and student performance is used to derive measures of the student's computer skills, scientific inquiry exploration skills, and scientific inquiry synthesis skills within the context of physics.
Growing recognition of the need to assess complex competencies also is demonstrated by the Department's Race to the Top Assessment Competition. The 2010 competition challenged teams of states to develop student assessment systems that assess the full range of standards, including students' ability to analyze and solve complex problems, synthesize information, and apply knowledge to new situations. The Department of Education urged participants in this competition to take advantage of the capabilities of technology to provide students with realistic, complex performance tasks; provide immediate scoring and feedback; and incorporate accommodations that make the assessments usable by a diverse array of students (Weiss 2010).
Using Technology to Assess in Ways That Improve Learning
There is a difference between using assessments to determine what students have learned for grading and accountability purposes (summative uses) and using assessments to diagnose and modify the conditions of learning and instruction (formative uses). Both uses are important, but the latter can improve student learning in the moment (Black and Wiliam 1998). Concepts that are widely misunderstood can be explained and demonstrated in a way that directly addresses students' misconceptions. Strategic pairing of students who think about a concept in different ways can lead to conceptual growth for both of them as a result of experiences trying to communicate and support their ideas.
Assessing in the Classroom
Educators routinely try to gather information about their students' learning on the basis of what students do in class. But for any question posed in the classroom, only a few students respond. Educators' insight into what the remaining students do and do not understand is informed only by selected students' facial expressions of interest, boredom, or puzzlement.
To solve this problem, a number of groups are exploring the use of various technologies to "instrument" the classroom in an attempt to find out what students are thinking. One example is the use of simple response devices designed to work with multiple-choice and true-false questions. Useful information can be gained from answers to these types of questions if they are carefully designed and used in meaningful ways. Physics professor Eric Mazur poses multiple-choice physics problems to his college classes, has the students use response devices to answer questions, and then has them discuss the problem with a peer who gave a different answer. Mazur reports much higher levels of engagement and better student learning from this combination of a classroom response system and peer instruction (Mazur 1997).
Science educators in Singapore have adopted a more sophisticated system that supports peer instruction by capturing more complex kinds of student responses. Called Group Scribbles, the system allows every student to contribute to a classroom discussion by placing and arranging sketches or small notes (drawn with a stylus on a tablet or handheld computer) on an electronic whiteboard. One educator using Group Scribbles asked groups of students to sketch different ways of forming an electric circuit with a light bulb and to share them by placing them on a whiteboard. Students learned by explaining their work to others and through providing and receiving feedback (Looi, Chen, and Ng 2010). (See sidebar on networked graphing calculators for another example of how a technology-based assessment can be used to adjust instruction.)
Using Networked Graphing Calculators for Formative Assessment
Over a wireless network, students can contribute mathematical content to the classroom, such as algebraic functions or graphs—content that is much richer than the answer to a multiple-choice question.
Mrs. J, an experienced science teacher in an urban middle school, participated in a large field trial testing the effectiveness of networked graphing calculators. When district-level tests had revealed that her students struggled to interpret graphs, Mrs. J used the graphing calculator-based wireless system to implement weekly practice on graph interpretations, overcoming her initial feeling that "technology is just overwhelming." She reported, "I have taught for 18 years and I have been in seventh-grade science for about 15 of the 18 ... and there are things that I have always been really sure that ... kids have understood completely. Now I see what they are thinking. And I am like, whoa, I am just amazed."
Mrs. J used the insights into her students' misunderstandings as revealed by the graphs they constructed to guide her instructional decisions.
Mrs. J also found the classroom network technology beneficial for providing specific help for individual students: "We were doing earth and sun relationships ... revolution versus rotation ... and ... I was able to ... see who was making those mistakes still.... So it helped me because I could pinpoint [students' weaknesses] without embarrassing them."
Assessing During Online Learning
When students are learning online, there are multiple opportunities to exploit the power of technology for formative assessment. The same technology that supports learning activities gathers data in the course of learning that can be used for assessment (Lovett, Meyer, and Thille 2008). An online system can collect far more information, in far greater detail, about how students are learning than manual methods can. As students work, the system can capture their inputs and collect evidence of their problem-solving sequences, knowledge, and strategy use, as reflected by the information each student selects or inputs, the number of attempts the student makes, the number of hints and type of feedback given, and the time allocation across parts of the problem.
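As a rough illustration of the kind of evidence described above, the following Python sketch logs one student's interactions with a problem and summarizes attempts, hints, and time for each part of the problem. The event schema, field names, and the summarize helper are hypothetical and are not drawn from any particular learning system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """One logged student action (hypothetical schema)."""
    student: str
    part: str       # which part of the problem the action belongs to
    kind: str       # "attempt" or "hint"
    correct: bool   # only meaningful for attempts
    seconds: float  # time spent on this action


def summarize(events):
    """Roll an event stream up into per-part evidence."""
    summary = defaultdict(lambda: {"attempts": 0, "hints": 0,
                                   "seconds": 0.0, "solved": False})
    for e in events:
        row = summary[e.part]
        row["seconds"] += e.seconds
        if e.kind == "hint":
            row["hints"] += 1
        elif e.kind == "attempt":
            row["attempts"] += 1
            row["solved"] = row["solved"] or e.correct
    return dict(summary)


# Invented log for one student working on a two-part problem.
log = [
    Event("s1", "setup", "attempt", False, 40.0),
    Event("s1", "setup", "hint", False, 10.0),
    Event("s1", "setup", "attempt", True, 25.0),
    Event("s1", "graph", "attempt", True, 30.0),
]
report = summarize(log)
```

In this invented log, the student needed two attempts and one hint on the first part but solved the second part on the first try, which is the sort of contrast an educator could act on.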
The ASSISTment system, currently used by more than 4,000 students in Worcester County Public Schools in Massachusetts, is an example of a Web-based tutoring system that combines online learning and assessment activities (Feng, Heffernan, and Koedinger 2009). The name ASSISTment is a blend of tutoring "assistance" with "assessment" reporting to educators. The ASSISTment system was designed by researchers at Worcester Polytechnic Institute and Carnegie Mellon University to teach middle school math concepts and to provide educators with a detailed assessment of students' developing math skills and their skills as learners. It gives educators detailed reports of students' mastery of 100 math skills, as well as their accuracy, speed, help-seeking behavior, and number of problem-solving attempts. The ASSISTment system can identify the difficulties that individual students are having and the weaknesses demonstrated by the class as a whole so that educators can tailor the focus of their upcoming instruction.
When students respond to ASSISTment problems, they receive hints and tutoring to the extent they need them. At the same time, how individual students respond to the problems and how much support they need from the system to generate correct responses constitute valuable assessment information. Each week, when students work on the ASSISTment website, the system "learns" more about the students' abilities and thus can provide increasingly appropriate tutoring and can generate increasingly accurate predictions of how well the students will do on the end-of-year standardized test. In fact, the ASSISTment system has been found to be more accurate at predicting students' performance on the state examination than the pen-and-paper benchmark tests developed for that purpose (Feng, Heffernan, and Koedinger 2009).
How Technology Supports Better Assessment
Adaptive Assessment Facilitates Differentiated Learning
As we move to a model where learners have options in terms of how they learn, there is a new role for assessment in diagnosing how best to support an individual learner. This new role should not be confused with computerized adaptive testing, which has been used for years to give examinees different assessment items depending on their responses to previous items on the test in order to get more precise estimates of ability using fewer test items.
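To make the contrast concrete, the following Python sketch shows the basic loop behind computerized adaptive testing: each response moves the ability estimate, and the next item is the unused one closest in difficulty to that estimate. This is a deliberately simplified staircase rule chosen for illustration. Operational adaptive tests typically rely on item response theory, and the item bank, difficulty scale, and function names here are hypothetical.

```python
def next_item(bank, used, ability):
    """Pick the unused item whose difficulty is closest to the estimate."""
    candidates = [i for i in bank if i not in used]
    return min(candidates, key=lambda i: abs(bank[i] - ability))


def run_adaptive_test(bank, answer, n_items, start=0.0, step=1.0):
    """Administer n_items adaptively and return (estimate, items given).

    `answer(item)` returns True if the examinee answers correctly.
    The step halves after each item, so the estimate settles quickly.
    """
    ability, used = start, []
    for _ in range(n_items):
        item = next_item(bank, used, ability)
        used.append(item)
        ability += step if answer(item) else -step
        step /= 2
    return ability, used


# Hypothetical item bank: item id -> difficulty on an arbitrary scale.
bank = {"q1": -2.0, "q2": -1.0, "q3": 0.0, "q4": 1.0, "q5": 2.0}

# Simulated examinee who answers correctly whenever difficulty <= 1.0.
estimate, given = run_adaptive_test(bank, lambda i: bank[i] <= 1.0, n_items=3)
```

The simulated examinee sees only three of the five items, yet the estimate ends up between the hardest item answered correctly and the item missed. The output is a score, not a recommendation about what the learner should do next.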
Adaptive assessment has a different goal. It is designed to identify the next kind of learning experience that will most benefit the particular learner. The School of One demonstration project (see the sidebar on the School of One in the Learning section) used adaptive assessment to differentiate learning by combining information from inventories that students completed on how they like to learn with information on students' actual learning gains after different types of experiences (working with a tutor, small-group instruction, learning online, learning through games). This information was used to generate individual "playlists" of customized learning activities for every student. (See the sidebar on meshing learning and assessment for an example of adaptive assessment in higher education.)
Meshing Learning and Assessment in Online and Blended Instruction
The online learning systems being developed through the Open Learning Initiative (OLI) at Carnegie Mellon University illustrate the advantages of integrating learning and assessment activities. The OLI R&D team set out to design learning systems incorporating the learning science principle of providing practice with feedback. In the OLI courses, feedback mechanisms are woven into a wide variety of activities. In a biology course, for example, there are the following:
Interactive simulations of biological processes that students can manipulate; the student's interaction with the simulation is interspersed with probes to get at his or her understanding of how it works;
"Did I Get This?" quizzes following presentation of new material so that students can check for themselves whether or not they understood, without any risk of hurting their course grade;
Short essay questions embedded throughout the course material that call on students to make connections across concepts; and
"Muddiest Point" requests that ask students what they thought was confusing.
Tutored problem solving gives students a chance to work through complex problems with the opportunity to get scaffolds and hints to help them. The students receive feedback on their solution success after doing each problem, and the system keeps track of how much assistance students needed for each problem as well as whether or not they successfully solved it.
When OLI courses are implemented in a blended instruction mode that combines online and classroom learning, the instructor can use the data that the learning system collects as students work online to identify the topics students most need help on so that they can focus upcoming classroom activities on those misconceptions and errors (Brown et al. 2006). OLI is now doing R&D on a digital dashboard to give instructors an easy-to-read summary of the online learning data from students taking their course.
The OLI has developed learning systems for engineering statics, statistics, causal reasoning, economics, French, logic and proofs, biology, chemistry, physics, and calculus. A study contrasting the performance of students randomly assigned to the OLI statistics course with those in conventional classroom instruction found that the OLI course led to better student learning outcomes in half the time (Lovett, Meyer, and Thille 2008).
Universal Design for Learning and Assistive Technology Improve Accessibility
Technology allows the development of assessments using Universal Design for Learning principles that make assessments more accessible, effective, and valid for students with greater diversity in terms of disability and English language capability. (See the sidebar on universal design for textbooks in the Learning section.)
Most traditional tests are written in English and can be taken only by sighted learners who are fluent in English. Technology allows for presentation and assessment using alternative representations of the same concept or skill and can accommodate various student disabilities and strengths. Moreover, having the option of presenting information through multiple modalities enlarges the proportion of the population that can be assessed fairly.
Technology also can support the application of UDL principles to assessment design. For example, the Principled-Assessment Designs for Inquiry (PADI) system developed by Geneva Haertel, Robert Mislevy, and associates (Zhang et al. 2010) is being used to help states develop science assessment items that tap the science concepts the states want to measure and minimize the influence of such extraneous factors as general English vocabulary or vision. Technology can support doing this labor-intensive work more efficiently and provides a record of all the steps taken to make each assessment item accessible and fair for the broadest range of students.
Similarly, assistive technology can make it possible for students who have disabilities that require special interfaces to interact with digital resources to demonstrate what they know and can do in ways that would be impossible with standard print-based assessments. Designing assessments to work with assistive technologies is much more cost-effective than trying to retrofit the assessments after they have been developed.
Technology Speeds Development and Testing of New Assessments
One challenge associated with new technology-based assessments is the time and cost of development, testing for validity and reliability, and implementation. Here, too, technology can help. When an assessment item is developed, it can be field-tested automatically by putting it into a Web-based learning environment with thousands of students responding to it in the course of their online learning. Data collected in this way can help clarify the inferences derived from student performance and can be used to improve features of the assessment task before its large-scale use.
Technology Enables Broader Involvement in Providing Feedback
Some performances are so complex and varied that we do not have automated scoring options at present. In such cases, technology makes it possible for experts located thousands of miles away to provide students with authentic feedback. This is especially useful as educators work to incorporate authentic problems and access to experts into their instruction.
The expectation of having an audience outside the classroom is highly motivating for many students. Students can post their poems to a social networking site or make videotaped public service announcements for posting on video-sharing sites and get comments and critiques. Students who are developing design skills by writing mobile device applications can share their code with others, creating communities of application developers who provide feedback on each other's applications. The number of downloads of their finished applications provides one way of measuring success.
For many academic efforts, the free-for-all of the Internet would not provide a meaningful assessment of student work, but technology can support connections with online communities of individuals who do have the expertise and interest to be judges of students' work. Practicing scientists can respond to student projects in online science fairs. Readers of online literary magazines can review student writing. Professional animators can judge online filmmaking competitions. Especially in contests and competitions, rubrics are useful in communicating expectations to participants and external judges and in helping promote judgment consistency.
Technology also has the potential to make both the assessment process itself and the data resulting from that process more transparent and inclusive. Currently, only average scores and proficiency levels on state assessments are widely available through both public and private systems. Still, parents, policymakers, and the public at large can see schools' and districts' test scores and in some instances test items. This transparency increases public understanding of the current assessment system.
Technology Could Reduce Test Taking for Accountability Only
Many educators, parents, and students are concerned with the amount of class time devoted to taking tests for accountability purposes. Students not only are completing the tests required every year by their states, but they also are taking tests of the same content throughout the year to predict how well they will perform on the end-of-year state assessment (Perie, Marion, and Gong 2009).
When teaching and learning are mediated through technology, it is possible to reduce the number of external assessments needed to audit the education system's quality. Data streams captured by an online learning system can provide the information needed to make judgments about students' competencies. These data-based judgments about individual students could then be aggregated to generate judgments about classes, schools, districts, and states.
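The roll-up described above can be sketched in a few lines of Python. The example assumes that each student-level judgment has already been reduced to a single proficient or not-proficient flag, which is a strong simplification. The record layout and the district, school, and class names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (district, school, class, student, proficient?)
records = [
    ("D1", "North", "7A", "ana", True),
    ("D1", "North", "7A", "ben", False),
    ("D1", "North", "7B", "cam", True),
    ("D1", "South", "7A", "dee", True),
    ("D2", "East", "7A", "eli", False),
]


def proficiency_rate(records, depth):
    """Share of proficient students per group.

    depth=1 groups by district, 2 by school, 3 by class.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [proficient, total]
    for rec in records:
        key = rec[:depth]
        counts[key][0] += rec[4]  # True counts as 1, False as 0
        counts[key][1] += 1
    return {key: p / n for key, (p, n) in counts.items()}


by_class = proficiency_rate(records, 3)
by_school = proficiency_rate(records, 2)
by_district = proficiency_rate(records, 1)
```

Because every level is computed from the same student-level records, the class, school, and district views stay consistent with one another without a separate test at each level.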
West Virginia uses this strategy in its assessment of students' technology skills. (See sidebar on moving assessment data from the classroom to the state.)
Moving Assessment Data From the Classroom to the State
West Virginia's techSteps program is an example of an assessment system coordinated across levels of the education system. techSteps is organized around six technology integration activities per grade level. The activities are sequenced to introduce technology skills developmentally and in a 21st-century context and are largely open-ended and flexible, so they can be integrated into county and school curricula.
Each techSteps activity includes a classroom assessment rubric. After a student completes a techSteps activity, the teacher enters an assessment of his or her performance against the rubric into the techSteps website. techSteps uses the teacher-completed rubric form to identify the target skills demonstrated by that student and uses this information to build the student's Technology Literacy Assessment Profile.
Through techSteps, West Virginia is able to have statewide student data on technology proficiencies at each grade level without requiring a separate "drop-in-from-the-sky" technology test.
Prospects for Electronic Learning Records
Much like electronic medical records in this country, electronic learning records could stay with students throughout their lives, accumulating evidence of growth across courses and across school years. A logical extension of online grade books and other electronic portfolios, electronic learning records would include learning experiences and demonstrated competencies, including samples of student work.
Many schools are using electronic portfolios of students' work as a way to demonstrate what they have learned. (See sidebar on how New Tech High School uses technology to document student accomplishments.) Although students' digital products are often impressive on their face, a portfolio of student work should be linked to an analytic framework if it is to serve assessment purposes. The portfolio reviewer needs to know what competencies the work is intended to demonstrate, what the standard or criteria for competence are in each area, and what aspects of the work provide evidence of meeting those criteria. Definitions of desired outcomes and criteria for levels of accomplishment can be expressed in the form of rubrics.
An advantage of using rubrics is that they can be communicated not only to the people judging students' work, but also to the students themselves. When students receive assessment rubrics before doing an assignment—and especially when students participate in developing the rubrics—they can develop an understanding of how quality is judged in the particular field they are working in (for example, an essay of literary criticism, the design of a scientific experiment, or a data analysis).
As with any other kind of assessment score, ratings derived from rubrics should be both valid (demonstrated to measure what they are intended to measure) and reliable (consistent no matter who the rater is). Before rubrics are used on a larger scale for assessments that have consequences for schools and students, their validity and reliability must be established.
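One common way to check whether rubric ratings are consistent across raters is Cohen's kappa, which compares the observed agreement between two raters with the agreement expected by chance. The Python sketch below is offered only as an illustration. The ratings are invented, and kappa is one of several statistics that could be used for this purpose.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same level.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater assigned levels independently.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical rubric levels (1-4) given by two raters to the same six essays.
teacher = [4, 3, 3, 2, 1, 4]
outside_judge = [4, 3, 2, 2, 1, 3]
kappa = cohens_kappa(teacher, outside_judge)
```

A kappa of 1 indicates perfect agreement, and a value near 0 indicates agreement no better than chance. Here the two raters agree on four of the six essays, which gives a kappa of about 0.56.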
Using Assessment Data to Drive Continuous Improvement
Once we have assessments in place that assess the full range of expertise and competencies reflected in standards, we could collect student learning data and use the data to continually improve learning outcomes and productivity. For example, such data could be used to create a system of interconnected feedback for students, educators, parents, school leaders, and district administrators.
The goal of creating an interconnected feedback system would be to ensure that key decisions about learning are informed by data and that data are aggregated and made accessible at all levels of the education system for continuous improvement. The challenge associated with this idea is to make relevant data available to the right people, at the right time, and in the right form.
For example, assessment data should be made available to students so they can play a larger role in directing their own learning, as demonstrated by New Tech High and its use of online grade books. (See the sidebar on New Tech High School.)
New Tech High School: Supporting Student Use of Assessment Results Using Technology
New Tech High School in Napa Valley, Calif., has been using innovative technology-based assessment practices since the school was founded in 1996. Instruction at the school emphasizes project-based learning, with students tackling complex, interdisciplinary problems and typically working in groups. New Tech instructors design these projects around both required content standards and core learning outcomes that cut across academic content areas, including collaboration, critical thinking, oral and written communication, use of technology, and citizenship.
By using this common framework in assessing students' work across classes and grade levels, New Tech teachers provide more useful information than could be obtained from a summary grade alone. Assessments of writing skill, for example, are aggregated across all projects and all courses so that teachers, parents, and the students themselves can get a view of strengths and weaknesses across multiple contexts. In addition, these assessments are made available to students in online grade books that are continually updated so that students can see how they are doing not only in each course, but also on each of the learning outcomes, averaged across all their courses. Electronic learning portfolios contain examples of students' work and associated evaluations, also across all classes and grades.
In addition to receiving performance ratings from their teachers and peers, students at New Tech do postproject self-assessments on completion of major projects. These assessments provide feedback for the teacher who designed the project and an opportunity for the student to think about the project experience. The postproject self-assessment template guides the student in reflecting on how successful the project was in terms of the material learned, engagement (interest, relevance), process (for example, how clear project instructions were and whether sufficient scaffolding for student work was available), and self (extent to which the student fulfilled tasks assigned within the group and showed a solid work ethic).
Assessment data also should be used to support educators' efforts to improve their professional practice. Data from student assessments can help teachers become more effective by giving them evidence about which of their instructional practices are working.
In addition, teams of educators reflecting on student data together can identify colleagues who have the most success teaching particular competencies or types of students, and then all team members can learn from the practices used by their most effective colleagues (Darling-Hammond 2010; U.S. Department of Education Office of Planning, Evaluation, and Policy Development 2010). Using student data in this way also could improve educators' collaboration skills and skills in using data to improve instruction. At times, it might be useful to have educators use common assessments to facilitate this kind of professional learning.
The same student-learning data that guide students and educators in their decision making can inform the work of principals and district administrators. Administrators and policymakers should be able to mine assessment data over time to examine the effectiveness of their programs and interventions.
The need for student data plays out at the district level as well. Districts adopt learning interventions they believe will address specific learning needs, but these interventions often rely on untested assumptions and intuition. In a data-driven continuous improvement process, the district could review data on the intervention's implementation and student-learning outcomes after each cycle of use and then use the data as the basis for refining the learning activities or supports for their implementation to provide a better experience for the next group of students. (See sidebar on using technology to make the link between assessment data and instructional resources in Fairfax County, Va.)
Using Technology to Make the Link Between Assessment Data and Instructional Resources
To encourage teachers to make formative use of assessment data, Fairfax County Public Schools (FCPS), Va., developed eCART (Electronic Curriculum Assessment Resource Tool). This Web-based system allows teachers to access everything, from lesson plans to assessment tools, all in one place.
eCART's searchable database provides access to district-approved resources and curriculum correlated to specific standards, benchmarks, and indicators. It allows teachers to create assessments using varied combinations of FCPS common assessment items.
The eCART assessment items were developed by district teachers and designed to provide diagnostic information. The assessments are used to reveal student misconceptions and skills that need to be reinforced.
Using assessment results for their students, Fairfax teachers can follow links to a large library of instructional resources, including supplementary materials, lesson plans, worksheets, and Web links. Students can take eCART assessments online or using pencil and paper.
Student eCART assessment results are stored in the district's data system so that classroom assessment data can be viewed along with benchmark assessment data and results from state tests. Having a common set of formative assessments enables comparisons of student performance across classrooms and schools.
As good as technology-based assessment and data systems might be, educators need support in learning how to use them. An important direction for development and implementation of technology-based assessment systems is the design of technology-based tools that can help educators manage the assessment process, analyze data, and take appropriate action.
Advancing Technical and Regulatory Practice
Two types of challenges to realizing the vision of sharing data across systems are technical and regulatory. On the technical front, multiple student data systems, the lack of common standards for data formats, and limited system interoperability pose formidable barriers to the development of multilevel assessment systems. For example, student and program data today are collected at various levels and in various amounts to address different needs in the educational system. State data systems generally provide macro solutions, institution-level performance management systems are micro solutions, and student data generated by embedded assessment are nano solutions. Providing meaningful, actionable information that is collected across multiple systems will require building agreement on the technical format for sharing data.
To assist with these efforts, the National Center for Education Statistics at the Department of Education has been leading the Common Data Standards (CDS) Initiative, a national, collaborative effort to develop voluntary, common data standards. The CDS Initiative's objective is to help state and local education agencies and higher education organizations work together to identify a minimal set of key data elements, common across organizations and necessary to meet student, policymaker, and educator needs, and come to agreement on definitions, business rules and technical specifications, when possible, to improve the comparability and share-ability of those elements. (Note: Version 1.0 of CDS was released on Sept. 10, 2010.)
As the reliance on data to inform decisions, the public's demand for transparency and accountability, and the world of technology grow exponentially, we must stay vigilant in our efforts to protect student privacy. On the regulatory front, regulations such as the Family Educational Rights and Privacy Act (FERPA) serve the important purpose of protecting student privacy but also can present challenges, if not properly understood and implemented. Much of the confusion surrounding research and data sharing posed by FERPA in its original form was reduced or eliminated through a 2008 revision of FERPA regulations. Still, varying interpretations of FERPA requirements and differences in district and state policies have made data sharing a complex, time-consuming, and expensive process. (See the sidebar on FERPA.)
The Family Educational Rights and Privacy Act (FERPA) is a federal law that protects student privacy by prohibiting the disclosure of personally identifiable information from education records without prior written consent, except as set forth in 34 CFR § 99.31. FERPA also allows parents and "eligible students" (defined as students who are age 18 or over or who attend post-secondary institutions) to inspect and review their education records and to request that inaccuracies in their records be corrected.
Advance written consent is generally required to disclose student-level information from education records, such as student grades, if the information would be linked or linkable to a specific student by a reasonable member of the school community. However, schools may non-consensually disclose basic "directory information" such as student names and phone numbers, if schools give public notice to parents and eligible students that they are designating this type of information as "directory information" and provide parents and eligible students a reasonable period of time after such notice has been given to opt out. Another exception to the requirement of prior, written consent permits teachers or administrators within the same school or school district who have a legitimate educational interest in the student's record to access personally identifiable student data.
In 2008, the FERPA regulations were updated to address the conditions under which FERPA permits the non-consensual disclosure of personally identifiable information from education records for research. In this respect, the regulations were amended to permit the release of de-identified records and information, which requires the redaction of all personally identifiable information per 34 CFR § 99.31(b). The 2008 final regulations also specified the conditions under which states or other state educational authorities that have legal authority to enter into agreements for local educational agencies (LEAs) or post-secondary institutions may enter into agreements to non-consensually disclose personally identifiable information from education records to an organization conducting a study for the LEA or institution under 34 CFR § 99.31(a)(6). The 2008 updates to the FERPA regulations also addressed the non-consensual disclosure of personally identifiable information from education records in a health or safety emergency in 34 CFR §§ 99.31(a)(10) and 99.36.
Source: U.S. Department of Education Family Policy Compliance Office 2010
Advancing the technical and regulatory practices to ensure privacy is maintained and information is secure while aggregating and sharing data would facilitate efficient use of data that are already being collected to make judgments about students' learning progress and the effectiveness of education programs.
Reaching Our Goal
Our education system at all levels will leverage the power of technology to measure what matters and use assessment data for continuous improvement.
To meet this goal, we recommend the following actions:
2.1 States, districts, and others should design, develop, and implement assessments that give students, educators, and other stakeholders timely and actionable feedback about student learning to improve achievement and instructional practices.
Learning science and technology combined with assessment theory can provide a foundation for new and better ways to assess students in the course of learning, which is the ideal time to improve performance. This will require involving experts from all three disciplines in the process of designing, developing, and using new technology-based assessments that can increase the quality and quantity of feedback to learners.
2.2 Build the capacity of educators, educational institutions, and developers to use technology to improve assessment materials and processes for both formative and summative uses.
Technology can support measuring performance that cannot be assessed with conventional testing formats, providing our education system with opportunities to design, develop, and validate new and more effective assessment materials. Building this capacity can be accelerated through knowledge exchange, collaboration, and better alignment between educators (practitioners) and the experts.
2.3 Conduct research and development that explores how embedded assessment technologies, such as simulations, collaboration environments, virtual worlds, games and cognitive tutors, can be used to engage and motivate learners while assessing complex skills.
Interactive technologies, especially games, provide immediate performance feedback so that players always know how they are doing. As a result, they are highly engaging to students and have the potential to motivate students to learn. They also enable educators to assess important competencies and aspects of thinking in contexts and through activities that students care about in everyday life. Because interactive technologies hold this promise, assessment and interactive technology experts should collaborate on research to determine ways to use them effectively for assessment.
2.4 Conduct research and development that explores how UDL can enable the best accommodations for all students to ensure we are assessing what we intend to measure rather than extraneous abilities a student needs to respond to the assessment task.
To be valid, an assessment must measure those qualities it is intended to measure, and scores should not be influenced by extraneous factors. An assessment of science, for example, should measure understanding of science concepts and their application, not the ability to see print, to respond to items using a mouse, or to use word processing skills. Test items and tasks should be designed from the outset to measure the knowledge, skills, and abilities that the test intends to assess and not the students' ability to read when assessing mathematics skills or to self-monitor when completing a science task that includes several steps. Assessment and technology experts should collaborate to create assessment design tools and processes that make it possible to develop assessment systems with appropriate features (not just accommodations) so that assessments capture examinees' strengths in terms of the qualities that the assessment is intended to measure.
2.5 Revise practices, policies, and regulations to ensure privacy and information protection while enabling a model of assessment that includes ongoing gathering and sharing of data for continuous improvement.
Every parent of a student under 18 and every student 18 or over should have the right to access the student's own assessment data in the form of an electronic learning record that the student can carry throughout his or her educational career. At the same time, appropriate safeguards, including stripping records of identifiable information and aggregating data across students, classrooms, and schools, should be used to make it possible to supply education data derived from student records to other legitimate users without compromising student privacy.
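The two safeguards named above, stripping identifiable information and aggregating across students, can be sketched as follows. This is only an illustration under stated assumptions: the field names, the set `IDENTIFYING_FIELDS`, and the `min_group_size` threshold are hypothetical, and removing these fields alone should not be read as meeting any regulatory definition of de-identification.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical field names; which fields count as identifying is an assumption.
IDENTIFYING_FIELDS = {"student_id", "name", "phone"}


def strip_identifiers(record):
    """Return a copy of the record without the assumed identifying fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


def aggregate_by_school(records, min_group_size=3):
    """Mean score per school.

    Groups smaller than min_group_size are withheld so that a published
    average cannot be traced to one or two students. The threshold of 3
    is illustrative, not a regulatory value.
    """
    scores = defaultdict(list)
    for record in records:
        scores[record["school"]].append(record["score"])
    return {
        school: mean(values)
        for school, values in scores.items()
        if len(values) >= min_group_size
    }


records = [
    {"student_id": "s1", "name": "A", "school": "North", "score": 80},
    {"student_id": "s2", "name": "B", "school": "North", "score": 90},
    {"student_id": "s3", "name": "C", "school": "North", "score": 70},
    {"student_id": "s4", "name": "D", "school": "South", "score": 85},
]

deidentified = [strip_identifiers(r) for r in records]
summary = aggregate_by_school(deidentified)
print(summary)
```

In this sample the school with a single student is left out of the summary, which shows why aggregation and identifier removal are usually applied together rather than separately.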