This webpage is part of the Evaluating State Accountability Systems Under ESEA tool, which is designed to help state educational agency (SEA) staff reflect on how the state's accountability system achieves its intended purposes and build confidence in the state's accountability system design decisions and implementation activities. Please visit the tool landing page to learn more about this tool and how to navigate these modules.
A key part of validating a theory of action is determining whether evidence confirms the assumptions and the links between components that yield intended outcomes. For an SEA, the state's accountability system can be considered both a measure that helps the public understand the degree to which schools and districts meet the state's educational objectives and priorities and a policy lever to incentivize actions that help achieve those same objectives and priorities.1 If a state can identify sufficient evidence to uphold the assumptions associated with the state's system of annual meaningful differentiation (AMD), the state may consider the results of that system valid for identifying schools.
SEA staff may use the following reflection prompts to consider whether the evidence generated by the state's system of AMD supports its underlying rationale and whether the SEA can be sufficiently confident that the components of the state's system of AMD are working together as intended. Respond to the following prompts to reflect on the operations and results of the state's system of AMD:
- Read the claim, consideration, and potential sources of evidence.
- Examine the specific evidence available in your state. Reflect on whether you believe you have collected enough evidence to be confident in the claim stated or whether there is a need for further examination.
- Finally, respond to questions that ask whether you (a) have sufficiently explored the confidence claims below and (b) believe you have collected enough evidence to confirm these claims. Some questions may be based on opinion, whereas others will require an examination of data, supplemental analyses, or conversations with other SEA colleagues.
You may print this webpage and use it as a template for note taking if working with colleagues.
For states with non-summative rating systems, skip ahead to the non-summative rating system reflection prompts in Table 7.
For summative rating systems (e.g., index-based systems), begin with the reflection prompts in Table 6.
Table 6: Confidence in the Operations and Results of the State's System of AMD for Summative Rating Systems
Claim 1: School rankings and groupings created using the state's system of AMD reflect data as intended and expected.

- Consideration 1.1: Rankings generated through the state's system of AMD reflect expectations based on design and policy objectives.
- Consideration 1.2: Rankings generated through the state's system of AMD reflect expectations based on simulations and historical data.
- Consideration 1.3: School ratings align with outcome data exhibited by schools.
- Consideration 1.4: Groupings of school ratings align with outcome data (i.e., indicator performance) exhibited by schools.
When compared with prior iterations of the state's accountability system, the current iteration may have similar priorities or be drastically different from systems used in the past. How schools are ranked by overall index scores, for example, or grouped by an overall rating, such as a star rating or letter grade, is key evidence for understanding how the state's system of AMD functions when operational. For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.
Reflection Prompts | Notes
---|---
Key questions: Are schools grouped, ranked, or clustered appropriately? |
Why is this important? Examining overall school score or rating distributions is an important step in determining whether schools receive the ratings you would expect. Examine measures of central tendency and spread to understand the distribution of school scores (see the sketch after this table). |
Key evidence checks: |
Potential next steps: Based on empirical analyses, consider the range of overall school scores or ratings. If schools are too tightly clustered, it may be difficult to differentiate them meaningfully. Determine whether the lack of spread and/or differentiation is due to how indicators are weighted and/or transformed. These topics are addressed in more detail in Modules 3A-3E: Indicators. |
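As a concrete illustration, a minimal pandas sketch for this check might look like the following. It is not part of the tool itself; the file name (schools.csv) and columns (overall_score, rating) are hypothetical placeholders for a state's own data.

```python
import pandas as pd

# Hypothetical extract of operational results; file and column names are placeholders.
schools = pd.read_csv("schools.csv")  # columns: school_id, overall_score, rating

# Central tendency and spread of overall school scores.
print(schools["overall_score"].describe())  # count, mean, std, quartiles, min/max

# How schools fall across ratings, as counts and percentages.
counts = schools["rating"].value_counts().sort_index()
pct = (counts / len(schools) * 100).round(1)
print(pd.concat({"n": counts, "pct": pct}, axis=1))
```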
Reflection Prompts | Notes
---|---
Key questions: Have you conducted simulations using prior data to support comparisons with operational data? If so, are the results consistent? Are the results surprising or unexpected? If you have not run simulations, do you have sufficient historical accountability results to support comparisons with operational data? |
Why is this important? School performance typically exhibits some consistency over time, which makes notable changes in performance easier to identify. A baseline can be a useful comparison for new systems (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
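If simulated results are available, a simple comparison can quantify how closely they track operational results. This sketch assumes hypothetical files (simulated_ratings.csv, operational_ratings.csv) with matching school_id and rating columns.

```python
import pandas as pd

# Hypothetical files: ratings simulated from prior-year data vs. operational ratings.
sim = pd.read_csv("simulated_ratings.csv")    # columns: school_id, rating
ops = pd.read_csv("operational_ratings.csv")  # columns: school_id, rating

merged = sim.merge(ops, on="school_id", suffixes=("_sim", "_ops"))

# Share of schools receiving the same rating in both runs.
agreement = (merged["rating_sim"] == merged["rating_ops"]).mean()
print(f"Exact rating agreement: {agreement:.1%}")

# The cross-tabulation shows exactly where simulated and operational results diverge.
print(pd.crosstab(merged["rating_sim"], merged["rating_ops"]))
```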
Reflection Prompts | Notes
---|---
Key questions: What is the relationship between overall school ratings/scores and indicator performance? |
Why is this important? The relationship between ratings and indicators can be a function of how indicators are weighted in the system. For example, variations in school ratings will differ when comparing a system that weights student academic growth more heavily with a system that weights student academic achievement more heavily (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
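One way to examine this relationship empirically is to correlate each indicator with the overall score and compare indicator means across rating bands. The indicator column names below are hypothetical placeholders.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical file and columns
indicators = ["achievement", "growth", "grad_rate", "progress_el", "sqss"]

# Correlation of each indicator with the overall score; more heavily weighted
# indicators should generally show stronger relationships.
print(schools[indicators + ["overall_score"]].corr()["overall_score"].round(2))

# Mean indicator performance within each rating band.
print(schools.groupby("rating")[indicators].mean().round(1))
```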
Reflection Prompts | Notes
---|---
Key questions: How are schools grouped by rating or overall score? Do these groupings result in commonalities or trends in school indicators? |
Why is this important? School groupings might provide insight into how many meaningful groups of schools exist. Although this examination is empirically driven, it can inform our understanding of later examinations of differentiation and of how empirical groupings compare with policy-driven categories (see the sketch after this table). Discrepancies are not indicative of problems with state categories but may indicate that groupings are not as empirically different as expected. |
Key evidence checks: |
Potential next steps: |
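One hedged way to explore how many empirically meaningful groups exist is a simple cluster analysis followed by a comparison of clusters with policy ratings. The sketch below uses scikit-learn's KMeans as one of many possible approaches; all file and column names are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

schools = pd.read_csv("schools.csv")  # hypothetical file and columns
indicators = ["achievement", "growth", "grad_rate", "progress_el", "sqss"]

# Standardize indicators so no single scale dominates the clustering.
complete = schools.dropna(subset=indicators).copy()
X = StandardScaler().fit_transform(complete[indicators])

# Ask for as many clusters as there are policy rating categories.
k = complete["rating"].nunique()
complete["cluster"] = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Strong diagonal dominance in the cross-tabulation suggests the policy
# groupings are also empirically distinct; a muddled table suggests otherwise.
print(pd.crosstab(complete["cluster"], complete["rating"]))
```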
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.

Claim 1 Reflection Questions | Claim 1 Response
---|---
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is grouping or ranking schools and whether it works as expected. | Yes / No
We have collected enough evidence to sufficiently address key questions and can confirm the state's system of AMD reflects expectations based on design and policy objectives. | Yes / No
Claim 2: Results from the state's system of AMD reflect meaningful differentiation among schools.

- Consideration 2.1: The rating distribution across schools has face validity.
- Consideration 2.2: School and indicator scores or results are distributed at intervals that reflect meaningful differences.
- Consideration 2.3: The overall rating or indicator results at the lower and higher thresholds of each rating are reasonable and defensible.

A key purpose of the state's system of AMD is grouping schools. This is usually a function of how performance standards are set (i.e., what constitutes an "A" rating, 5 stars, or some other top rating). However, it is important to determine how the standard-setting process impacts operational ratings of schools. We should try to understand both the "average" characteristics of schools that receive particular ratings and the ways in which characteristics differ across ratings. For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.
Reflection Prompts | Notes
---|---
Key questions: To what extent are schools distributed across the available ratings in the state's system of AMD? |
Why is this important? Although a relatively straightforward examination, the distribution of schools across the possible ratings is an important piece of evidence to support the face validity of the state's system of AMD. Too many schools rated unexpectedly high or low can lead to misinterpretation by educators and the public (see the sketch after this table). |
Key evidence checks: |
Potential next steps: If performance distributions do not match expectations based on policy objectives and articulated expectations, consider revisiting performance standards to better align the design, or clarify why there may be a mismatch between long-term goals under ESEA section 1111 and results. |
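If the state articulated expected rating shares during design, the observed distribution can be compared against them directly. The expected percentages below are invented for illustration.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical file; column: rating

# Hypothetical policy expectations for the share of schools at each rating.
expected_pct = pd.Series({"A": 15, "B": 30, "C": 35, "D": 15, "F": 5})

observed_pct = (schools["rating"].value_counts(normalize=True) * 100).round(1)
comparison = pd.DataFrame({"expected_pct": expected_pct,
                           "observed_pct": observed_pct}).fillna(0)
comparison["gap"] = comparison["observed_pct"] - comparison["expected_pct"]
print(comparison)
```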
Reflection Prompts | Notes
---|---
Key questions: To what extent do differences in school ratings reflect meaningful differences in indicator results? |
Why is this important? Although indicator results often serve as a proxy for behavioral characteristics, they are an important window into understanding performance in the present and over time. When exploring whether differences exist as an artifact of policy decisions, data characteristics, or both, various pieces of evidence should be examined (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
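A standardized mean difference between adjacent rating bands is one rough gauge of whether rating differences reflect meaningful indicator differences. The rating order and column names below are hypothetical, and the standard deviation used here is a simple approximation rather than a strictly pooled value.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical file and columns
indicators = ["achievement", "growth", "grad_rate"]
order = ["F", "D", "C", "B", "A"]  # hypothetical rating scale, low to high

# Standardized gap between each pair of adjacent rating bands; values near
# zero suggest adjacent ratings are not meaningfully different on an indicator.
for lo, hi in zip(order, order[1:]):
    low = schools.loc[schools["rating"] == lo, indicators]
    high = schools.loc[schools["rating"] == hi, indicators]
    sd = pd.concat([low, high]).std()  # simple (not strictly pooled) SD
    print(f"{lo} vs. {hi}:")
    print(((high.mean() - low.mean()) / sd).round(2))
```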
Reflection Prompts | Notes
---|---
Key questions: What are the ranges of performance within each school rating for overall scores and for each indicator, and are these ranges reasonable? |
Why is this important? The variability for schools receiving a particular rating will likely be greater for indicator performance than for overall performance. Understanding the range and characteristics of performance among schools receiving each rating can help us better understand differences in school performance (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
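Within-rating ranges can be summarized with a quick groupby. As before, the file and column names are hypothetical placeholders.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical file and columns

# Range and interquartile spread of overall scores within each rating band.
summary = schools.groupby("rating")["overall_score"].agg(
    ["min", "max", lambda s: s.quantile(0.75) - s.quantile(0.25)])
summary.columns = ["min", "max", "iqr"]
print(summary.round(1))

# The same view for a single indicator will often show much wider ranges.
print(schools.groupby("rating")["achievement"].agg(["min", "max", "mean"]).round(1))
```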
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.

Claim 2 Reflection Questions | Claim 2 Response
---|---
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is differentiating schools. | Yes / No
We have collected enough evidence to sufficiently address key questions and can confirm that the state's system of AMD is differentiating schools appropriately. | Yes / No
Claim 3: Results from the state's system of AMD align with objectives and policies around subgroups and school size/setting demographics as expected.

- Consideration 3.1: School-level ratings align with objectives for subgroups.
- Consideration 3.2: The results from the state's system of AMD are not overly influenced by school size, and the impact of location is understood.

State systems of AMD often prioritize equity or equitable performance as an objective. It is important to understand the degree to which the state's accountability system and system of AMD results are sensitive to issues of equity. This may be explicitly or implicitly defined in identification decisions or decision rules. For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.
Reflection Prompts | Notes
---|---
Key questions: How are objectives for subgroups embedded into performance expectations for indicators, overall school ratings, or long-term goals and measurements of interim progress? |
Why is this important? Identifying how subgroup performance, subgroup characteristics, and subgroup expectations relate to the results of the state's system of AMD is important to understanding how well the rationale behind the system can be confirmed (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
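One illustrative check is to look at subgroup performance within each rating band. This sketch assumes a hypothetical long-format file with one row per school-subgroup combination.

```python
import pandas as pd

# Hypothetical files; one row per school x subgroup in subgroup_results.csv.
sub = pd.read_csv("subgroup_results.csv")  # columns: school_id, subgroup, achievement
schools = pd.read_csv("schools.csv")       # columns: school_id, rating

merged = sub.merge(schools[["school_id", "rating"]], on="school_id")

# Mean subgroup achievement within each rating band; large subgroup gaps
# inside highly rated schools may conflict with stated equity objectives.
print(merged.pivot_table(index="rating", columns="subgroup",
                         values="achievement", aggfunc="mean").round(1))
```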
Reflection Prompts | Notes
---|---
Key questions: In addition to subgroup characteristics, to what extent are results from the state's system of AMD influenced by school size and location (e.g., rural or isolated schools)? |
Why is this important? Although location may be a function of other school characteristics, it is important to identify any cases where systematic trends emerge and whether the influence of school size or location is expected or by design (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
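School size and location effects can be screened with a correlation and a couple of group summaries. The enrollment and locale columns are hypothetical placeholders.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical: enrollment, locale, overall_score

# A strong correlation between size and score may indicate the system
# systematically advantages larger (or smaller) schools.
print(round(schools["enrollment"].corr(schools["overall_score"]), 2))

# Mean overall score by enrollment quartile and by locale.
schools["size_quartile"] = pd.qcut(schools["enrollment"], 4,
                                   labels=["smallest", "small", "large", "largest"])
print(schools.groupby("size_quartile", observed=True)["overall_score"].mean().round(1))
print(schools.groupby("locale")["overall_score"].mean().round(1))
```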
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.

Claim 3 Reflection Questions | Claim 3 Response
---|---
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is detecting subgroup or school-size characteristics as intended. | Yes / No
We have collected enough evidence to sufficiently address the key questions and can confirm the state's system of AMD is detecting subgroup or school-size characteristics as intended. | Yes / No
[Click here to go back to the tool home page]
Table 7: Confidence in the Operations and Results of the State's System of AMD for Non-Summative Rating Systems
Claim 1: School results and groupings created via the state's system of AMD reflect data as intended and expected.

- Consideration 1.1: School results from the state's system of AMD reflect expectations based on design and policy objectives.
- Consideration 1.2: School results from the state's system of AMD reflect expectations based on simulations and historical data.
- Consideration 1.3: School profiles align with outcome data exhibited by schools.
- Consideration 1.4: School-rating profiles align with outcome data (i.e., indicator performance) exhibited by schools.
When compared with prior state accountability systems, the current iteration of the state's accountability system may have similar priorities or be drastically different. Understanding how schools are identified is key evidence for understanding how the state's system of AMD is functioning when operational. For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.
Reflection Prompts | Notes
---|---
Key questions: Do decision rules for the state's system of AMD result in a reasonable distribution of results? |
Why is this important? Non-summative systems do not prioritize overall school scores but still provide a wealth of data to the public and educators through the reporting of indicators (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
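For non-summative systems, a per-indicator summary of reported result levels can serve the same screening purpose. The level columns below are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical file with one categorical result level per indicator per school.
schools = pd.read_csv("schools.csv")

# Share of schools at each reported level, indicator by indicator.
for col in ["achievement_level", "growth_level", "grad_rate_level"]:
    print(schools[col].value_counts(normalize=True).round(2), "\n")
```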
Reflection Prompts | Notes
---|---
Key questions: Have you conducted simulations using prior data to support comparisons with operational data once schools are identified? If you have not run simulations, do you have sufficient historical accountability results to support comparisons with operational data? |
Why is this important? With non-summative systems, the examination of performance over time depends on indicator performance over time, which requires a baseline. |
Key evidence checks: |
Potential next steps: |
Reflection Prompts | Notes
---|---
Key questions: How do the indicator results reflect expected and meaningful differences between schools identified for improvement and other schools across the state? |
Why is this important? Non-summative systems must still categorize schools identified as in need of improvement (i.e., for comprehensive support and improvement [CSI], targeted support and improvement [TSI], or additional targeted support and improvement [ATSI]). These concepts are explored more deeply in Module 4: CSI and Module 5: TSI and ATSI (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
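A standardized comparison of indicator means between identified and nonidentified schools is one way to make these differences concrete. The identified flag and indicator columns are hypothetical.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical: identified (True/False) + indicators
indicators = ["achievement", "growth", "grad_rate"]

means = schools.groupby("identified")[indicators].mean()
sds = schools[indicators].std()

# Standardized gap between nonidentified and identified schools; gaps near
# zero would suggest identification is not tracking indicator performance.
gap = (means.loc[False] - means.loc[True]) / sds
print(gap.round(2))
```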
Reflection Prompts | Notes
---|---
Key questions: How well do data from the indicators support the grouping of schools? |
Why is this important? In the case of non-summative systems, it may be helpful to study the differences between identified and nonidentified schools. This examination can help provide insights into the discrepancies between policy and empirical observations. |
Key evidence checks: |
Potential next steps: |
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.

Claim 1 Reflection Questions | Claim 1 Response
---|---
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is grouping or ranking schools as expected. | Yes / No
We have collected enough evidence to sufficiently address key questions and can confirm the state's system of AMD reflects expectations based on design and policy objectives. | Yes / No
Claim 2: Results from the state's system of AMD reflect meaningful differentiation among schools.

- Consideration 2.1: School profiles have face validity.
- Consideration 2.2: School/indicator results are distributed at intervals that reflect meaningful differences.
- Consideration 2.3: The overall or indicator results of schools at the lower and higher thresholds of each rating are reasonable and defensible.

A key purpose of the state's system of AMD is determining how to assign accountability ratings to schools. This is usually a function of how performance standards are set. However, it is important to determine how the standard-setting process impacts which schools are identified for CSI, TSI, and ATSI. We should try to understand both the characteristics of schools within categories and the ways in which characteristics differ across categories. For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.
Reflection Prompts | Notes
---|---
Key questions: To what extent is the distribution or grouping of schools reasonable when comparing identified and nonidentified schools? |
Why is this important? Many of the distinctions between identified and nonidentified schools will be based on the policy objectives; the order of decisions for CSI, TSI, and ATSI identification; and the performance expectations for CSI, TSI, and ATSI schools (see Module 4: CSI and Module 5: TSI and ATSI for more detail). |
Key evidence checks: |
Potential next steps: |
Reflection Prompts | Notes
---|---
Key questions: How do the decision rules in your system affect CSI, TSI, and ATSI identification processes? |
Why is this important? School groupings in non-summative systems are primarily focused on how schools are identified for CSI, TSI, or ATSI. Understanding the distribution or groupings of schools (Claim 1 above) is key to understanding how data influence identification (see the sketch after this table). |
Key evidence checks: The identification of the required categories of schools using the state's system of AMD is explored more deeply in Module 4: CSI and Module 5: TSI and ATSI. |
Potential next steps: |
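Decision-rule sensitivity can be explored by applying a candidate rule to data and re-running it with variations. The rule below, flagging the lowest-performing 5 percent of Title I schools on a hypothetical composite measure, is only an illustration; a state's actual rules, measures, and decision ordering will differ.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical: title_i (bool), composite_measure

# Illustrative rule only: flag the lowest-performing 5 percent of Title I
# schools on a hypothetical composite measure.
title_i = schools[schools["title_i"]].copy()
cutoff = title_i["composite_measure"].quantile(0.05)
title_i["csi_flag"] = title_i["composite_measure"] <= cutoff

print(f"{title_i['csi_flag'].sum()} schools flagged; cutoff = {cutoff:.1f}")
# Re-running with a modified rule or a different decision order shows how
# sensitive identification counts are to each choice.
```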
Reflection Prompts | Notes
---|---
Key questions: How much variability is present in school profiles based on indicator data? |
Why is this important? Because there will be a large range of performance across indicators, it is important to understand how to interpret the differences among those schools at the edge of performance thresholds for identification (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
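Schools sitting near an identification threshold deserve particular scrutiny. This sketch assumes a hypothetical composite_measure column and an invented cut score.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical: school_id, composite_measure
cut = 45.0  # hypothetical identification threshold

# Schools within a narrow band around the threshold; differences among these
# schools may be too small to interpret as real performance differences.
band = schools[(schools["composite_measure"] - cut).abs() <= 2.0]
print(len(band), "schools within 2 points of the threshold")
print(band[["school_id", "composite_measure"]].sort_values("composite_measure"))
```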
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.

Claim 2 Reflection Questions | Claim 2 Response
---|---
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is differentiating schools. | Yes / No
We have collected enough evidence to sufficiently address key questions and can confirm the state's system of AMD is differentiating schools appropriately. | Yes / No
Claim 3: Results from the state's system of AMD align with objectives and policies around subgroups and school size/setting demographics as expected.

- Consideration 3.1: School-level profile results align with objectives for subgroups.
- Consideration 3.2: School-level profile results are not overly influenced by school size, and the impact of location is understood.

State systems of AMD often prioritize equity or equitable performance as a policy objective. It is important to understand the degree to which the results from the state's system of AMD are sensitive to issues of equity. This may be explicitly or implicitly defined in identification decisions or decision rules. For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.
Reflection Prompts | Notes
---|---
Key questions: To what extent are policy objectives for subgroup performance, identification, and improvement addressed by the state's system of AMD? |
Why is this important? The identification of statutorily required school types, particularly TSI and ATSI schools (examined more deeply in Module 5: TSI and ATSI), is especially susceptible to changes in subgroup characteristics. |
Key evidence checks: |
Potential next steps: |
Reflection Prompts | Notes
---|---
Key questions: In addition to subgroup characteristics, to what extent are school identifications (i.e., CSI, TSI, and ATSI) influenced by school size and location? |
Why is this important? Although some school characteristics may be related to location, it is important to identify any cases where systematic trends emerge and whether the influence of school size or location is expected or by design (see the sketch after this table). |
Key evidence checks: |
Potential next steps: |
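Identification rates by size quartile and locale offer a quick screen for unintended size or setting effects. The identified, enrollment, and locale columns are hypothetical.

```python
import pandas as pd

schools = pd.read_csv("schools.csv")  # hypothetical: identified, enrollment, locale

schools["size_quartile"] = pd.qcut(schools["enrollment"], 4,
                                   labels=["smallest", "small", "large", "largest"])

# Identification rates by school size and locale; sharply uneven rates may
# signal sensitivity to size or setting rather than to performance.
print(schools.groupby("size_quartile", observed=True)["identified"].mean().round(2))
print(schools.groupby("locale")["identified"].mean().round(2))
```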
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.

Claim 3 Reflection Questions | Claim 3 Response
---|---
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is detecting subgroup, school-size, or location characteristics as intended. | Yes / No
We have collected enough evidence to sufficiently address the key questions and can confirm the state's system of AMD is detecting subgroup, school-size, or location characteristics as intended. | Yes / No
[Click here to continue on to Module 2B: Indicator Interaction in the State's System of AMD]
1 See Accountability Identification is only the Beginning: Monitoring and Evaluating Accountability Results and Implementation from the Council of Chief State School Officers for more information. Please note: The inclusion of links to resources and examples does not reflect their importance, nor is it intended to represent or be an endorsement by the U.S. Department of Education (ED) of any views expressed or materials provided. ED does not control or guarantee the accuracy, relevance, timeliness, or completeness of any outside information included in this document.