
Module 2A, Section 3: Confidence in the Operations and Results of the State's System of AMD

Module 2A: State's System of Annual Meaningful Differentiation (AMD)

This webpage is part of the Evaluating State Accountability Systems Under ESEA tool, which is designed to help state educational agency (SEA) staff reflect on how the state's accountability system achieves its intended purposes and build confidence in the state's accountability system design decisions and implementation activities. Please visit the tool landing page to learn more about this tool and how to navigate these modules.

A key part of validating a theory of action is to determine whether evidence confirms the assumptions and links between components that yield intended outcomes. For an SEA, the state's accountability system can be considered a measure that helps the public understand the degree to which schools and districts meet the state's educational objectives and priorities as well as a policy lever to incentivize actions that help achieve those same objectives and priorities.1 If a state can identify sufficient evidence to uphold the assumptions associated with the state's system of AMD, the state may consider the results of the state's system of AMD valid for identifying schools.

SEA staff may use the following reflection prompts to consider whether the evidence generated by the state's system of AMD supports the underlying rationale, and whether the SEA can be sufficiently confident that the pieces of the state's system of AMD are working together as intended. Respond to the following prompts to engage in reflection on the operations and results of the state's system of AMD:

  1. Read the claim, consideration, and potential sources of evidence.
  2. Examine the specific evidence available in your state. Reflect on whether you believe you have collected enough evidence to be confident in the claim stated or whether there is a need for further examination.
  3. Finally, respond to questions that ask whether you (a) have sufficiently explored the confidence claims below and (b) believe that you have collected enough evidence that these claims can be confirmed. Some questions may be based on opinion, whereas others will require an examination of data, supplemental analyses, or conversations with other SEA colleagues.

You may print this webpage and use it as a template for note taking if working with colleagues.

For states with non-summative rating systems, please click on the link below to jump to the non-summative rating system reflection prompt section (Table 7).

For summative rating systems (e.g., index-based systems), please see the reflection prompts in Table 6.

Table 6: Confidence in the Operations and Results of the State's System of AMD for Summative Rating Systems

Claim 1: School rankings and groupings created using the state's system of AMD reflect data as intended and expected.
Consideration 1.1: Rankings generated through the state's system of AMD reflect expectations based on design and policy objectives.
Consideration 1.2: Rankings generated through the state's system of AMD reflect expectations based on simulations and historical data.
Consideration 1.3: School ratings align with outcome data exhibited by schools.
Consideration 1.4: Groupings of school ratings align with outcome data (i.e., indicator performance) exhibited by schools.

When compared with prior iterations of the state's accountability system, the current iteration may have similar priorities or be drastically different from systems used in the past. How schools are ranked by overall index scores, for example, or grouped by an overall rating, such as a star rating or letter grade, is a key set of evidence for understanding how the state's system of AMD is functioning when operational.

For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.

Reflection Prompts | Notes

Key questions:

Are schools grouped, ranked, or clustered appropriately?

 

Why is this important?

Examining overall school scores or rating distributions is an important step in determining whether schools receive the ratings you would expect. Examine measures of central tendency and spread to determine the range of school scores.

 

Key evidence checks:

  • Determine the extent to which the variation in overall scores and indicator results is expected.
  • Consider whether measures that comprise indicators have been modified or corrected as needed. Determine whether these transformations are reasonable and promote the appropriate differentiation of schools.
  • Examine school-performance profiles across indicators (if applicable) and consider whether the data are as expected (e.g., similarity or variability across and within schools).
  • Determine whether school-performance profiles based on indicators are reasonable or vary unpredictably.
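
As a rough illustration of the distribution checks described above, the Python sketch below summarizes the spread of overall scores and indicator results with pandas. The file name ("school_results.csv") and column names are hypothetical placeholders for a state's own data extract, not part of this tool.

```python
# Minimal sketch: summarize the distribution of overall scores and indicator results.
# The file name and column names below are hypothetical placeholders.
import pandas as pd

schools = pd.read_csv("school_results.csv")  # hypothetical extract: one row per school

score_cols = ["overall_score", "achievement", "growth", "elp_progress", "grad_rate"]

# Central tendency and spread for each score column.
summary = schools[score_cols].describe().T
summary["skew"] = schools[score_cols].skew()
summary["iqr"] = summary["75%"] - summary["25%"]
print(summary[["mean", "50%", "std", "min", "max", "iqr", "skew"]])

# How many schools fall within a narrow band of the median overall score?
# (The +/- 2 band assumes a hypothetical 100-point scale; adjust to your scale.)
median = schools["overall_score"].median()
tight = schools["overall_score"].between(median - 2, median + 2).mean()
print(f"Share of schools within +/- 2 points of the median: {tight:.1%}")
```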
 

Potential next steps:

Based on empirical analyses, consider the range of overall school scores or ratings. If schools are too tightly clustered, it may be difficult to differentiate schools meaningfully. Determine whether the lack of spread and/or differentiation is due to how indicators are weighted and/or how indicators are transformed. This is addressed in more detail in Modules 3A-3E: Indicators.

 
Reflection Prompts | Notes

Key questions:

Have you conducted simulations using prior data to support comparisons with operational data? If so, are the results consistent? Are the results surprising or unexpected? If you have not run simulations, do you have sufficient historical accountability results to support comparisons with operational data?

 

Why is this important?

On average, school performance exhibits some consistency over time, which helps us identify notable changes in performance. A baseline can be a useful comparison for new systems.

 

Key evidence checks:

  • Compare operational overall results and indicator results to results from simulations. Determine whether comparisons between the two are reasonable.
    • Consider whether changes in indicator-level results have unexpected influence on overall school ratings.
    • If there are major changes to measures, consider whether you need to collect additional data to determine whether trends remain similar.
  • If simulation data are not available, compare operational results to historical accountability results.
    • Determine whether similarities or differences are reasonable, based on the design of the system.
  • Identify any outliers and determine whether they are due to idiosyncrasies in the data or if they reflect something more systemic.
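
The comparison described in these checks can be scripted. The sketch below assumes hypothetical extracts of operational and simulated (or prior-year) results keyed by a school_id column and flags schools whose overall score shifted far more than is typical; file and column names are placeholders only.

```python
# Minimal sketch: compare operational results with simulated (or prior-year) results
# and flag outlier schools. File and column names are hypothetical placeholders.
import pandas as pd
from scipy import stats

current = pd.read_csv("operational_results.csv")   # school_id, overall_score, ...
baseline = pd.read_csv("simulated_results.csv")    # school_id, overall_score, ...

merged = current.merge(baseline, on="school_id", suffixes=("_oper", "_sim"))
merged["diff"] = merged["overall_score_oper"] - merged["overall_score_sim"]

print(merged["diff"].describe())  # is the typical shift reasonable?

# Flag schools whose change is more than 3 standard deviations from the mean change.
merged["diff_z"] = stats.zscore(merged["diff"])
outliers = merged.loc[merged["diff_z"].abs() > 3, ["school_id", "diff"]]
print(outliers)
```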
 

Potential next steps:

  • Compare operational data with either historical or simulated data to help understand whether there is unexpected or excessive variation in indicator results, overall scores, and summative ratings.
  • Although some variation is expected, schools should not necessarily show large changes over time unless some reasonable explanation exists (e.g., strong intervention, changes in school leadership). Determine whether there are any major changes or issues with data that could cause this volatility, or if you need to more deeply examine how indicators are interacting.
  • Examine trends over time to determine whether current and future operational results are reflecting expected ranges of improvement. Unexpected results may lead to difficult-to-achieve performance expectations and may require additional explanation. It may be important to prepare communication materials to support this.
  • Empirical issues that are difficult to explain may require an examination of how the indicators are combined, which is addressed in Modules 3A-3E: Indicators.
 
Reflection Prompts | Notes

Key questions:

What is the relationship between overall school ratings/scores and indicator performance?

 

Why is this important?

The relationship between ratings and indicators can be a function of how indicators are weighted in the system. For example, variations in school ratings will differ when comparing a system that weights student academic growth more heavily with a system that weights student academic achievement more heavily.

 

Key evidence checks:

  • Determine the magnitude and direction of the relationship among indicators and between indicators and school ratings (e.g., correlation).
  • Determine the indicators that drive the most change in overall school ratings, and whether this influence is expected and intended (e.g., through regression analyses, factor analyses).
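
One way to quantify the relationships named above is a correlation matrix plus a standardized regression, whose coefficients approximate each indicator's "effective weight" on the overall score. The sketch below is illustrative only; the file and indicator column names are hypothetical placeholders.

```python
# Minimal sketch: correlations among indicators and standardized-regression
# "effective weights" on the overall score. Names are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LinearRegression

schools = pd.read_csv("school_results.csv")
indicators = ["achievement", "growth", "elp_progress", "grad_rate", "sqss"]

data = schools[indicators + ["overall_score"]].dropna()

# Magnitude and direction of relationships among indicators and with the overall score.
print(data.corr().round(2))

# Standardize, then regress the overall score on the indicators; the standardized
# coefficients approximate each indicator's effective weight and can be compared
# with the policy (intended) weights.
z = (data - data.mean()) / data.std()
model = LinearRegression().fit(z[indicators], z["overall_score"])
effective_weights = pd.Series(model.coef_, index=indicators).sort_values(ascending=False)
print(effective_weights)
```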
 

Potential next steps:

  • The magnitude and direction of the relationship between indicators is an important clue to understanding which indicators are driving changes in the overall system. If indicators are too highly correlated, examine the predictive power of individual indicators to determine whether school score variation is inappropriately influenced by one or more indicators.
  • When comparing the policy weights with actual weights (i.e., the degree to which indicators predict the school ratings), large mismatches may result in misclassifications, misleading ratings, or unintended influence. Examine the magnitude of these differences to help determine whether changes need to be made to weights or business rules in the state's system of AMD.
 
Reflection Prompts | Notes

Key questions:

How are schools grouped by rating or overall score? Do these groupings result in commonalities or trends in school indicators?

 

Why is this important?

School groupings might provide insight into how many meaningful groups of schools exist. Although this is decidedly empirically driven, it can help inform our understanding about later examinations of differentiation and how empirical groupings compare with policy-driven categories. Discrepancies are not indicative of problems with state categories but may indicate that groupings are not as empirically different as expected.

 

Key evidence checks:

  • Conduct categorical analyses of schools (e.g., k-means clustering, discriminant analyses) to determine whether school groupings reflect intended school-performance categories.
  • Compare operational groupings with historical or simulated groupings of schools, and determine whether differences are expected based on the number of categories that you detected in prior analyses.
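
The categorical analysis mentioned above (e.g., k-means clustering) might look like the sketch below, which clusters schools on standardized indicator results and cross-tabulates the clusters against operational ratings. The column names and the choice of five clusters are hypothetical assumptions, not requirements of this tool.

```python
# Minimal sketch: k-means clustering of indicator results, cross-tabulated against
# the operational ratings. Column names and cluster count are hypothetical.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

schools = pd.read_csv("school_results.csv")
indicators = ["achievement", "growth", "elp_progress", "grad_rate", "sqss"]

complete = schools.dropna(subset=indicators).copy()
X = StandardScaler().fit_transform(complete[indicators])

# Ask for as many clusters as there are rating categories (e.g., five star levels).
complete["cluster"] = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

# If empirical groupings line up with ratings, most schools in a rating share a cluster.
print(pd.crosstab(complete["rating"], complete["cluster"]))
```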
 

Potential next steps:

  • Through categorical analyses, determine whether there are naturally occurring groups of schools across the range of overall school ratings. Naturally occurring groups may not correspond to the various cut scores for the system. This may be a function of performance standards not being set based on common characteristics of schools. However, if there is a strong rationale as to why performance standards are defined independent of data, the lack of correspondence may be expected. Revisit your performance standard-setting process to ensure it is defensible and reflects school-performance expectations as intended in light of operational performance (this is addressed in greater detail under Claim 2 below).
  • If major differences exist across groupings or rankings of schools, confirm that this is reflected in the state's system of AMD rationale. If differences do not reflect the rationale, there may be indicator interactions that are not functioning as planned, which is addressed in Modules 3A-3E: Indicators.
 
Claim 1 Reflection Questions | Claim 1 Response
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is grouping or ranking schools and whether it works as expected. | Yes / No
We have collected enough evidence to sufficiently address key questions and can confirm the state's system of AMD reflects expectations based on design and policy objectives. | Yes / No
Claim 2: Results from the state's system of AMD reflect meaningful differentiation among schools.
Consideration 2.1: The rating distribution across schools has face validity.
Consideration 2.2: School and indicator scores or results are distributed at intervals that reflect meaningful differences.
Consideration 2.3: The overall rating or indicator results at the lower and higher thresholds of each rating are reasonable and defensible.

A key purpose of the state's system of AMD is grouping schools. This is usually a function of how performance standards are set (i.e., what constitutes an "A" rating, 5 stars, or some other top rating). However, it is important to determine how the standard-setting process impacts operational ratings of schools. We should try to understand both the "average" characteristics of schools that receive particular ratings and the ways in which characteristics differ across ratings.

For each consideration, review the key questions presented, and use the key evidence checks to help answer those questions.

Reflection Prompts | Notes

Key questions:

To what extent are schools distributed across the available ratings in the state's system of AMD?

 

Why is this important?

Although a relatively straightforward examination, the distribution of schools across the possible ratings is an important piece of evidence to support the face validity of the state's system of AMD. Too many mismatched high- or low-rating schools can lead to misinterpretation by educators and the public.

 

Key evidence checks:

  • Compare the number of schools receiving each rating with policy objectives and the state's system of AMD rationale to determine whether the results are reasonable.
  • Compare the number of schools receiving each rating with any simulations or historical data, and determine whether results match your expectations.
 

Potential next steps:

If performance distributions do not match expectations based on policy objectives and articulated expectations, consider revisiting performance standards to better align the design or clarify why there may be a mismatch between long-term goals under ESEA section 1111 and results.

 
Reflection Prompts | Notes

Key questions:

To what extent do differences in school ratings reflect meaningful differences in indicator results?

 

Why is this important?

Although indicator results often serve as a proxy for behavioral characteristics, they are an important window into understanding performance in the present and over time. When exploring whether differences exist as an artifact of policy decisions, data characteristics, or both, various pieces of evidence should be examined.

 

Key evidence checks:

  • Examine the range between school/indicator scores in the middle of the score distribution compared to the lower- and higher-performing schools (i.e., interquartile range vs. the lower and upper quartiles) if the system produces scores.
  • Identify whether schools are "clustering" around a given school/indicator result (e.g., identify any multimodal tendencies in the school distribution).
  • Examine school/indicator scores for distinctness by decile (i.e., how different are the school and indicator scores at every 10th percentile?).
  • Examine indicator-result variance at or near performance cut-points.
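
As a loose illustration of the decile and clustering checks above, the sketch below looks at score spacing by decile and counts local peaks in a kernel density estimate of the overall score distribution. File and column names are hypothetical placeholders.

```python
# Minimal sketch: score spacing by decile and a rough multimodality check on the
# overall score distribution. Names are hypothetical placeholders.
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

schools = pd.read_csv("school_results.csv")
scores = schools["overall_score"].dropna()

# Score at every 10th percentile: are adjacent deciles meaningfully different?
deciles = scores.quantile(np.linspace(0.1, 0.9, 9))
print(deciles.diff())  # small gaps suggest many schools separated by trivial differences

# Rough multimodality check: count local peaks in a kernel density estimate.
grid = np.linspace(scores.min(), scores.max(), 200)
density = gaussian_kde(scores)(grid)
peaks = [grid[i] for i in range(1, len(grid) - 1)
         if density[i] > density[i - 1] and density[i] > density[i + 1]]
print("Approximate modes of the score distribution:", np.round(peaks, 1))
```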
 

Potential next steps:

  • Use the degree to which schools are distributed across the performance distribution to help communicate differences in indicator results associated with the state's system of AMD. Some clustering around the center of the distribution should be expected, and the distribution should spread as you move to the extremes. Too much clustering in the middle could result in drastic changes to school accountability results driven by small changes in indicator results. Excessive changes in a school's overall rating due to small changes in indicator results may necessitate transforming indicator results to standardize them for better comparison, or changes to business rules.
  • If the range of school performance shows a few different common results, determine why schools are clustering around certain results. This could be due to gaps in available score points that stem from gaps in source data, transformations, or lack of variability in indicators. Revisions to indicator selection or changes in data transformation should be balanced with policy objectives or external requirements that dictate the use of indicators in the state's system of AMD.
 
Reflection Prompts | Notes

Key questions:

What are the ranges of performance within each school rating for overall scores and for each indicator, and are these ranges reasonable?

 

Why is this important?

The variability for schools receiving a particular rating will likely be greater for indicator performance than for overall performance. Understanding the range and characteristics of performance among schools receiving each rating can help us better understand differences in school performance.

 

Key evidence checks:

  • Examine measures of central tendency for the overall school ratings or, if schools receive scores, the scores (e.g., range, mean, median, mode, shape, standard deviation) and determine how these differ across ratings or scores.
  • Determine if overlaps in data exist near the edge of performance cut scores (e.g., the standard deviation of overall school scores within a rating approaches the range of scores for that rating).
  • Examine measures of central tendency by school rating for each indicator (e.g., range, mean, median, shape, standard deviation) to determine the level of similarity in indicators.
  • Determine if overlaps in data exist near the edge of performance cut scores by indicator (e.g., standard deviation of an indicator by school rating or indicator category approaches the range of a category).
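
A simple way to operationalize these checks is to summarize scores by rating category, as in the hypothetical sketch below; the "rating", "overall_score", and "growth" columns are placeholders for a state's own fields.

```python
# Minimal sketch: central tendency and spread of overall scores and one indicator,
# broken out by rating category. Column names are hypothetical placeholders.
import pandas as pd

schools = pd.read_csv("school_results.csv")

stats_by_rating = schools.groupby("rating")["overall_score"].agg(
    ["count", "mean", "median", "std", "min", "max"])
stats_by_rating["range"] = stats_by_rating["max"] - stats_by_rating["min"]
print(stats_by_rating)  # does the std within a rating approach that rating's range?

# Repeat for a single indicator to see how much indicator performance overlaps
# across adjacent ratings.
print(schools.groupby("rating")["growth"].agg(["mean", "median", "std", "min", "max"]))
```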
 

Potential next steps:

  • If empirical characteristics of school ratings cannot be differentiated across ratings, consider revisiting the performance standard-setting process. However, if the key evidence checks are in line with key policy drivers or objectives, ensure that the interpretation of performance expectations and its alignment to results are accessible and defensible through the development of clear communications materials and reporting.
 
Claim 2 Reflection Questions | Claim 2 Response
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.
My state has sufficiently explored the confidence claims above to understand how the state's system of AMD is differentiating schools. | Yes / No
We have collected enough evidence to sufficiently address key questions and can confirm that the state's system of AMD is differentiating schools appropriately. | Yes / No
Claim 3: Results from the state's system of AMD align with objectives and policies around subgroups and school size/setting demographics as expected.
Consideration 3.1: School-level ratings align with objectives for subgroups.
Consideration 3.2: The results from the state's system of AMD are not overly influenced by school size, and the impact of location is understood.

State systems of AMD often prioritize equity or equitable performance as an objective. It is important to understand the degree to which the state's accountability system and system of AMD results are sensitive to issues of equity. This may be explicitly or implicitly defined in identification decisions or decision rules.

For each consideration, review the key questions presented and use the key evidence checks to help answer those questions.

Reflection Prompts | Notes

Key questions:

How are objectives for subgroups embedded into performance expectations for indicators, overall school ratings, or long-term goals and measurements of interim progress?

 

Why is this important?

Identifying how subgroup performance, subgroup characteristics, and subgroup expectations relate to the state's system of AMD results is important to understanding how well the rationale behind the state's system of AMD can be confirmed.

 

Key evidence checks:

  • Identify correlations between school/indicator results and key demographic characteristics (e.g., economically disadvantaged, children with disabilities, English learners, children from major racial or ethnic groups, including those that are from historically underperforming student group(s)) to determine the direction and magnitude of these relationships.
  • Examine the relationship between school/indicator results and average number of student groups represented to determine whether there are any systematic issues with the types of schools receiving different ratings.
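
The correlational check above could be sketched as follows; the demographic column names (e.g., pct_econ_disadv) are hypothetical placeholders and would be replaced with the state's own data elements.

```python
# Minimal sketch: direction and magnitude of the relationship between school results
# and school demographic characteristics. Column names are hypothetical placeholders.
import pandas as pd

schools = pd.read_csv("school_results.csv")

demographics = ["pct_econ_disadv", "pct_swd", "pct_el", "n_subgroups_reported"]
results = ["overall_score", "achievement", "growth"]

# Correlations between results and demographics; strong values in either direction
# deserve a closer look at indicator selection, transformations, or weights.
print(schools[results + demographics].corr().loc[results, demographics].round(2))
```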
 

Potential next steps:

  • Although relationships between school/indicator results and demographic characteristics are expected, they should not be so strong that results appear to be driven primarily by demographic characteristics. Conversely, they should not be so weak that results appear unrelated. If correlations are too high or too low, revisit the indicators that are included in the system, how those indicators are transformed, or the weighting of indicators.
  • When considering the relationship between school/indicator results and the average number of student groups represented, cross-reference these data by location to determine whether adjustments should be made to indicators. Any revisions would likely prioritize considerations of equity in the system to ensure improvement can be detected for schools regardless of the number of subgroups identified.
  • Substantial differences in schools' scores when the progress in achieving English-language proficiency (ELP) indicator is omitted may highlight that progress in achieving the ELP indicator is inappropriately identifying schools when this student group is present. Consider revisions to how the overall indicator is included in the state's system of AMD (e.g., relative weight, omission rules), how points are earned, or how performance expectations are defined, although keep in mind that the progress in achieving ELP indicator must receive substantial weight individually. This indicator is examined in greater detail in Modules 3A-3E: Indicators.
 
Reflection Prompts | Notes

Key questions:

In addition to subgroup characteristics, to what extent are results from the state's system of AMD influenced by school size and location (e.g., rural or isolated schools)?

 

Why is this important?

Although location may be a function of other school characteristics, it is important to identify any cases where systematic trends emerge and whether the influence of school size or location is expected or by design.

 

Key evidence checks:

  • Examine the relationship between school size and average school/indicator result.
  • Examine the relationship between school/indicator results by setting (e.g., district, region, urban, suburban, rural, etc.).
  • Conduct simulations based on varying n-sizes to compare results.
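
The n-size simulation mentioned above might start with something like the sketch below, which assumes a hypothetical long-format file of subgroup results and counts how many schools would have each subgroup included at different minimum n-sizes.

```python
# Minimal sketch: vary the state-defined minimum n-size and see how many schools
# would have each subgroup included in accountability calculations. The long-format
# file and its column names are hypothetical placeholders.
import pandas as pd

subgroups = pd.read_csv("subgroup_results.csv")  # school_id, subgroup, n_students, result

for min_n in (10, 20, 30):
    included = subgroups[subgroups["n_students"] >= min_n]
    counts = included.groupby("subgroup")["school_id"].nunique()
    print(f"--- minimum n-size = {min_n} ---")
    print(counts)  # schools accountable for each subgroup at this threshold
```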
 

Potential next steps:

  • The relationship between school size and average school/indicator result can be cross-referenced to other demographic characteristics to determine the degree to which size is related to other research-based indicators of school performance. If schools of extreme sizes are over- or underrepresented in certain school ratings, consider exploring alternative routes for identification, additional criteria, or changes to the state-defined minimum number of students.
  • The relationship between school/indicator scores and setting (e.g., district, region, urban, suburban, rural) may be a function of other variables that predict indicator results or may simply be related to setting. Examine this to confirm that there are no systematic issues in the types of data that are used in the state's system of AMD.
  • Simulations based on varying state-defined minimum numbers of students—particularly for the progress in achieving the ELP indicator—might impact which schools are identified. Consider how the characteristics of school and indicator results vary by n-size. Different n-size thresholds may support different policy objectives associated with the state's system of AMD. If policy objectives and support capacity do not align with school identification results, revisions may be necessary to subgroup business rules.
 
Claim 3 Reflection Questions | Claim 3 Response
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.
My state has sufficiently explored the confidence claims above to understand how the state's system of AMD is detecting subgroup or school-size characteristics as intended. | Yes / No
We have collected enough evidence to sufficiently address the key questions and can confirm the state's system of AMD is detecting subgroup or school-size characteristics as intended. | Yes / No

[If you would like to further explore the state's system of AMD through reflection on identified schools, please click here to continue on to Module 4: Comprehensive Support and Improvement Schools]

[If you would like to further explore the state's system of AMD through reflection on specific indicators, please click here to continue on to Modules 3A-3E: Indicators]

[If you are confident in your results and do not wish to engage in further reflection on the state's system of AMD, click here to continue on to Module 6: Reporting]

[Click here to go back to the tool home page]

Table 7: Confidence in the Operations and Results of the State's System of AMD for Non-Summative Rating Systems

Claim 1: School results and groupings created via the state's system of AMD reflect data as intended and expected.
Consideration 1.1: School results from the state's system of AMD reflect expectations based on design and policy objectives.
Consideration 1.2: School results from the state's system of AMD reflect expectations based on simulations and historical data.
Consideration 1.3: School profiles align with outcome data exhibited by schools.
Consideration 1.4: School-rating profiles align with outcome data (i.e., indicator performance) exhibited by schools.

When compared with prior state accountability systems, the current iteration of the state's accountability system may have similar priorities or be drastically different. Understanding how schools are identified is a key piece of evidence for understanding how the state's system of AMD is functioning when operational.

For each consideration, review the key questions presented and use the key evidence checks to help answer those questions.

Reflection Prompts | Notes

Key questions:

Do decision rules for the state's system of AMD result in a reasonable distribution of results?

 

Why is this important?

Non-summative systems do not prioritize overall school scores but still provide a wealth of data to the public and educators through the reporting of indicators.

 

Key evidence checks:

  • Determine whether the range across school profiles is reasonable in the state's system of AMD.
  • Determine whether the mean, median, and standard deviation are reasonable based on the number of schools included in the state's accountability system.
  • Examine how much each indicator contributes to the range and variation across school profiles.
  • Examine measures of central tendency for each indicator in the system where decision rules are used to identify schools.
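
As an illustrative sketch of these checks, the code below compares indicator scales and applies a simple z-score transformation; the file and indicator column names are hypothetical placeholders, and a state may prefer other transformations.

```python
# Minimal sketch: check whether indicators used in decision rules are on comparable
# scales, and standardize them if not. Names are hypothetical placeholders.
import pandas as pd

schools = pd.read_csv("school_indicator_results.csv")
indicators = ["achievement", "growth", "elp_progress", "grad_rate", "chronic_absence"]

# If ranges and standard deviations differ widely, indicators may not contribute
# to decision rules equally.
print(schools[indicators].agg(["mean", "median", "std", "min", "max"]).T)

# One common transformation: put all indicators on a common (z-score) scale.
standardized = (schools[indicators] - schools[indicators].mean()) / schools[indicators].std()
print(standardized.describe().loc[["mean", "std"]])  # means ~0, stds ~1 after transformation
```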
 

Potential next steps:

  • Based on measures of central tendency, determine what indicators can and should be compared without issue (e.g., directly compared indicators have similar ranges and standard deviations). If the characteristics of indicators are too dissimilar, they may not contribute to decision rules equally and may need to be transformed to be more comparable.
  • Verify that transformations (e.g., standardization, composites, indexes) do not negatively impact interpretations for indicators. Transformations can change both the interpretation and impact that indicators can have on decision rules. It may be necessary to adjust transformations to allow for easier interpretation or to influence decision rules more intentionally.
  • Consider whether common school profiles (based on results of decision rules) can help influence school-improvement decisions.
 
Reflection Prompts | Notes

Key questions:

Have you conducted simulations using prior data to support comparisons with operational data once schools are identified? If you have not run simulations, do you have sufficient historical accountability results to support comparisons with operational data?

 

Why is this important?

With non-summative systems, the examination of performance over time is dependent on the indicator performance over time, which requires a baseline.

 

Key evidence checks:

  • Examine operational indicator data and school performance profiles by indicator. Compare these school profiles and indicator data with simulated data, and determine whether results are expected or reasonable.
 

Potential next steps:

  • Compare either historical or simulated data with operational data to help understand whether there is unexpected or excessive variation in school-performance profiles based on decision rules.
  • While some variation is expected, schools should not necessarily show large changes in performance over time unless some reasonable explanation exists (e.g., strong intervention, changes to school leadership). If data show volatility in school performance, determine whether there are any major changes or issues with data that could cause this volatility or if you need to more deeply examine how indicators are interacting.
  • Examine trends over time to determine whether current and future operational results are reflecting expected ranges of improvement. Unexpected results may result in difficult-to-achieve performance expectations and may require additional explanation. It may be important to prepare communications materials to support this.
  • Empirical issues that are difficult to explain may require an examination of how the indicators are combined, which is addressed in Modules 3A-3E: Indicators.
 
Reflection Prompts | Notes

Key questions:

How do the indicator results reflect expected and meaningful differences between schools identified for improvement and other schools across the state?

 

Why is this important?

Non-summative systems must still categorize schools identified in need of improvement (i.e., CSI, TSI, ATSI). These concepts are explored more deeply in Modules 3 (CSI) and 4 (TSI/ATSI).

 

Key evidence checks:

  • Determine the relationships among the indicators for those schools identified as CSI, TSI, and ATSI.
  • Determine the degree to which the relationships among school performance profiles differ based on the school's identification (e.g., ATSI and TSI are more variable than CSI).
  • Determine the indicator(s) that have the most influence on identification based on the decision rules used in the state's system of AMD.
 

Potential next steps:

  • Examining the relationships among indicators for identified schools can provide insight into whether there are similarities that are overlooked due to categorical decisions. However, too much similarity may reflect a disconnect between policy objectives and decision rules. If too much similarity exists, consider revising decision rules to better separate the profiles of identified schools.
  • Some decision-rule-based systems leverage a series of grouping steps to identify schools, with the largest influence indicators being used first (e.g., growth and achievement). If one of the indicators in the first- or second-round decisions has too much influence on identification, the decision rules may not reflect the rationale behind the state's system of AMD. Consider revising the order, weighting, or transformation of indicators to better support policy objectives.
 
Reflection Prompts | Notes

Key questions:

How well do data from the indicators support the grouping of schools?

 

Why is this important?

In the case of non-summative systems, it may be helpful to study the differences between identified and nonidentified schools. This examination can help provide insights into the discrepancies between policy and empirical observations.

 

Key evidence checks:

  • Conduct categorical analyses of schools using indicator data (e.g., k-means clustering, discriminant analyses) to determine whether school groupings reflect categories of schools required to be identified (i.e., CSI, TSI, ATSI). This may require "dummy coding" identification categories to differentiate among the nonidentified, CSI, TSI, and ATSI schools.
  • Develop school-performance profiles based on indicator data to support comparisons across identified schools.
  • Determine whether empirical data show any meaningful differences between school groups in indicator data, the magnitude of differences or similarities, and whether these similarities or differences were expected.
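
The dummy-coded categorical analysis described above could be approached with a linear discriminant analysis, as in the hypothetical sketch below; the "status" column (nonidentified/CSI/TSI/ATSI) and indicator names are placeholders for a state's own coding.

```python
# Minimal sketch: do indicator data separate the identification categories
# (nonidentified, CSI, TSI, ATSI)? Names and coding are hypothetical placeholders.
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

schools = pd.read_csv("school_indicator_results.csv")  # includes a "status" column
indicators = ["achievement", "growth", "elp_progress", "grad_rate", "chronic_absence"]

complete = schools.dropna(subset=indicators + ["status"])
lda = LinearDiscriminantAnalysis().fit(complete[indicators], complete["status"])

# If the categories are empirically distinct, the model should recover them
# from indicator data well above chance.
predicted = lda.predict(complete[indicators])
print(pd.crosstab(complete["status"], predicted,
                  rownames=["actual"], colnames=["predicted"]))
```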
 

Potential next steps:

  • Examine the degree to which school profiles among school identification categories (i.e., CSI, TSI, ATSI) are similar or dissimilar. What is the range of performance in school profiles by school identification category? If schools are too similar or dissimilarities do not make sense, it will be important to understand how decision rules are applied (e.g., sequence, importance of certain indicators, weights of decisions). Consider revising indicator transformations or decision rules to increase the differentiation in outcome performance.
  • Examine the degree to which school profiles for identified and nonidentified schools are similar or dissimilar. What is the range of performance in school profiles by school type? Unexpected results, too much similarity, or insufficient differentiation may require revising decision rules or performance expectations for cut-points. Consider revising indicator transformations, if applicable, or performance expectations for decision rules to increase the differentiation in outcome performance (this is addressed in greater detail under Claim 2 below).
 
Claim 1 Reflection Questions | Claim 1 Response
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.
We have sufficiently explored the confidence claims above to understand how the state's system of AMD is grouping or ranking schools as expected. | Yes / No
We have collected enough evidence to sufficiently address key questions and can confirm the state's system of AMD reflects expectations based on design and policy objectives. | Yes / No

 

Claim 2: Results from the state's system of AMD reflect meaningful differentiation among schools.
Consideration 2.1: School profiles have face validity.
Consideration 2.2: School/indicator results are distributed at intervals that reflect meaningful differences.
Consideration 2.3: The overall or indicator results of schools at the lower and higher thresholds of each rating are reasonable and defensible.

A key purpose of the state's system of AMD is determining how to assign accountability ratings to schools. This is usually a function of how performance standards are set. However, it is important to determine how the standard-setting process impacts which schools are identified for CSI, TSI, and ATSI. We should try to understand both the characteristics of schools within categories and the way in which characteristics differ across categories.

For each consideration, review the key questions presented and use the key evidence checks to help answer those questions.

Reflection Prompts | Notes

Key questions:

To what extent are the distribution and groupings of schools reasonable when comparing identified and nonidentified schools?

 

Why is this important?

Many of the distinctions between identified and nonidentified schools will be based on the policy objectives; the order of decisions for CSI, TSI and ATSI identification; and the performance expectations for CSI, TSI and ATSI schools (see Module 4: CSI and Module 5: TSI and ATSI for more detail).

 

Key evidence checks:

  • Determine whether the number of CSI, TSI, and ATSI schools reflects expected identification rates based on statutory requirements, policy objectives, and the rationale behind the state's system of AMD.
  • Determine whether the types of schools identified for TSI and ATSI reflect the policy objectives of the state's system of AMD (e.g., identify schools with large gaps).
 

Potential next steps:

  • Understanding the differences in performance between CSI and non-CSI schools is important to the face validity of identification. Address similarities for schools at the threshold of CSI identification in communications and support decisions.
  • Comparing performance profiles between ATSI/TSI and non-ATSI/TSI schools can help demonstrate whether there are reasonable differences between groups. Because these schools are identified using subgroup performance, overall school profiles may be similar. Consider identifying common and outlier comparisons between identified and nonidentified schools to help the public and educators interpret identification results and how results align with the state's system of AMD objectives and rationale.
 
Reflection Prompts | Notes

Key questions:

How do the decision rules in your system affect CSI, TSI, and ATSI identification processes?

 

Why is this important?

School groupings in non-summative systems are primarily focused on how schools are identified for CSI, TSI, or ATSI. Understanding the distribution or groupings of schools (Claim 1 above) is key to understanding how data influence identification.

 

Key evidence checks:

The ideas around identification of the required categories of schools using the state's system of AMD are explored more deeply in the following modules:

  • Module 3: CSI Identification
  • Module 4: TSI and ATSI Identification
 

Potential next steps:

  • Consider how decision rules affect the identification of schools and whether school-performance profiles differ meaningfully between CSI and non-CSI schools. Too little differentiation may require revisions to identification criteria, scaffolding of support for non-CSI schools on the threshold of identification, early warning recommendations, or enhanced communication efforts describing identification rationale.
  • Consider how decision rules affect the identification of schools and whether school-performance profiles differ meaningfully between ATSI/TSI and non-ATSI/TSI schools. ATSI or TSI schools may share more similarities in school-performance profiles with nonidentified schools than CSI schools do with non-CSI schools. Identify meaningful differences and any cases of unexpected similarity between identified ATSI/TSI schools and nonidentified schools. This may be a function of the number of schools identified or the order in which schools are identified. If significant similarity exists, determine whether revisions can be made to the order of TSI and ATSI identification, identification thresholds, or decision rules that determine identification.
 
Reflection Prompts | Notes

Key questions:

How much variability is present in school profiles based on indicator data?

 

Why is this important?

Because there will be a large range of performance across indicators, it is important to understand how to interpret the differences among those schools at the edge of performance thresholds for identification.

 

Key evidence checks:

  • Examine measures of central tendency (e.g., range, mean, median, shape, standard deviation) for schools based on indicator data, and determine the similarity of these data across the identified school types.
  • Identify any particular trends in data (beyond subgroup differences) that could be used to describe the characteristics of schools in each category (i.e., CSI, TSI, ATSI).
 

Potential next steps:

  • Comparisons of measures of central tendency for identified and nonidentified schools can help qualify differences in school-performance profiles. If there are strong differences between identified and nonidentified schools (in both schoolwide performance profiles and subgroup-specific performance profiles), this can be used to communicate the defensibility of identification design and decision rules.
  • If it is difficult to identify systematic differences between identified and nonidentified schools, consider whether this supports the intended rationale or policy objectives for the state's system of AMD. There may be intentional reasons for widespread or minimal identification of schools. It will be important to help the public and educators interpret performance on the system of AMD and why certain schools are identified.
 
Claim 2 Reflection Questions | Claim 2 Response
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.
My state has sufficiently explored the confidence claims above to understand how our AMD system is differentiating schools. | Yes / No
We have collected enough evidence to sufficiently address key questions, and can confirm the state's system of AMD is differentiating schools appropriately. | Yes / No

 

Claim 3: Results from the state's system of AMD align with objectives and policies around subgroups and school size/setting demographics as expected.
Consideration 3.1: School-level profile results align with objectives for subgroups.
Consideration 3.2: School-level profile results are not overly influenced by school size, and the impact of location is understood.

State systems of AMD often prioritize equity or equitable performance as a policy objective. It is important to understand the degree to which the results from the state's system of AMD are sensitive to issues of equity. This may be explicitly or implicitly defined in identification decisions or decision rules.

For each consideration, review the key questions presented and use the key evidence checks to help answer those questions.

Reflection Prompts | Notes

Key questions:

To what extent are policy objectives for subgroup performance, identification, and improvement addressed by the state's system of AMD?

 

Why is this important?

The identification of statutorily required school types, particularly TSI and ATSI (which are examined more deeply in Module 4), is especially susceptible to changes in subgroup characteristics.

 

Key evidence checks:

  • Identify the subgroup characteristics of schools in different categories (e.g., TSI vs. ATSI vs. nonidentified) to identify any trends in data.
  • Examine the relationship between the number of student groups represented and identified schools.
  • Examine the rate of identification for schools based on the progress in achieving English-language proficiency indicator.
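
One way to examine the ELP-related identification rate is to compare two runs of the state's own decision rules, one including and one omitting the progress in achieving ELP indicator. The sketch below assumes those two runs have already produced hypothetical lists of identified schools; the file and column names are placeholders only.

```python
# Minimal sketch: compare two hypothetical identification runs, one with and one
# without the progress in achieving ELP indicator, to see how much the identified
# set changes. File names and columns are hypothetical placeholders.
import pandas as pd

with_elp = set(pd.read_csv("identified_with_elp.csv")["school_id"])
without_elp = set(pd.read_csv("identified_without_elp.csv")["school_id"])

print("Identified in both runs:     ", len(with_elp & without_elp))
print("Identified only with ELP:    ", len(with_elp - without_elp))
print("Identified only without ELP: ", len(without_elp - with_elp))
# A large symmetric difference suggests the ELP indicator is driving identification
# for schools where this student group is present.
```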
 

Potential next steps:

  • Based on the subgroup characteristics of schools in different categories (e.g., TSI vs. ATSI vs. nonidentified), relationships should be expected given the requirements set forth in ESEA for accountability based on subgroup performance. However, it is important to understand whether the characteristics of data can be used to understand school performance.
  • There may be a relationship between the number of student groups represented in a school and whether a school is identified for TSI or ATSI. Determine whether the number of subgroups overly drives TSI or ATSI identification, and whether identification is a function of decision rules (e.g., n-size, referent/comparison group identification) or reflective of actual underperformance. If policy objectives are not met, there may be a need to revise decision rules.
  • Substantial differences in identification rates when the progress in achieving English-language proficiency indicator is omitted may highlight that this indicator is influencing school identification when this student group is present. This may require revisions to how the overall indicator is included in the state's system of AMD (e.g., order of decision rule), how points are earned, or how performance expectations are defined within the indicator. This indicator is examined in greater detail in Modules 3A-3E: Indicators.
 
Reflection Prompts | Notes

Key questions:

In addition to subgroup characteristics, to what extent are school identifications (i.e., CSI, TSI, and ATSI) influenced by school size and location?

 

Why is this important?

Although some school characteristics may be related to location, it is important to identify any cases where systematic trends emerge and whether the influence of school size or location is expected or by design.

 

Key evidence checks:

  • Examine the relationship between school size and identification of schools.
  • Conduct simulations based on varying n-sizes.
  • Examine the relationship between indicator results and setting (e.g., district, region, urban, suburban, rural).
 

Potential next steps:

  • The relationship between school size and identification of schools as CSI, TSI, or ATSI can be cross-referenced to other demographic characteristics to determine the degree to which size is related to other indicators in the state's system of AMD. If schools of extreme sizes are over- or underrepresented in certain school identification categories, explore additional or different school quality or student success indicators or changes to n-size thresholds.
  • The relationship between CSI, TSI, and ATSI designations by setting (e.g., district, region, urban, suburban, rural) may be a function of other variables that predict indicator results or may simply be related to setting. Examinations of this can confirm that there are no systematic issues in the types of data used in the state's system of AMD.
  • Simulations based on varying n-sizes should impact the rates of identification due to the detection of different subgroups. Although expected, consider how the characteristics of school and indicator scores vary by n-size. Different n-size thresholds may support different policy objectives associated with the state's system of AMD. If policy objectives and support capacity do not align with identification of schools, revisions may be necessary to subgroup business rules.
 
Claim 3 Reflection Questions | Claim 3 Response
Reflecting on your notes above, consider your confidence in responding to the reflection questions below. If you answer "no" or are not confident in your response, consider using Modules 3A-3E: Indicators to explore these topics in more depth.
My state has sufficiently explored the confidence claims above to understand how the state's system of AMD is detecting subgroup, school-size, or location characteristics as intended. | Yes / No
We have collected enough evidence to sufficiently address the key questions and can confirm the state's system of AMD is detecting subgroup or school-size characteristics as intended. | Yes / No

[Click here to continue on to Module 2B: Indicator Interaction in the State's System of AMD]

[If you are confident in your results and do not wish to engage in further reflection on the state's system of AMD, click here to continue on to Module 6: Reporting]


1 See Accountability Identification is only the Beginning: Monitoring and Evaluating Accountability Results and Implementation from the Council of Chief State School Officers for more information. Please note: The inclusion of links to resources and examples does not reflect their importance, nor is it intended to represent or be an endorsement by the U.S. Department of Education (ED) of any views expressed or materials provided. ED does not control or guarantee the accuracy, relevance, timeliness, or completeness of any outside information included in this document.

Office of Elementary and Secondary Education (OESE)
Page Last Reviewed: January 14, 2025