Archived Information

Reading Excellence Act State Competitive Grant Program: Non-Regulatory Guidance for State Applicants – March 9, 1999


Appendix A. Continuum of Evidence of Effectiveness

The following table provides a framework for assessing whether a program is effective. It can be used to evaluate alternative research-based reading programs. By asking the sample questions about each program, reviewers can categorize the programs from "most rigorous" to "marginal." The most effective programs would be those falling into the "rigorous" category on three or four criteria.

Following the table are two examples applying the framework's criteria to reading instruction and to teacher training in reading. We've shown a "good" example with strong evidence of effectiveness and a "poor" one for each topic.

Table A1. Continuum of Effectiveness

Evidence

Criteria: Most Rigorous

Criteria: Somewhat Rigorous

Criteria: Marginal

Theory/research foundation

Does the program explain the theory behind its design, including references to the scientific literature that elucidate why the program improves students' ability to read?

Does the program state the theory behind its design, explaining how the program's components reinforce one another to improve students' ability to read?

Does the program explain the theory behind its design?

Evaluation-based evidence of effectiveness

Have student reading gains been shown using experimental and control groups created through large-scale random assignment or carefully matched comparison groups?

Have student reading gains been shown using between or within-school comparisons?

Have student reading gains been shown for a single school?

Has the program produced educationally significant pre- and post-intervention student reading gains, as reliably measured using appropriate assessments?

Has the program produced student reading gains relative to district means or other comparison groups using appropriate assessment instruments?

Has the program produced improvements on other indicators of student reading achievement, e.g., amount of time students spend reading outside of school or student engagement?

Have the student reading achievement gains been sustained for three or more years?

Have the student reading achievement gains been sustained for one or two years?

Have other indicators of improved student reading performance been sustained for one or two years?

Have the student reading gains been confirmed through independent, third-party evaluation?

Has the program been evaluated by a state, district, or school evaluation team?

Has the program been evaluated by its developers?

Implementation

Has the program been fully implemented in multiple sites for more than 3 years?

Has the program been fully implemented in the original site(s) for more than three years?

Has the program been fully implemented in the original pilot site(s) for a minimum of one school year?

Is documentation available that clearly specifies the program's implementation requirements and procedures, including staff development, curriculum, instructional methods, materials, assessments, and costs?

Is documentation available that attempts to describe the implementation requirements of the program including staff development, curriculum, instruction methods, materials, and assessments?

Is documentation available that provides a general description of the program's requirements?

Are the costs of full implementation clearly specified, including whether or not the costs of materials, staff development, additional personnel, etc. are included in the program's purchase price?

Have the costs of full implementation been estimated, including whether or not the costs of materials, staff development, additional personnel, etc. are included in the program's purchase price?

Is documentation available that provides general information about the program's costs?

Has the program been implemented in schools with characteristics similar to the target school: same grade levels, similar size, similar poverty levels, similar student demographics such as racial, ethnic, and language minority composition?

Has the program been successfully implemented in at least one school with characteristics similar to the target school?

Is information on grade level, size, student demographics, poverty level, and racial, ethnic and language minority concentration available for the schools where the program has been implemented?

Replicability

Has the program been replicated successfully in a wide range of schools and districts, e.g., urban, rural, suburban?

Has the program been replicated in a number of schools or districts representing diverse settings?

Is full replication of the program being initiated in several schools?

 

Have the replication sites been evaluated, demonstrating significant student reading gains comparable to those achieved in the pilot site(s)?

Have some replication sites been evaluated, demonstrating positive gains in student reading ability?

Are promising initial results available from the replication sites?

 

Example 1: Applying the Criteria to Reading Instruction

Example 1A. Application of effectiveness criteria for a reading instruction example (Good evidence of effectiveness)

The "1A" Reading Program: Teachers at the Libby School observed that the school's early elementary students continued to have difficulty learning to read, and that by the end of third grade the majority were reading well below grade level. After reviewing research on effective early reading programs, they selected a comprehensive instructional program, A to Z Reading, that combines early direct instruction in phonemic awareness and systematic phonics with instruction designed to enhance students' reading comprehension.

The program includes comprehensive teacher guides, as well as the option of on-site professional development activities. All of Libby School's classroom teachers completed a one-week training program during the summer before they began using the instructional program in their classrooms. Most teachers reported spending another week reviewing materials on their own, as the publisher recommended. The publisher strongly recommends that its materials be supplemented with a rich variety of reading materials and that students have ready access to books for out-of-school reading. Because funds were limited, the school staff worked with local businesses to develop a "Books for Kids" campaign, and were able to assemble a sizeable school library. Books are placed within classrooms: each classroom now has a colorful "Book Nook" where students can sign out books. Using a variety of strategies developed at the school, teachers encourage students to read a wide variety of materials. The publisher provides information on the effectiveness of its instructional materials, with data available from about 200 districts around the nation.

The Libby School modification, the A to Z Reading program coupled with the "Books for Kids" program, is now in its sixth year, and as of last fall, 20 schools in other districts in the state (including one in a poor neighborhood in the state's largest city) and 4 in other states have implemented the combined program. These schools have implemented the program for between one and four years.

The state education agency, upon noticing that the program appeared to be effective, assisted the school district in conducting an evaluation of the program. The state helped pay for an external evaluation, which was conducted at 8 schools: four in rural areas, two in poor neighborhoods in a large city, and two in suburban areas. The evaluation also examined the progress of similar students in 10 schools without the program. (Descriptions of the reading programs in the 10 "comparison" schools are provided.) The evaluation included scores on the state reading assessment, structured teacher observations, and measures of the number of books students read over the year. Achievement information is available over 5 years at the Libby School, and for 2 to 3 years in the other schools. The results showed that students in the "Libby School" program outscored other students on all measures, and that the results were educationally significant.

Criteria

Evidence

Why?

A. Theory

Rigorous

  • The program has a strong theoretical base and is based on findings from research.

B. Evaluation data

Rigorous

  • The program has been evaluated using both test scores and structured observations of student behaviors. Evaluation data were available for multiple years, with students followed for between 2 and 5 years. Results were educationally significant.

C. Implementation

Somewhat rigorous

  • The program has been running for over 5 years in the original school and up to 4 years in other schools. Information is provided on teacher training activities and on-going "additional" teacher activities, but no cost estimates are given. Some information is provided on the demographic characteristics of the students in the study schools.

D. Replicability

Rigorous

  • The program has been successfully implemented in a variety of schools, with gains similar to those in the original site.

 

Example 1B. Application of effectiveness criteria for a reading instruction example (Some evidence of effectiveness)

The "1B" Reading Program: A school is considering a reading program designed to improve students' reading skills. Staff at P.S. 102, a large (1,000 student) school in an urban area, developed the program. The teachers and principal developed the program after observing that the children at the school spent little time outside of school reading, and that out-of-school reading would help support classroom instruction. They further observed that most of these students, the majority of whom were eligible for free or reduced-price lunch, had limited access to books. Most families are unable to purchase many reading materials for the students, and because of transportation problems most do not have access to the local public library.

The staff spent time learning what other schools had done to improve student reading skills, and reviewed research on effective instructional strategies. They began by selecting a well-regarded beginning reading series that contains curricular materials, daily lessons, and teachers' guides, and the teachers attended a summer institute sponsored by the developer. In addition, the school staff examined studies on students' motivation to read, and decided to make a major effort to build classroom libraries.

Because funds were limited, the school staff worked with local businesses and churches to develop a "Books for Kids" campaign, and were able to assemble a sizeable school library. Books are placed within classrooms: each classroom now has a colorful "Book Nook" where students can sign out books. Program staff report that they spend approximately 100 hours a year soliciting books, and about 400 hours a year reviewing the books for suitability and maintaining the "Book Nooks." The "Book Nooks" themselves were constructed by parent volunteers with donated materials.

Teachers have noted increased enthusiasm for reading, and (based on counts from the sign-out sheets) estimate that each child reads, on average, two books a week. Teachers report that the students have improved their reading skills and, after three years of the program, most of the students are now working close to grade level in reading. Teachers maintain checklists of student reading skills and track progress over time. In addition, student test scores have continued to rise, as measured by the XYZ Test of Reading Comprehension. Three schools in other districts in the state have implemented similar programs, and report promising results after two years (as measured by structured observation and state reading assessments).

Criteria

Evidence

Why?

A. Theory

Somewhat rigorous

  • The program is based on a review of literature on reading, and program staff explain the reason, based on that review, for the program components.

B. Evaluation data

Marginal

  • The program assesses student reading skills using a combination of structured teacher observations and a reading test, over two years. Data analyses consist mainly of longitudinal measurements, with no information provided on whether the student population changed over time, what proportion of the students were tested, and whether different categories of students (e.g., both boys and girls) benefited from the program.
  • Results are available for only one school, and there has been no independent evaluation of the program standards.
  • No information is provided to show how the program staff ruled out alternative explanations for the changes observed.
  • The project staff conducted all evaluations, and none were reviewed and approved by a panel of independent experts.

C. Implementation

Somewhat rigorous

  • The program has been in place for one year. Limited information is provided on project costs.

D. Replicability

Somewhat rigorous

  • The program has been implemented in three other schools with similar findings.

 

Example 1C. Application of effectiveness criteria for a reading instruction example (Poor evidence of effectiveness)

The "1C" Reading Program: A school is considering a reading program designed to improve students' motivation to read. The program was developed by staff at the Green School, a small (250 student) school in a rural area. The program was developed after teachers and the principal at the school observed that the children at the school spent little time outside of school reading. They further observed that most of these students, the majority of whom were eligible for free or reduced-price lunch, had limited access to books. Most families are unable to purchase many reading materials for the students, and the small local library is unable to meet community needs.

The school staff worked with local businesses to develop a "Books for Kids" campaign, and were able to assemble a sizeable school library. Books are placed within classrooms: each classroom now has a colorful "Book Nook" where students can sign out books. Staff believe that children, if placed in a literature-rich environment, will be able to overcome their reading difficulties.

Teachers have noted increased enthusiasm for reading and (based on counts from the sign-out sheets) estimate that each child reads, on average, two books a week. Teachers report that the students have improved their reading skills and, after one year of the program, are now working close to grade level in reading.

Criteria

Evidence

Why?

A. Theory

Marginal

  • The program does not explain the theory behind the program. The program does not build in current research on effective reading practice, including research on the effectiveness of specific instruction in phonemic awareness, phonics, fluency, and comprehension.

B. Evaluation data

Marginal

  • The only measures of student reading ability are observations by the project staff who developed the program. Anecdotal descriptions of student improvement are not adequate data.
  • The program itself would not meet the definition of scientifically based research provided in the REA: there are no data analyses that are adequate to test the stated hypotheses and justify the general conclusions drawn; the conclusions do not include multiple measurements and evaluators; and the results have not been reviewed and approved by a panel of independent experts.

C. Implementation

Marginal

  • The program has been in place for only one year and only in one school. No information is provided on project costs, although they appear to be minimal.

D. Replicability

Marginal

  • The program has been implemented in only one school.

Example 2: Applying the Criteria to Professional Development for Reading Instruction

Example 2A. Application of effectiveness criteria for a teacher training example (Good evidence of effectiveness)

The "2A" Teacher Training Program provides professional development for first and second grade teachers in reading. The goal of the course is to improve teachers' knowledge and skills to teach reading. Initially, teachers take a 3-day course to instruct them in key concepts and methods. One-hour follow-up sessions are held at one-month intervals along with classroom visits by a master teacher experienced in the training's content and practice. All teachers in the schools meet by grade level several times a month to discuss how they are implementing the training program.

The training content is centered on findings from scientifically based reading research, including an understanding of how children become literate, the relationship between early literacy behavior and later reading, the alphabetic principle, reading comprehension strategies, assessing children's progress, and other content from the reading research base.

The course has excellent information for a training program. The original training program was developed by three local university professors, who tested it on teachers in three large schools. The teachers were randomly assigned to this training, to a much shorter version with only one follow-up session, or to no training. The evaluation found that this approach produced statistically significant results. Since then, the professors have evaluated the training as they implemented it in new schools by comparing the selected schools with nearby, comparable schools that offered only standard district training. In five out of six cases, the training showed much stronger effects.

The training program has good information on the kinds of schools and grade levels in which it has been implemented, all of which are similar to the school considering it. Information is available on costs, follow-up requirements, and materials.

The initial evaluation consisted of (1) pre-post questionnaires assessing teachers' knowledge before training started, immediately after the training, and six months later; (2) observation by graduate students of a sample of the teachers as they taught reading during the year; (3) questionnaires filled out by teachers several times during the year describing practice, peer interactions, barriers encountered to implementation, and need for additional support or assistance; (4) measures of student reading achievement; and (5) a "customer satisfaction" survey. Later evaluations included all but the graduate student observations. A new professor is planning to videotape selected teachers over the next year.

Criteria

Evidence

Why?

A. Theory

Most rigorous

  • Content of the course is fully consistent with current reading research theory.
  • The training is consistent with current theory on effective professional development for teachers. It was sustained (more than 36 hours over the year), involved direct teaching of content, practical application, opportunities for peer collaboration, and supervised practice.

B. Evaluation data

Most rigorous

  • The original training was tested using an experimental design (random assignment of teachers to sustained training, short training, and no training). This checks whether the teachers might have been able to achieve similar results with less costly or even no training. The replication in six more schools, with positive results, adds credibility to the claims for the training's effectiveness, even though the replication used a weaker design.
  • Use of multiple measures to check effectiveness, including collection of baseline data on knowledge, data on immediate perceptions of the training, observations to see if classroom practice changed along with teacher-reported data as well, and information on implementation by teachers and the effects of the follow-up and peer interactions. The most important effect is student learning gains, although it is also important that teachers like the training very much.

C. Implementation

Most rigorous

  • Good information is available on how the program was implemented in the prior schools.

D. Replicability

Somewhat rigorous

  • Not bad. The training program has now been implemented in 9 schools, with positive, statistically significant effects on students and teacher behavior as well as enthusiastic acceptance by teachers.

 

Example 2B. Application of effectiveness criteria for a teacher training example (Poor evidence of effectiveness)

The "2B" Teacher Training Program provides professional development for first and second grade teachers in reading. The goal of the course is to improve teachers' knowledge and skills to teach reading. It is a one-day course taught by a local university professor who had developed it in the prior year and provided it to teachers in two schools. It claims to provide highlights of the research base for teaching reading, including some phonics. The course reassures teachers that they do not need to cover phoneme awareness or phonics other than to illustrate them as an adjunct to reading rich text. Children are to be encouraged to try and figure out words they don't know from context. Teachers who have children who are having trouble learning to read are told to refer those children to Title I or special education.

The program was implemented in two schools, with all first and second grade teachers participating. The types of schools weren't identified in the report. The evaluation consisted of a two-page questionnaire filled out at the end of the day by participants. Evaluation results showed that 95 percent of the teachers participating felt that the training program provided excellent information that they could use in their classrooms.

Criteria

Evidence

Why?

A. Theory

Marginal

  • Content of the course is not consistent with current reading research theory. For example, current research shows that guessing unfamiliar words from their context (as a primary strategy) is generally not effective for children who have difficulty learning to read.
  • The training is for only one day. Current theory on effective professional development for teachers supports sustained training with direct teaching of content, practical application, opportunities for peer collaboration, and supervised practice. For reading, the training may need to exceed 30 hours, plus follow-up.

B. Evaluation data

Marginal

  • Teachers (and others) participating in training programs traditionally have evaluated their training very highly, regardless of actual effects or utility of the training. While customer satisfaction is an important measure, it should not be the primary measure. Increases in teacher knowledge and skills in teaching reading, changes in classroom practice, and, as the gold standard, improvements in student achievement are the most important measures.
  • Since the training program didn't check on teachers' knowledge at entry to the program, there was no way to judge whether the teachers already knew a lot about the content and to what extent they increased their knowledge.
  • There was no control or comparison group to test whether teachers who didn't take the training knew as much as the trained teachers, changed their classroom instruction too, and achieved student learning gains.

C. Implementation

Marginal

  • The program was offered in only two schools and was small in scale, with few likely effects. No documentation is available on the processes being used.

D. Replicability

Marginal

  • Unknown. Two schools do not constitute replication.
