OPE: Office of Postsecondary Education
Lessons Learned from FIPSE Projects III - June 1996 - Mathematical Association of America

Software for Computer-Generated Math Placement Tests

Purpose

Mathematics departments across the country spend large amounts of time devising tests to ensure the proper placement of students in their courses. They must create multiple versions of a given test so that groups of students taking it at different times will have different problems to solve. Yet these examinations must parallel each other and the problems must be equivalent in difficulty.

As a service to its members, the Mathematical Association of America (MAA) creates examinations. The Association's Committee on Testing devotes a substantial amount of time to assembling sets of formulas or algorithms, from each of which many parallel problems can then be generated. The equivalence of the resulting problems rests on the judgment of the item writers and cannot be statistically established.

Project staff believed that much of this time would be saved if formulas were written in such a way that the problems could be generated by computer. Not only could the tests be assembled more readily, but faculty at individual institutions would, if they desired, be able to create their own tests from the computer's item bank. Furthermore, the parallelism of the items and the equivalence of the tests could be statistically established.
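The idea of generating many parallel problems from one formula can be illustrated with a minimal sketch. It is not the MAA software: the linear-equation template, the parameter ranges, and the names `make_linear_item` and `make_parallel_items` are all hypothetical, chosen only to show how one algorithm can yield several problems of the same form.

```python
import random


def make_linear_item(rng):
    """Build one 'solve ax + b = c' problem with an integer solution.

    The template and parameter ranges are illustrative assumptions,
    not taken from the MAA item bank.
    """
    a = rng.choice([2, 3, 4, 5, 6])   # coefficient of x
    x = rng.randint(-9, 9)            # intended solution
    b = rng.randint(1, 12)            # constant term
    c = a * x + b                     # right-hand side, so x is exact
    return {
        "stem": f"Solve for x: {a}x + {b} = {c}",
        "a": a, "b": b, "c": c,
        "answer": x,
    }


def make_parallel_items(n_versions, seed=0):
    """Generate one item per test version from the same template."""
    rng = random.Random(seed)  # seeded so a test can be reproduced
    return [make_linear_item(rng) for _ in range(n_versions)]


items = make_parallel_items(4, seed=1996)
for item in items:
    print(item["stem"])
```

Each call draws different numbers from the same template, so the four problems share a form but not an answer. Whether such problems are equally difficult for students is an empirical question that only field testing can answer.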

Innovative Features

Through this project the MAA created computer software that produces statistically parallel tests in arithmetic, two levels of algebra, trigonometry, and calculus readiness. (Statistical parallelism means no more than a one percent difference in results among multiple versions of the same test, i.e., a test made up of items generated from the same set of computer algorithms.) The group did not have quite the same success with the equivalence of individual items, though in most cases only one of the items generated lay outside the range of equivalence, and could be eliminated.

With the addition of a feature that produces a balanced key (i.e., not too many correct answers in a row that correspond to the same letter on multiple choice items, and a balance within the examination among letters that correspond to the correct answer), tests designed to the specifications of individual users can be assembled in 5-15 minutes. Examinations with a large number of graphic items take somewhat longer. The multiple choice items that make up these examinations have an appropriate range of distractors and are graphically clear and attractive.
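A balanced key of the kind described can be sketched as follows. The report does not say how the MAA software builds its key; this sketch assumes a simple shuffle-and-retry approach, five answer letters, and a limit of two identical letters in a row. The names `balanced_key` and `longest_run` are hypothetical.

```python
import random
from itertools import groupby


def longest_run(key):
    """Length of the longest run of identical consecutive letters."""
    return max(len(list(group)) for _, group in groupby(key))


def balanced_key(n_items, letters="ABCDE", max_run=2, seed=0, max_tries=1000):
    """Return an answer key with near-equal letter counts and short runs.

    Assumes n_items >= 1. Letters are dealt out evenly first, then
    shuffled until no letter repeats more than max_run times in a row.
    """
    rng = random.Random(seed)
    key = [letters[i % len(letters)] for i in range(n_items)]
    for _ in range(max_tries):
        rng.shuffle(key)
        if longest_run(key) <= max_run:
            return "".join(key)
    raise RuntimeError("no key found; relax max_run or raise max_tries")


key = balanced_key(25, seed=7)
print(key)
```

Dealing the letters out before shuffling guarantees the balance among letters, so the retry loop only has to enforce the limit on consecutive repeats.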

At present, tests must be delivered in printed form. However, project staff hope to make possible computer delivery of the examinations to individual student test takers, with immediate feedback on test results. Staff are also at work on incorporating calculator-based items into tests.

Evaluation

The items were field-tested on student samples ranging from 264 to 584 for each type of test to determine test and item equivalence. For four of the five tests, the four different versions were equivalent within a range of three to four percent on a 100-point scale. Work will continue to reduce the seven percent variance on the fifth test.
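The comparison of versions can be sketched as the spread between the highest and lowest mean score. This is an assumption about how the range was measured, since the report does not give the statistic used. The scores below are invented for illustration and are not field-test data, and `version_spread` is a hypothetical name.

```python
from statistics import mean


def version_spread(scores_by_version):
    """Return (spread, means): the gap between the highest and lowest
    mean score across versions, on the same 100-point scale."""
    means = {v: mean(s) for v, s in scores_by_version.items()}
    spread = max(means.values()) - min(means.values())
    return spread, means


# Invented scores for four versions of one test (illustration only).
sample_scores = {
    "A": [62, 70, 58, 66],
    "B": [64, 69, 61, 67],
    "C": [60, 72, 63, 65],
    "D": [66, 68, 59, 64],
}

spread, means = version_spread(sample_scores)
print(f"spread between version means: {spread:.2f} points")
```

A spread of three to four points on this measure would correspond to the result reported for four of the five tests.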

The spread on individual items was much greater, sometimes as large as 40 percent. These variances were greater for the smaller sample sizes, suggesting that a larger sample size might reduce the range of variance.
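The link between sample size and item-level spread can be illustrated with the standard error of a difference between two proportions. The sketch assumes that each field-test sample was split evenly over the four versions and that half of the students answer a given item correctly. Neither figure comes from the report, and `se_diff_proportions` is a hypothetical name.

```python
import math


def se_diff_proportions(p, n1, n2):
    """Standard error of the difference between two sample proportions
    when both versions share the same true proportion correct, p."""
    return math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))


# Assumption: each sample is divided evenly among four versions.
for total in (264, 584):
    per_version = total // 4
    se = se_diff_proportions(0.5, per_version, per_version)
    print(f"sample={total}, per version={per_version}, SE={se:.3f}")
```

Under these assumptions, chance alone produces a noticeably wider gap between two versions of an item at the smaller sample size. This is consistent with the pattern reported above, though it does not by itself explain spreads as large as 40 percent.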

Project Impact

The increased ease of test creation has allowed a substantial expansion of services, which now reach some 400 to 500 institutions. These users have received versions of the same examination whose comparability has been statistically established and in which they can therefore have confidence.

Lessons Learned

The wider variance in individual items generated from the same algorithm was not foreseen. Sample size may account for some of the variance, but the differences also raise interesting questions about whether students regard as comparable the same things that mathematicians do. Nevertheless, the project demonstrates that useful tests with multiple versions yielding comparable results can indeed be generated by computer. These tests can be created very quickly and made graphically attractive.

Project Continuation

Further work on the test items has improved the level of comparability for different versions of both the same test and individual items. Efforts continue to create calculator-based items that can be incorporated into the individual tests.

Available Information

The project's final report and/or the placement brochure describing the MAA Placement Test Program may be obtained from:

Linda H. Boyd
Mathematics Department
DeKalb College
555 North Indian Creek Drive
Clarkston, GA 30021
404-299-4167

Last Modified: 03/16/2007