The Massachusetts Board of Elementary and Secondary Education

Preliminary Information on Automated Test Scoring

To: Members of the Board of Elementary and Secondary Education
From: Jeffrey C. Riley, Commissioner
Date: May 11, 2018

At the May 22 Board meeting, I will present preliminary information on our initial investigations into the use of computers to help score English Language Arts essays on the next-generation MCAS tests. The use of automated scoring is under consideration for several reasons, especially the potential to report test results to students, their parents, and schools more quickly than at present. Over time, there is also the potential to achieve significant cost savings.

Since the beginning of the MCAS program, student responses to multiple-choice questions have been scored electronically, while open-ended questions have been scored by qualified, trained, and monitored human scorers in scoring centers. The scorers read hundreds of student responses and assign scores based on rubrics and student exemplars. This model is still in place for all of our MCAS tests.

A number of states and other testing programs have begun to adopt automated scoring systems, usually called engines. An engine assigns scores to essays after it has been calibrated for each test item on thousands of sample responses previously scored by human scorers. Like many other technologies, automated scoring engines have become increasingly reliable and refined, particularly over the last several years. The engines appear to offer advantages for large-scale assessment programs, including the ability to score quickly, to apply the same algorithm consistently, and to employ sophisticated routing techniques. For example, the engines can identify essays that are "difficult" to score, such as responses at the borders of score points and extremely high- or low-scoring papers, and automatically route those responses to expert human scorers. The engines can also be employed in a number of ways: as a back-up scoring mechanism, as a first score that is reread by a human, or as a score for only certain parts or characteristics of responses (spelling or grammar, for example).
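For readers who want a concrete picture of the routing idea, the following is a minimal illustrative sketch, not a description of any vendor's engine or of MCAS procedures. It assumes a hypothetical engine that returns a continuous score on a 0 to 5 scale; the function name `route`, the scale, and the `border_margin` threshold are all assumptions made for illustration.

```python
import math


def route(raw_score, min_point=0, max_point=5, border_margin=0.15):
    """Decide who scores a response, given a hypothetical engine output.

    raw_score     -- continuous engine score on the rubric scale (assumed)
    min_point     -- lowest score point on the rubric (assumed 0)
    max_point     -- highest score point on the rubric (assumed 5)
    border_margin -- how close to a border between two score points
                     counts as "difficult" (assumed value)

    Returns a (scorer, score_point) tuple, where scorer is "human" or "engine".
    """
    # Borders between adjacent score points sit at x.5 on this scale.
    fractional_part = raw_score - math.floor(raw_score)
    near_border = abs(fractional_part - 0.5) <= border_margin

    # Round half up to the nearest score point, then clamp to the rubric range.
    score_point = min(max_point, max(min_point, math.floor(raw_score + 0.5)))

    # Extremely high or low scoring papers also go to an expert human scorer.
    at_extreme = score_point in (min_point, max_point)

    if near_border or at_extreme:
        return ("human", score_point)
    return ("engine", score_point)


if __name__ == "__main__":
    for score in (2.1, 2.45, 4.9, 0.2, 3.0):
        print(score, route(score))
```

In this sketch, a response the engine places squarely within a score point keeps the engine's score, while one near a border or at either end of the scale is sent to a human. Where to set the margin is a policy choice that trades scoring speed against the number of human reads.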

It is critical to ensure that all parts of our testing program are as accurate, cost-effective, and beneficial to students, parents, and educators as possible, and that procedures are standardized to minimize variability. Although MCAS scorers undergo rigorous training and monitoring procedures to ensure valid scoring, further research may show that computer-assisted scoring can enhance the reliability of our scoring.

Over the next several years, we will continue to conduct analyses, engage in discussions with our technical advisors, and seek guidance from stakeholders throughout the Commonwealth. I will keep you informed as we move forward with our research into automated scoring and its potential uses for the MCAS program.

At the meeting on May 22, we will share our current scoring procedures, some details about how automated scoring works and how it could be incorporated into those procedures, the results of some initial analyses, and some of the potential risks and challenges we have identified.

Deputy Commissioner Jeff Wulfson and Associate Commissioner Michol Stapel will join us for the discussion and answer any questions you may have.