Original URL: http://www.nytimes.com/2003/10/15/education/15EDUC.html?ex=1067246043&ei

Trail of Clues Preceded Regents Fiasco
New York Times
October 15, 2003

In recent years, as New York introduced its new exams required for high school graduation, there were many signs of trouble in the state testing program.

In 2001, teachers complained that the scoring on the new Regents biology and earth science tests was too easy. In 2002, they complained that the scoring on the new physics test was too hard. Physics students tend to be the brightest in the state, and typically 15 percent had failed the old state test; 40 percent failed the new physics test.

Superintendents complained that there was no consistency. In Rye, a wealthy suburb, 79 percent of students scored at mastery level (85 percent or higher) on the state English test, 75 percent on the state history test, and 3 percent on physics. Even when many districts refused to give the new physics test and the state superintendents' association wrote colleges, recommending that they ignore the physics results, the state education commissioner, Dr. Richard P. Mills, would not budge.

Dr. Mills, one of the leading advocates of such testing in the nation, kept issuing upbeat news releases, saying his testing program was "statistically sound" and "in accordance with nationally accepted standards."

People who should have known better were fooled. New York's testing program was one of the first approved by the federal government under No Child Left Behind. In the spring, a survey of testing programs by Princeton Review ranked New York first of the 50 states. "Mills was hard core" about testing, said Steven Hodas of Princeton Review. "He had a take-no-prisoners attitude," he said.

No more. In June, it all came crashing down when the scores on the state Math A exam required for graduation were released. Two-thirds of students had failed, and the outcry was so great that Dr. Mills had to dump the results and rescale the test. Then came news that a record 47 percent had failed the June 2003 physics test and that that test, too, has to be rescaled.

Now the experts are looking at New York differently. "I'm stunned," said Mr. Hodas. "Frankly, I thought they were professionals." He said Princeton Review was doing a major overhaul of its rating standards. "We're going to have to come up with a fiasco index for a state like New York that messes up a lot of people's lives for no reason," he said.

What went wrong? The answer came last week from a special panel of math and testing experts appointed by Dr. Mills to investigate. The panel's short answer is that New York's testing program did not meet national standards and was not statistically sound.

Developing a test with a dependable pass/fail ratio first requires extensive field testing of sample exams to gauge the difficulty of each question and of the exam as a whole. And yet, the panel found, the state ignored the most basic testing guidelines. To prepare the math exam properly, 1,500 students should have been sampled in field tests, the panel said; only 250 were.

For field testing to be reliable, the panel noted, it must be conducted under conditions that simulate the stress of actual testing; the state field testing was done with students who knew it did not count and often did not bother to answer hard questions. The result? "New York State can't accurately predict performance on Math A," the report said.

The panel found the test was developed "on the cheap," with inadequate staff. The state has one testing expert with a 30-member staff to oversee 70 tests, said Dr. William J. Brosnan, the panel chairman and Northport schools superintendent. "The staff had to generate one document per person per day," Dr. Brosnan said. "You can't expect people to perform under those conditions."

The panel found so much staff turnover that no one remained from 1998, when the Math A test was created. "No one knew the most basic information about the development of the test," Dr. Brosnan said.

To compensate, the state hired consultants, but the panel found their work questionable. The consultants' report on the field tests should have been submitted three months before the exam to allow for adjustments, Dr. Brosnan said. "To this day we haven't seen the consultants' report," he said. "When we called to ask, we were told by the consultant that the individuals no longer worked there. It raised confidence issues."

Before items were included on the test, they were supposed to be approved by a state panel of teachers, but four items in the June exam were added at the end, without consulting those teachers. "Several of those items were poorly worded, confusing and answered incorrectly by large numbers of students," Dr. Brosnan said.

When the high failure rate became public, state officials suggested that it was because many weaker students who had previously failed the math test had taken it again in June 2003. But when the panel compared results from the strongest students taking the test, ninth graders, they found that this group scored 18 points lower than the ninth graders who took the June 2002 Math A test. "We were surprised the fluctuation was that bad," Dr. Brosnan said.

The state also did a poor job informing teachers how to prepare for the test. The panel found that state standards were too vague to be useful. With 103 subjects to be mastered for a 35-item test, Dr. Brosnan said, "teachers felt pressed to go a mile wide and an inch deep." Seeing a trigonometry standard, teachers spent weeks on the subject, Dr. Brosnan said. "And then there was not a single trig question. Teachers felt betrayed and kids were let down." The panel recommended completely rewriting the math standards.

At the Regents board meeting last week, Dr. Mills said most recommendations would be quickly carried out. For the first time, he acknowledged that the physics test needed to be rescaled, but he wanted to limit the rescaling to future tests. The board disagreed, voting to make the rescaling retroactive to the June 2002 test. "If there's an error," said Lorraine Cortes-Vazquez, a regent, "it's important for children to know that we, as adults, will admit it and correct it."