Sarah Garland/The Hechinger Report

In Washington, D.C., officials shortened a new teacher evaluation checklist after complaints from teachers and principals that it was too long and time consuming.

In Memphis, Tenn., after a year of piloting new evaluations and a summer of training, some principals and teachers remained confused and overwhelmed.

In Louisiana, one expert warned of lawsuits as the state began to roll out a truncated observation system without first testing it.

But in New Haven, Conn., union officials and reformers alike have praised a collaborative effort to help teachers improve under the city’s new rating system.

As Los Angeles officials and union leaders wrangle over the design of new teacher evaluations due to roll out citywide this year, the experiences of other states and districts offer both inspiration and lessons about what not to do.

“We have learned a lot over the last four years about how to do this effectively and well, and the changes we’ve made are reflective of that,” said Scott Thompson, deputy chief of teacher effectiveness in the D.C. Public Schools, which launched a new evaluation system in 2009.

More frequent and rigorous evaluations are part of a new national push to improve the quality of the teaching force. Two-thirds of states are in the process of adopting new evaluations, and many will include student achievement – usually as measured by standardized tests – along with intensive classroom observations. It’s unclear whether the new evaluations will have the desired effect. Even in places with a few years of experience using new systems, there is not enough data to tell for certain whether student achievement is improving as a result of the evaluations.

But early adopters say they have at least begun to pinpoint what hasn’t worked and what teachers and principals find most useful. In designing the Los Angeles Unified School District’s system, Drew Furedi, director of talent management for the district, said officials have looked at places like Hillsborough, Fla., and Washington, D.C., though “I wouldn’t say there’s any one place we modeled off of.”

Washington, D.C.’s experience may be particularly instructive to districts still in the process of designing evaluation systems. The city’s system has been overhauled twice in response to feedback – and problems.

The number of standards on which teachers are measured during a classroom observation was reduced to 18 because teachers found a checklist of 22 indicators too long and confusing. (Los Angeles plans to adopt a checklist with 61 indicators, though evaluators are supposed to focus on 21 of them.) The number of categories for teachers – ranging from “ineffective” to “highly effective” – increased from four to five in an effort to prevent inflation in the ratings. And to save time and lessen anxiety, teachers who have consistently scored well will no longer be observed as frequently as lower performers.

Tennessee also reduced the observation workload because principals felt overwhelmed.

“It may seem pretty obvious, but I think anybody started down this road will tell you this is a huge shift in the role of the principal,” said Sara Heyburn, an assistant commissioner in the Tennessee Department of Education. “We had to move quickly to train more people, and we allowed people to combine observations.”

One of the biggest shifts in D.C. was the decision last year to reduce the reliance on test scores in favor of other measures of student achievement that teachers will determine with their principals. Before, value-added measures, which calculate expected student growth on standardized tests, counted for 50 percent of a D.C. teacher’s rating. But value-added measures have been widely criticized as unreliable. Going forward, they will count for 35 percent of a teacher’s overall evaluation.

“Student performance will continue to be the largest piece of the pie,” Kaya Henderson, the D.C. schools chancellor, said in a statement when the change was announced in August. But, she said, “we are evolving that approach to now include multiple measures.”

Most systems combine two main factors in measuring a teacher’s performance: a rating based on at least one formal classroom observation and a rating meant to capture how much students learn during the year. Previously, most states called for evaluations that relied on a single observation, and tenured teachers were not observed every year.

In Los Angeles, the teachers union and district officials have reached a tentative agreement to use classroom observations and a mix of data, including raw test scores and whole-school performance, to rate teachers instead of value-added measures.

One of the most vexing problems that many education systems have faced is how to measure student growth, or learning, for the vast majority of teachers who don’t teach in tested subjects or grades.

In Florida, the state is simply developing more standardized tests. In Tennessee in 2011, teachers without individual value-added scores were rated on their school’s overall performance on standardized tests. But many teachers said this was unfair, according to a report by the state education department. So last summer, state officials recommended adding more tests, as long as they “benefit student performance.”

Other states have left it to districts or schools to create their own “student learning objectives,” such as portfolios of artwork or improvement in skills like playing scales on a trumpet.

But a pilot in Rhode Island demonstrated that it’s difficult to ensure that the learning objectives are rigorous. “The quality of our student learning objectives was not where we ultimately want them to be,” Rhode Island Education Commissioner Deborah Gist said in an interview with The Hechinger Report in 2011. “There’s no way to make it be entirely objective ever.”

Although hundreds of teachers have lost their jobs due to low ratings as new evaluations have gone into effect, the evaluations haven’t been the shock to the system that many educators expected. In Florida, for example, the percentage of teachers rated poorly rose by 1 percentage point in comparison with the old system, which had been criticized as too lenient. In Tennessee, 2.5 percent of teachers received one of the lowest two ratings (out of five) based on new classroom observations. Three-quarters of teachers fell into the top two categories. And one of the reasons D.C. changed its rating system last year was that the vast majority of teachers continued to be rated as either “effective” or “highly effective.”

“In the end, the anxiety about these systems is largely about the consequences they might carry,” said Timothy Daly, president of TNTP, a nonprofit advocacy group, which in 2009 published a report on teacher effectiveness that helped spur many of the new reforms. “And the truth is that very few teachers are in the position of facing any consequences, which raises the larger question of, ‘Are these ratings accurate?’ ”

At the same time, a nearly universal piece of advice from education officials in other districts and states is to work closely with teachers when designing the new evaluations. Dozens of teachers in New Haven, Conn., have left because they were rated poorly under the new evaluation system. But the union was a partner in developing it, and criticism has been muted compared to elsewhere.

“If you create a system that doesn’t have maximum teacher input, it doesn’t matter how technically sound it is,” said Dan Cruce, a former official in the Delaware Department of Education who now works for the nonprofit policy organization Hope Street Group. “It has to be raised and informed by teacher voices, because that’s who it’s designed for.”

The experiences so far with new evaluations suggest that districts should also expect to make changes as they go along. “The idea is that this is going to continuously improve, just like we expect our educators” to do, said Heyburn, of Tennessee. “You can plan for the hypotheticals, but it’s not till feet hit the ground that you learn the real lessons.”

This story was produced by The Hechinger Report, a nonprofit, nonpartisan education-news outlet based at Teachers College, Columbia University.


Sarah Garland is a staff writer at The Hechinger Report. She has written for publications including the Washington Post, the New York Times, Newsweek, and the American Prospect, and is the author of the book "Gangs in Garden City: How Immigration, Segregation and Youth Violence Are Changing American Suburbs."