Why Can't We Get Teacher Evals Right?

BY Erin Werra

Teacher evaluation has its roots in the Revolutionary era, when teachers were considered community servants whose methods were decided by popular demand.

The methodologies for evaluating teacher effectiveness have varied significantly in the centuries since, but the past decade has brought the issue to the forefront, as federal policies have forced state departments and school administrators to review and revise their evaluation processes. Most recently, the Every Student Succeeds Act (ESSA) put the onus back on states to devise their own teacher evaluation programs, but with no clear "best way to do it," we expect the results to be all over the board.

Thankfully, states have an abundance of recent and relevant research to draw from. Let's take a look at a handful of the most prominent studies from this decade and see if we can't peel some of the layers back to reveal common themes and takeaways.

The research

Here are the reports discussed in this analysis:

The Coverage of Classroom Management in Teacher Evaluation Rubrics: In this study, published in June 2018, Allison Gilmour, Caitlyn Majeika, Amanda Shaeffer, and Joseph Wehby dissected state teacher evaluation rubrics with an eye to classroom management topics.
Revisiting the Widget Effect: Teacher Evaluation Reforms and the Distribution of Teacher Effectiveness: Allison Gilmour and Matthew Kraft published this study in July 2016, describing what the authors call “the Widget Effect,” which treats teachers as interchangeable parts rather than as distinct professionals. Gilmour and Kraft interviewed principals and surveyed evaluators.
How Does Value Added Compare to Student Growth Percentiles?: Elias Walsh and Eric Isenberg completed this study in 2014. It compared teachers’ evaluation results using popular "value-added" formulas.
Improving Teaching Effectiveness: Final Report: This RAND / American Institutes for Research study, designed and funded by the Bill & Melinda Gates Foundation, was created by a team of authors (Stecher et al.) who monitored several schools between 2009 and 2016. The final report was published in 2018.

While these five studies shared the common theme of teacher evaluation, each approached the topic from a unique angle. By cross-referencing the results, we hope to identify common conclusions and opportunities for improvement.

Principals are stretched too thin

Researchers noted the difficulties inherent in expecting a single principal to give thoughtful, content-rich feedback to dozens of teachers (a gap that only widens in large schools). One study found only 3% of a typical principal’s work week was spent engaged in teacher evaluation (Hallinger et al., 2013).

In a separate study, 14 of the 24 principals who agreed to 45- to 60-minute interviews with researchers Kraft and Gilmour cited lack of time as a reason for avoiding rating teachers below proficient, making it the most frequently cited reason (Kraft & Gilmour, 2016). Principals lacked the time to devote to creating documentation and prescribing professional growth plans. According to one study’s conclusion, administrators found evaluation to be “difficult at best and counterproductive at worst” (Hallinger et al., 2013).

There simply aren’t enough hours in a day for principals to meet evaluation needs and complete their own duties. The principals interviewed described a triage method in which only the very lowest performers were rated below proficient, since the workload, paperwork, and time requirements grew sharply with each teacher rated below proficient (Kraft & Gilmour, 2016).

This burdensome red tape, coupled with the looming threat of union pushback and regional shortages of qualified candidates, can make it exceedingly difficult to replace the lowest performers with those who might have a stronger impact on their students. In the Gates Foundation study, only 1% of low-performing teachers were dismissed in the final year of the study. Districts even lowered expectations so fewer teachers would receive low marks that could lead to dismissal (Stecher et al., 2018).

Differing methodologies

Observation rubrics are quite popular with teachers and principals alike, but researchers have identified ways to improve them. Particularly regarding classroom management, rubrics aren’t effective without an explicit description of the standards expected (Gilmour et al., 2018).

The lack of consistent, individualized feedback may contribute to the skepticism teachers felt about evaluation. Between 2009 and 2016, the Gates Foundation provided grants and guidelines for what seemed like an effective evaluation plan, including the suggestion of developing career ladders for teachers. None of the sites studied actually created them, but some introduced related programs intended to increase leadership opportunities (Stecher et al., 2018).

Certain methods of evaluation apply complex formulas to measure the impact of a single teacher on a student in a given year. The formula varies depending on the model chosen. Two examples are the value-added model and the Colorado Growth Model, which uses student growth percentiles to determine how each teacher has aided students’ growth over time (Walsh & Isenberg, 2014). These models take into consideration pretest score, grade level, and certain student characteristics (free and reduced lunch eligibility, English language learner status, special education status, and attendance record) to compensate for factors beyond a teacher’s control (Walsh & Isenberg, 2014).
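For readers curious about the mechanics, here is a deliberately simplified sketch of the value-added idea. It is not the formula from Walsh and Isenberg or from any state's model: it assumes a single control (the pretest score), and the function name, data layout, and scores are all made up for illustration. The approach is to predict each student's posttest score from the pretest, then average how far each teacher's students landed above or below that prediction.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Estimate a per-teacher value-added score.

    records: list of (teacher, pretest, posttest) tuples.
    This layout is hypothetical; real models control for many more factors.
    """
    pre = [r[1] for r in records]
    post = [r[2] for r in records]
    pre_mean, post_mean = mean(pre), mean(post)

    # Fit a straight line (ordinary least squares) predicting posttest from pretest.
    sxx = sum((x - pre_mean) ** 2 for x in pre)
    sxy = sum((x - pre_mean) * (y - post_mean) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = post_mean - slope * pre_mean

    # A student's residual is the actual score minus the predicted score.
    residuals = defaultdict(list)
    for teacher, x, y in records:
        residuals[teacher].append(y - (intercept + slope * x))

    # A teacher's estimate is the average residual across their students.
    return {teacher: mean(vals) for teacher, vals in residuals.items()}


# Made-up scores for two teachers with three students each.
records = [
    ("Teacher A", 40, 55), ("Teacher A", 60, 72), ("Teacher A", 80, 90),
    ("Teacher B", 40, 45), ("Teacher B", 60, 63), ("Teacher B", 80, 82),
]
scores = value_added(records)
```

With this toy data, Teacher A's students score above their predicted results on average and Teacher B's score below, so the two estimates have opposite signs. Adding or removing controls changes the predictions and therefore the estimates, which is one reason different models can rate the same teacher differently.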

However, there are still complex challenges to consider when implementing a trustworthy value-added model. For example, secondary students typically have several teachers in a given day. Unique scheduling models may lead to an imbalance in the number of students each individual teacher is assigned. Finally, switching between value-added models resulted in 14.2% of teachers changing effectiveness ratings—in some cases, these distorted results could have consequences, including pay cuts and even dismissal (Walsh & Isenberg, 2014).

What we know for sure

Districts are already working on a lot of these components. The next step is combining them into a fair, effective system. For example, researcher and analyst Matthew Kraft predicts the strategy of creating or adopting rubrics for classroom evaluations will become a standard practice in evaluation. Sharing the same rubric during recruitment helps everyone start off on the same page.

We know teachers want feedback and have a desire to improve their craft, both for their own edification and to make a larger impact on student outcomes. They could benefit from leadership experience—and principals could use all the help they can get from experienced mentors. Creating opportunities for self-study may help fill the professional development gap without adding tremendous budget expenditures.

A high-level analysis of teacher evaluation research reveals some important takeaways:

Teachers could benefit from a more consistent, more effective coaching environment.
Administrators typically lack the time, the tools, and the skillsets to create such an environment.
Because teacher quality has such a strong correlation with academic outcomes, evaluation may well be one of the most important education issues of our lifetime.

Any kind of lasting, sustainable change in evaluation requires everyone to work together. What role can you play?

Follow-up Resource: The Elusive Coaching Culture

Learn how to foster a culture of continuous improvement in which everyone feels free to grow and thrive on constructive feedback.

ABOUT THE AUTHOR:

Erin Werra
Blogger, Researcher, and Edvocate

Erin Werra is a content writer and strategist at Skyward’s Advancing K12 blog. Her writing about K12 edtech, data, security, social-emotional learning, and leadership has appeared in THE Journal, District Administration, eSchool News, and more. She enjoys puzzling over details to make K12 edtech info accessible for all. Outside of edtech, she’s waxing poetic about motherhood, personality traits, and self-growth.