Sunday, January 10, 2016

Schools are not all alike -- but children need the same things

Schools are not all alike.  It has often been pointed out that many of those who advocate so-called "no excuses" schools, with rigid rules and harsh discipline, for some children send their own children to a very different kind of school, generally one with small class sizes and a curriculum emphasizing hands-on, project-based, engaged learning with opportunities for student choice.  Schools and districts also differ significantly in available resources and thus in class size, teacher quality, and opportunities for students.  Parallel to these differences, of course, and exacerbating them, is the gap in student background: economic advantage in more affluent districts and economic disadvantage in less affluent ones.

A telling example of these differences comes from my own experience in an affluent suburban district.  Several years ago, the state decided to house a group of homeless mothers and children in a local motel in our community.  We welcomed the children and did our best to integrate them into our school.  They were uniformly astonished at our clean, spacious school and surprised by the field trips and other opportunities.  One boy, who was 15 years old and in the 8th grade, had significant behavioral issues and was "reading" at a pre-first grade, "beginning phonemic awareness," level.  He was unable to participate in our regular 8th grade program, even with considerable assistance, so we hired one of our substitute teachers to work 1:1 with him, and designed a half-day schedule for him focused on reading, writing, and math, and providing for many basketball-playing breaks as rewards for completing work.  After six months in our school, he was able to read and write at approximately a beginning second grade level.  Unfortunately, at that point, the state pulled all the families from that motel and we lost track of him. Many years later, his name appeared in a local paper, unfortunately in connection with criminal acts.  I can't help but think that his story could have been different had he been able to remain in a school with sufficient resources to make a difference for him.

Think about this boy in a school without the resources to provide 1:1 help and a specially designed schedule -- imagine him in a large class of students, some of whom also have learning problems and behavior issues.  Then think about the teacher of that class and what he needs to help him give the children in the class the help they need.  What immediately comes to mind?  Will it help him the most to require annual standardized testing, with the scores published in the news media, so that he can see exactly how badly his students are performing?  And then to evaluate him based on his students' scores?  The theory of evaluating teachers and schools based on student test scores seems to be that making achievement differences obvious will force communities to provide sufficient resources to alleviate the disparity.  In reality, all it does is create a culture of blame and motivate teachers who can to move to districts with sufficient resources, thus exacerbating differences in teacher quality between more affluent and less affluent districts.

This boy, like all children, needed attention, caring, and help -- and because our district had the resources, we could provide those for him.  In too many other districts, the resources are not available, and labeling those schools and those teachers as substandard does nothing to fix the problem.

An excellent article in yesterday's New York Times Sunday Review, by David Kirp, gives an outstanding comparison.  The article is entitled "How to Fix the Country's Failing Schools. And How Not To," and compares the results of the top-down, "corporate reform" style approach in Newark to the "home-grown gradualism" approach in Union City.  The Union City approach, which appears to me to be based on educators working together, looking at the children's needs, and developing strategies to meet those needs, has been successful.  As noted in a wonderful Education Week article by Joanne Yatvin ("Catchers in the Rye," September 14, 1994):
"Where schools are failing, it is not because they don't have enough projects and programs, but because they have lost the human touch.  Children mired in the morass of family and community decay can't benefit from red ribbons, higher standards, or instructional technology; they need caring adults to pull them out of the muck and set them on solid ground -- one at a time. . ."
All children need attention, caring, and help in their growth, but not all schools are alike -- some have the resources and some do not.  Children with fewer family resources and more challenges need more support, not less, and schools serving children with greater needs need resources and help rather than blame.

Tuesday, January 5, 2016

“Value-Added” Measurement Has Little Value: Using These Numbers Negatively Impacts Real People in Real Schools

At the end of the last school year, I was chatting with two excellent teachers, and our conversation turned to the new state-mandated teacher evaluation system and its use of student “growth scores” (“Student Growth Percentiles” or “SGPs” in Massachusetts) to measure a teacher’s “impact on student learning.”
“Guess we didn’t have much of an impact this year,” said one teacher.
The other teacher added, “It makes you feel about this high,” showing a tiny space between her thumb and forefinger.
Throughout the school, comments were similar -- indicating that a major “impact” of the new evaluation system is demoralizing and discouraging teachers. (How do I know, by the way, that these two teachers are excellent?  I know because I worked with them as their principal – being in their classrooms, observing and offering feedback, talking to parents and students, and reviewing products demonstrating their students’ learning – all valuable ways of assessing a teacher’s “impact”.)

According to the Massachusetts Department of Elementary and Secondary Education (“DESE”), the new evaluation system’s goals include promoting the “growth and development of leaders and teachers,” and recognizing “excellence in teaching and leading.” The DESE website indicates that the DESE considers a teacher’s median SGP as an appropriate measure of that teacher’s “impact on student learning”:
“ESE has confidence that SGPs are a high quality measure of student growth. While the precision of a median SGP decreases with fewer students, median SGP based on 8-19 students still provides quality information that can be included in making a determination of an educator’s impact on students.”
Given the many concerns about the use of “value-added measurement” tools (such as SGPs) in teacher evaluation, this confidence is difficult to understand, particularly as applied to real teachers in real schools.  Considerable research notes the imprecision and variability of these measures as applied to the evaluation of individual teachers.  On the other side, experts argue that use of an “imperfect measure” is better than past evaluation methods.  Theories aside, I believe that the actual impact of this “measure” on real people in real schools is important.

As a principal, when I first heard of SGPs I was curious.  I wondered whether the data would actually filter out other factors affecting student performance, such as learning disabilities, English language proficiency, or behavioral challenges, and I wondered if the data would give me additional information useful in evaluating teachers.  
Unfortunately, I found that SGPs did not provide useful information about student growth or learning, and median SGPs were inconsistent and not correlated with teaching skill, at least for the teachers with whom I was working. In two consecutive years of SGP data from our Massachusetts elementary school:
- One 4th grade teacher had median SGPs of 37 (ELA) and 36 (math) in one year, and 61.5 and 79 the next year.  The first year’s class included students with disabilities and the next year’s did not.
- Two 4th grade teachers who co-teach their combined classes (teaching together, all students, all subjects) had widely differing median SGPs: one teacher had SGPs of 44 (ELA) and 42 (math) in the first year and 40 and 62.5 in the second, while the other teacher had SGPs of 61 and 50 in the first year and 41 and 45 in the second.
- A 5th grade teacher had median SGPs of 72.5 and 64 for two math classes in the first year, and 48.5, 26, and 57 for three math classes in the following year.  The second year’s classes included students with disabilities and English language learners, but the first year’s did not.
- Another 5th grade teacher had median SGPs of 45 and 43 for two ELA classes in the first year, and 72 and 64 in the second year. The first year’s classes included students with disabilities and students with behavioral challenges while the second year’s classes did not.
As an experienced observer/evaluator, I found that median SGPs did not correlate with teachers’ teaching skills but varied with class composition.  Stronger teachers had the same range of SGPs in their classes as teachers with weaker skills, and median SGPs for a new teacher with a less challenging class were higher than median SGPs for a highly skilled veteran teacher with a class that included English language learners.
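
For readers who want to check the arithmetic, here is a minimal Python sketch that tabulates the swings in the figures listed above.  It uses only the median SGPs reported for the 4th grade teachers.  The labels "Co-teacher A" and "Co-teacher B" and the function names are placeholders of my own for illustration; they are not part of any state reporting tool.

```python
# Median SGPs as reported in the list above, stored as (first year, second year).
# "Co-teacher A" and "Co-teacher B" are placeholder labels for the two
# 4th grade teachers who co-teach their combined classes.
reported = {
    "4th grade teacher, ELA": (37, 61.5),
    "4th grade teacher, math": (36, 79),
    "Co-teacher A, ELA": (44, 40),
    "Co-teacher A, math": (42, 62.5),
    "Co-teacher B, ELA": (61, 41),
    "Co-teacher B, math": (50, 45),
}


def year_over_year_change(pairs):
    """Return second-year minus first-year median SGP for each label."""
    return {label: second - first for label, (first, second) in pairs.items()}


def co_teacher_gap(subject, year_index):
    """Absolute difference between the two co-teachers' median SGPs.

    year_index is 0 for the first year and 1 for the second year.
    """
    a = reported[f"Co-teacher A, {subject}"][year_index]
    b = reported[f"Co-teacher B, {subject}"][year_index]
    return abs(a - b)


changes = year_over_year_change(reported)
for label, delta in changes.items():
    print(f"{label}: {delta:+} points")

for subject in ("ELA", "math"):
    for year_index in (0, 1):
        gap = co_teacher_gap(subject, year_index)
        print(f"Co-teacher gap, {subject}, year {year_index + 1}: {gap} points")
```

On these figures, the single 4th grade teacher's math median rises by 43 points from one year to the next, and the two co-teachers, who taught the same students together, differ by 17 points in ELA in the first year and by 17.5 points in math in the second.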

Furthermore, SGP data did not provide useful information regarding student growth. In analyzing students’ SGPs, I noticed obvious general patterns: students with disabilities had lower SGPs than students without disabilities, English language learners had lower SGPs than students fluent in English, students who had some kind of trauma that year (e.g., parents’ divorce) had lower SGPs, and students with behavioral/social issues had lower SGPs.  SGPs were correlated strongly with test performance: in one year, for example, the median ELA SGP for students in the “Advanced” category was 88, compared with 51.5 for “Proficient” students, 19.5 for “Needs Improvement,” and 5 for the “Warning” category.

There were also wide swings in student SGPs, not explainable except perhaps by differences in student performance on particular test days.  One student with disabilities had an SGP of 1 in the first year and 71 in the next.  Another student had SGPs of 4 in ELA and 94 in math in 4th grade, and SGPs of 50 in ELA and 4 in math in 5th grade.  Both students had consistent district test scores.

So how does this “information” impact real people in a real school?  As a principal, I found that it added nothing to what I already knew about the teaching and learning in my school.  Using these numbers for teacher evaluation does, however, negatively impact schools: it demoralizes and discourages teachers, and it has the potential to affect class and teacher assignments. 

In real schools, student and teacher assignments are not random.  Students are grouped for specific purposes, and teachers are assigned classes for particular reasons. Students with disabilities and English language learners are often grouped to allow specialists, such as the speech/language teacher or the ELL teacher, to work more effectively with them.  Students with behavioral issues are sometimes placed in special classes, and are often assigned to teachers who work particularly well with them.  Leveled classes (AP, honors, remedial) create different student combinations, and teachers are assigned particular classes based on the administrator’s judgment of which teachers will do the best with which classes. For example, I would assign new or struggling teachers less challenging classes so I could work successfully with them on improving their skills.

In the past, when I told teachers that they had a particularly challenging class because they could best work with those students, they generally accepted the challenge cheerfully and felt complimented on their skills.  Now, those teachers could be concerned about the effect of that class on their evaluations.  Teachers may be reluctant to teach lower level courses, or to work with English language learners or students with behavioral issues, and administrators may hesitate to assign the most challenging classes to the most skilled teachers.

In short, in my experience, the use of this type of “value-added” measurement provides no useful information and has a negative impact on real teachers and real administrators in real schools. If “data” is not only not useful, but actively harmful, to those who are supposedly benefitting from using it, what is the point?  Why is this continuing?