Fact Check: 3 Examples of Data Gone Wrong
by Advancing K12 Staff
You’ve probably heard the claim that the average human attention span has dropped to just eight seconds, shorter than that of a goldfish. It’s a pretty alarming stat (albeit a relatable one), except...it’s probably neither true nor meaningful, and nobody knows where it came from. Even so, trusted media outlets continue to perpetuate the myth long after it’s been busted.
Piscine attention spans might explain the near-universal temptation to accept data at face value, especially when it shows up in outlets we know and trust. This is a credible source, we convince ourselves. The facts are unimpeachable!
We use data to validate our opinions, support our arguments, and even make important decisions with lasting consequences. But we don’t always do a great job entering, vetting, or interpreting that data. In the paraphrased words of The Princess Bride’s Inigo Montoya, “I don’t think that data means what you think it means.”
Consider these three recent examples of problematic data:
1) An elementary school gets the wrong grade
Source: Dallas News, 2017
What happened?
A coding error placed one transfer student in the wrong racial/ethnic group, dropping Dallas ISD’s Mount Auburn Elementary School from “Met Standard” to “Improvement Required” in the Texas Education Agency’s weighted accountability ratings.
In the end, this data entry mistake resulted in no lasting harm, but only because the school’s new principal caught the data error shortly after being assigned to the campus. Considering the potential sanctions Mount Auburn could have faced (to say nothing of the damage done to its reputation) had it remained in the “Improvement Required” category for an extended period of time, the quick resolution was a fortunate break for all involved.
What can we learn?
It’s unlikely the person who entered the data initially was even aware of the potential ramifications, which makes for a great case study in the importance of appropriate training, checks, and balances for data entry. In an age when codes and reports can have a direct impact on your funding and your reputation, accuracy is paramount. School and district leaders can benefit from awareness of both how the data gets entered and which critical metrics are affected by which data points.
2) Burnt marshmallows: A failure to replicate
Source: Psychological Science, 2018
What happened?
In what is less an example of “bad data” and more an example of data being misinterpreted, the pervasiveness of delayed gratification strategies in elementary education can trace its roots to a 1990 psychology paper from Yuichi Shoda, Walter Mischel, and Philip Peake. Drawing on Mischel’s work at Stanford from the 1970s and 80s, the paper revisited the famous “marshmallow test,” in which a marshmallow was placed in front of a child, who was promised a second treat if they could resist the temptation to eat the first one for a given period of time.
The 1990 paper, Predicting Adolescent Cognitive and Self-Regulatory Competencies From Preschool Delay of Gratification: Identifying Diagnostic Conditions, followed up with 185 students who had taken the marshmallow test between 1968 and 1974. Among other findings, the paper reported a positive correlation between “delay time” and SAT scores. Despite the authors’ repeated attempts to emphasize the need for caution (“The value and importance given to SAT scores in our culture make caution essential before generalizing from the present study….”), policy makers, curriculum leaders, and interventionists pounced on it, developing content and crafting strategies to help students learn how to delay gratification.
When researchers failed to replicate the results in the above-referenced 2018 study (showing only half the correlation of the original and almost no statistical significance after accounting for other factors such as family background and home environment), the media had a field day. All marshmallow puns aside, if the original study hadn’t been cherry-picked and misinterpreted from the moment it went mainstream, we might have found a better use for the time and energy so many educators poured into it.
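The shrinking correlation is worth pausing on. A minimal simulation (purely illustrative, not the study’s actual data) shows how a confounder like family background can generate a raw correlation between “delay time” and “SAT scores” that largely disappears once the confounder is controlled for:

```python
# Hypothetical sketch: a shared confounder inflates a raw correlation.
# All variable names and coefficients here are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 900

background = rng.normal(size=n)                  # confounder (e.g., family background)
delay = 0.6 * background + rng.normal(size=n)    # "delay time", driven partly by confounder
sat = 0.6 * background + rng.normal(size=n)      # "SAT score", driven partly by confounder

# Raw correlation: looks like delay predicts SAT.
raw_r = np.corrcoef(delay, sat)[0, 1]

def residuals(y, x):
    """Remove the part of y explained by x (simple least-squares slope)."""
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    return y - slope * x

# Partial correlation: correlate what's left after regressing out the confounder.
partial_r = np.corrcoef(residuals(delay, background),
                        residuals(sat, background))[0, 1]

print("raw:", round(raw_r, 2), "after controlling:", round(partial_r, 2))
```

In this toy setup the raw correlation is substantial while the partial correlation hovers near zero, echoing the pattern the replication reported once family background and home environment were factored in.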
What can we learn?
The marshmallow test is an important case study in two common data literacy issues with academic research. First, it exemplifies the replication crisis ripping through so many scientific fields right now (psychology chief among them). Second, it is a cautionary tale of too many people putting too much stock in what amounts to a very limited study. In this case, the second issue is even more egregious because the authors emphatically and repeatedly addressed those limitations, going so far as to include the line, “further replications with other populations, cohorts, and testing conditions seem necessary next steps.” It took almost 30 years, but we finally got those replications, and they appear to be right in line with the original recommendation of caution.
3) The school shootings that weren’t
Source: National Public Radio, 2018
What happened?
Gun violence in schools is a disturbing topic to write about, but we can’t tell the story of education in the 21st century without it. The gravity of the discussion makes accurate data that much more important—literal lives could well hang in the balance. When the U.S. Department of Education published its School Climate and Safety report in early 2018 (based on 2015-2016 data from the Civil Rights Data Collection), this line stood out as unbelievable to many readers:
“Nearly 240 schools reported at least 1 incident involving a school-related shooting…”
Could it be? We have undoubtedly seen more than enough stories about school shootings in the news, but that number seems extreme.
When NPR reached out to the schools in question, it was only able to confirm 11 reported incidents. In some cases, glaring discrepancies were easily explained (e.g., the 37 shootings reported by Cleveland Metropolitan School District, the result of data being entered on the wrong line of the form, or the 26 shootings in the Ventura Unified School District, where an administrator said “someone pushed the wrong button”). In other cases, the government’s instructions were simply misinterpreted, with districts reporting cap gun incidents, instances outside of the measured dates, and even a student taking a picture of himself with a gun as “school-related shootings” during the reporting period.
What can we learn?
This is Occam’s razor at its finest. The biggest takeaway for the education discussion is this: If a number doesn’t look right, no matter how credible the source, it’s okay to ask questions. State and federal reporting is an onerous process for many schools and districts, and even one false assumption or misinterpretation can lead to misreported numbers. The agency on the receiving end often lacks the resources to scrutinize and clarify every single data point, leaving it to rely on the veracity of what each district reports. It falls on all of us to ensure we’re not making data-driven decisions based on bad data.
The famed author Anton Chekhov once said, “Knowledge is of no value unless you put it into practice.” But when the “knowledge” is based on false assumptions, problematic sources, or a misinterpretation of facts, the resulting “practice” is likely to exacerbate the original problem or cause new ones.
Data literacy and news literacy might be two of the most important skills we can transfer to students, but we have to start with ourselves. As we learned from the “goldfish attention span,” the spread of misinformation can be as difficult to stop as the proverbial runaway train. Let’s work together to keep these conversations on the rails.
Follow-up resource: More articles for data enthusiasts
Is data your thing? Want to stay on top of the latest in privacy, strategy, and interoperability? Bookmark the Advancing K12 Data Showcase page and subscribe to our newsletter for more like this.