📌 The Grade publishes media commentary on Wednesdays & a roundup of the week's best education coverage on Fridays. Sign up here! 📌

The story behind Burbio, the school data company journalists rely on

Digging deep into a little-known company that became the media’s go-to source for school closing data during the pandemic.
 
By Betsy Ladyzhets
  

In fall 2020, a small, little-known company based in Pelham, New York, with no education or health ties suddenly became a key source for journalists covering the pandemic’s impact on K-12 schools.
 
Early citations of this firm, called Burbio, came from USA Today, Chalkbeat, and CBS Evening News. Soon, reporters at the New York Times, US News, ABC, NPR, and others were relying on Burbio for statistics summarizing where schools were open for in-person versus remote learning.
 
In the intervening months, Burbio has also provided data to news outlets regarding mask and vaccine requirements, as well as ESSER spending. Many education journalists eagerly await issues of the company’s Sunday newsletter, using it as a key source for story ideas.
 
Burbio stepped up to fill a void left by the federal government, which has never collected comprehensive or real-time data on COVID-19 in schools despite its unique capacity to do so.
 
But from a science, health, and data reporter’s perspective, Burbio’s efforts raise several red flags. The company does not clearly disclose its dataset’s limitations, nor does it disclose its funding sources. Its data are not publicly available for researchers to vet. The popular data on school “disruptions” are easy to misinterpret when cited without context.
 
If any researcher attempted to publish reports with these issues in a scientific journal, they would need to carefully caveat their results to get the work accepted. If any news outlet cited these data without alerting readers to the data’s drawbacks or potential conflicts of interest, it would be considered irresponsible.
 
And yet, journalists and even the federal government now rely on Burbio’s data.
 
The need for data about school closings and reopenings first became apparent in summer 2020, as schools prepared to bring students back to physical classrooms after a uniformly remote spring semester.
 
Education Week, which had created a much-used school closing tracker during the spring of 2020, stopped updating information in May 2020.
 
Under the Trump administration, the federal Department of Education had explicitly abdicated its responsibility for producing data.
 
Biweekly state-level data from the Census Bureau’s Household Pulse Survey provided only an indirect measure of school disruption.
 
“Education reporters were just desperate” for comprehensive data, recalled USA Today education reporter Alia Wong.
 
Journalists were eager for new sources of information — a need that would continue into the Biden administration. Burbio felt capable of taking on the challenge.
 
An ideal system for tracking COVID-19’s impact on schools would be based on regular surveys of individual school districts conducted by the federal government, said Emily Oster, Brown University economics professor and leader of other school COVID-19 data efforts.
 
“It is difficult to know exactly what’s going on in K-12 schools, whether that’s, ‘Are they open?’ or, ‘How much COVID is there?’” Oster explained. “The reason it’s hard is that it’s very decentralized. And there’s not a standard reporting system.”
 
Surveying all 16,000 school districts in the country on a weekly basis would be highly time-consuming. But there’s a lower-lift alternative: pulling out a subset of large districts’ COVID-19 policies from their website or social media feeds.
 
Burbio — the name comes from the word “suburbs” — had pre-pandemic experience in this realm, as it had pulled and compiled data on events occurring at “well over 100,000 school, government, library, and community groups,” said co-founder Dennis Roche. Roche and his team sold the resulting databases to real estate and media companies.
 
Inspired in part by the COVID Tracking Project at The Atlantic, Burbio’s COVID-19 school tracking process is all manual, without automated web scraping. A team of 15 people works full-time on the tracker, according to Roche. They visit school district websites, social media pages, and other information sources, then log any changes in the districts’ learning modes and COVID-19 policies.
 
Each member of the team has an assigned set of school districts that they check every 72 hours, said Julie Roche, Burbio’s co-founder and data team leader. “By now they know their school districts by heart, and they know whether it’s a district that tends to just have pop-up announcements on the homepage, or the superintendent sends out weekly newsletters.”
 
Burbio’s tracking efforts focus on 5,000 core school districts, out of more than 16,000 total districts in the U.S. Of those 5,000, about 3,000 represent the largest K-12 public school districts in the country, Julie Roche said. The remaining 2,000 represent more of a variety, including smaller rural and tribal districts.
 
“It’s really meant to be a representative sample of everything going across the country,” she said.
 
While the Roches do not have a background in education or health data specifically, Julie Roche spent 15 years working in market research for firms such as Procter & Gamble, and Dennis Roche has previously worked on measuring consumer behavior for media companies.
 
Burbio’s method “seems to work pretty well but is not foolproof,” said NPR education correspondent Anya Kamenetz in an email about her use of the data. “I consider them to be the most comprehensive source that is regularly updated.” [emphasis hers]
 
Similarly, Zach Parolin, a social policy researcher affiliated with Columbia University and Bocconi University (in Italy) who has also worked on school closure data, said, “Burbio does about as good as you can do without having either (1) the ability to require school districts to provide information or (2) other large-sample, nationally-representative data on teaching methods.”
 
Indeed, Burbio’s dataset covers as many public school students as possible through its focus on the country’s largest districts. But this method misses smaller and more rural districts, which may be more likely to be operating with fewer safety measures – and are also less likely to have the resources to update the public websites or social media pages that would be caught by Burbio’s team.
 
For Burbio’s popular data on in-person learning disruptions, the team supplements its regular checks with Google, social media, and local news searches aimed at identifying school closures that occur outside the 5,000 core districts.
 
But the “disruptions” Burbio reports have a fairly broad definition: they include both switches from in-person to remote learning and full-blown school or district closures, and could be tied to high case numbers, staff shortages, or some combination of the two.
 
Burbio’s public dashboard does not specify.
 
The dashboard also typically doesn’t indicate how many school buildings – or how many students – are affected by a single “disruption.”
 
In early January, for instance, Chicago’s school district (the third largest in the nation) shut down entirely because of union negotiations. During this period, a note on the Burbio dashboard alerted users that about 650 of 5,400 closures in the week of Jan. 2 were singularly tied to the Chicago union, rather than being 650 distinct disruptions.
 
This note was later removed, because time had passed and the Chicago situation “is now one of many different unique disruptions,” the Roches explained in an email. Still, information on these unique disruptions should be readily available to reporters citing the data.
 
In addition, when it comes to school learning modes, Burbio reports only the options that are available to students, not the actual numbers of students learning in person or remotely. For instance, in late 2020, New York City schools would have been labeled as hybrid even though the majority of students were still learning entirely online.
 
When Burbio started tracking the pandemic’s impact on schools, the business was filling a void that “should not have been allowed by the federal government to last as long as it did,” said Nat Malkus, an education policy researcher at the American Enterprise Institute (AEI).
 
But now, Dennis Roche says that the federal government is Burbio’s largest client.
 
The U.S. Centers for Disease Control and Prevention signed an October 2021 contract for more than $600,000 for Burbio’s tracking during the 2021-22 school year, and the agency has contributed funding to Burbio since April 2021, according to a CDC spokesperson.
 
The company’s data are the primary source for a School Learning Modalities dashboard on the Department of Health and Human Services (HHS)’s COVID-19 data website.
 
The CDC utilizes a model that incorporates Burbio’s data along with information from other school trackers “to resolve issues of missing data and enhance overall data coverage when combining data from disparate sources,” the CDC representative said.
 
For example, MCH Strategic Data’s research effort tracking COVID-19 impacts on schools is also a major source for the dashboard. Its statistics are all publicly downloadable, and the company’s methodology is far more comprehensive but less current: its researchers call every public school district every two weeks.
 
The AEI’s Return to Learn Tracker is another source for the federal government. This tracker covers school COVID-19 policies at 8,600 districts, compiled by scraping district websites on a weekly basis, Malkus said. While the Return to Learn team also tracked school learning modes in the 2020-21 school year, it now focuses on masks and other measures.
 
Burbio does not publicize its CDC funding on its website because the Roches consider the agency one of their many clients. As a private business, Burbio is not required to disclose its funding the way academics and nonprofit institutions are.
 
The CDC connection adds to Burbio’s credibility in some ways. Still, in the interests of full transparency for a major source of statistics in news stories, journalists and researchers – as well as their readers – should be aware of the company’s funding.
 
Burbio also doesn’t make its full data set available to the public. Through data agreements with the company, several research groups — such as Tulane’s Reach Center and Stanford’s Center for Education Policy Analysis — have vetted and utilized Burbio’s work. However, if Burbio were completely transparent, anyone would be able to download the data.

News outlets that cite Burbio often refer to the company as a “tracker” or “aggregator,” while failing to elaborate on the dataset’s limitations.
 
In a January 2021 story, NPR’s Kamenetz noted that Burbio “scrapes a selection of school websites” to compile its data, though “scraping” may not be the accurate term here because Burbio’s data collection is manual.
 
USA Today provided more details in a feature assessing school reopening’s success. For example, the story notes that Burbio’s data “don’t account for all the students who quarantined while their schools remained open, and may be an undercount in districts that do not share data publicly.”
 
“Always be as transparent as possible about the caveats,” said Wong, who worked on this story. She also recommended talking to academic researchers who are studying COVID-19 in schools and can provide expertise in thinking through big questions on this topic.
 
Journalists should also cross-check Burbio’s data against other sources, Parolin and Malkus both said. These may include the HHS dashboard, MCH Strategic Data, Return to Learn, and the School Closure & Distance Learning Database, which Parolin and a collaborator compiled from anonymized cell phone data.
 
Burbio’s de facto role as COVID-19 schools data expert sets a dangerous precedent for the future of education research and policymaking: a private company should not supplant the federal government in providing key information.

Or, if the government is relying on a private company, all the underlying data supported with public tax dollars should be made freely and openly available.

Every reference to Burbio should include a reminder of the data’s limitations and the government’s failure.
 
Journalists, at minimum, should note that Burbio’s data reflect only a sampling of school districts that is skewed towards larger, urban schools. They should explain that Burbio’s data only show COVID-19 policies available to students in each district, not what students are actually doing. They should explain how the CDC uses Burbio’s data. And they should supplement Burbio’s statistics with findings from other sources that are more comprehensive and transparent.
 
This company’s role as the go-to data source for education coverage is also a reminder that convenience shouldn’t trump accuracy or disclosure, especially when there are other data sources available.

Betsy Ladyzhets is a science, health, and data journalist focused on COVID-19. She runs the COVID-19 Data Dispatch, a publication that provides news, resources, and original reporting on pandemic data. She's also a journalism fellow at Documenting COVID-19, a public records, data, and investigative project supported by the Brown Institute for Media Innovation and MuckRock. Her work has appeared in Science News, FiveThirtyEight, MIT Technology Review, the COVID Tracking Project, and other national outlets.
 
Previously from The Grade
 
Making the map: How EdWeek devised a must-have pandemic resource (March 2020)
3 ways to measure school reopening trends (November 2020)
School disruption rates by state, August through December (January 2021)
 

 

Copyright © 2020 Alexander Russo's The Grade. All rights reserved.

Our mailing address is:
The Grade
742 Washington Avenue # 2L
Brooklyn, NY 11238
