Why Trump’s undermining of US statistics is so dangerous | Daniel Malinsky

In 1937, Joseph Stalin commissioned a sweeping census of the Soviet Union. The data reflected some uncomfortable facts – in particular, the dampening of population growth in areas devastated by the 1933 famine – and so Stalin’s government suppressed the release of the survey results. Several high-level government statistical workers responsible for the census were subsequently imprisoned and apparently executed. Though the Soviet authorities would proudly trumpet national statistics that glorified the USSR’s achievements, any numbers that did not fit the preferred narrative were buried.

A few weeks ago, following the release of “disappointing” jobs data from the Bureau of Labor Statistics (BLS), Donald Trump fired the commissioner of labor statistics, Dr Erika McEntarfer, and claimed the numbers were “rigged”. He also announced his intention to commission an unprecedented off-schedule census of the US population (these happen every 10 years, and the next is scheduled for 2030), emphasizing that this census “will not count illegal immigrants”. The real goal is presumably to deliver a set of population estimates that could be used to reapportion congressional seats and districts ahead of the 2026 midterm elections and ensure conditions favorable to Republican control of Congress – though it is not clear there is sufficient time or support from Congress to make this happen. The administration is also reportedly “updating” the National Climate Assessments, and various important sources of data on climate and public health have disappeared. In addition to all this, Trump’s justice department launched an investigation into the crime statistics of the DC Metropolitan Police, alleging that the widely reported decline in DC violent crime in 2024 – the lowest total number of recorded violent crimes city-wide in 30 years – is a distortion, fueled by falsified or manipulated statistics.

One might say that the charge of “fake data” is just a close cousin of “fake news”, and that all of this is par for the course for an administration that insists an alternate reality is the truth. But this pattern may also beget a particularly troubling (and quintessentially Soviet) state of affairs: the public belief that all “political” data are fake, that one generally cannot trust statistics. We must resist this paradigm shift, because it mainly serves to entrench authoritarianism.

It eventually became a common sentiment in the Soviet Union that one could never trust “the official numbers” because they were largely manipulated to serve political interests. (At least, this is the sentiment reported by my parents, who grew up in the Soviet Baltic states during the 1960s and 1970s – I was an infant when we left in the late 80s, so I cannot report much first-hand.) One upshot of this kind of collective belief, if it takes hold, is that it makes one’s informational world quite small: if you can only trust what you can verify directly, namely what you experience yourself or hear from trusted friends and family, it is difficult to broaden your view to include the experiences of people in circumstances very different from yours. This kind of parochial world with few shared reference points is bad for democracy and for building solidarity across groups. It also makes it easier for an oppressive state to plant false and divisive “facts” to serve its goals: a fake crime wave here, a booming economy there, and though most people may disbelieve these claims, they do not quite believe the opposite either. No one can credibly claim or contest any socially relevant trend because all numbers are fake, so the activities of claiming and contesting become pointless – just do what you can get away with.

A political culture with no trust in data or statistics is also one that will rely more heavily on opaque decisions made by elites behind closed doors. In his influential historical study of the rise of quantitative bureaucracy, the historian Theodore Porter points out that basing policy decisions on calculated numerical costs and benefits reduces the role of “local” discretion and can have a homogenizing effect, which can strengthen centralized state control. The flip side of this coin is that it also divests people in power of part of their authority by enabling a degree of public transparency and scrutability: if a huge government project must be justified by reference to cost-benefit calculations, those calculations can be cross-checked and challenged by various parties. If a government agency requires documentation of progress on initiatives, proof that public funds are being spent appropriately, and evidence of who benefits and by how much, there is substantially less room for plain corruption and mismanagement, provided that independent parties have access to the relevant information. Without credible data that reflects the facts on the ground, how can the public push back against an invented “crisis” narrative, concocted to justify the invocation of emergency powers?

Anyone who spends time working with data is acutely aware that there are many choices to be made in its collection and processing – numerous “decision points” about what to include, how exactly to define or measure things, and so on. Indeed, insofar as data is used to tell stories about complex things such as the state of the economy or the health of a population, different data collection or analysis choices can to some extent lend support to different narratives, including pre-determined narratives if an unscrupulous analyst is set on it. But it does not follow that “anything goes” or that statistics are meaningless. There are better and worse ways to collect and analyze data, and both reasonable and preposterous ways to answer empirical questions such as “are crime rates in DC going up or down?” Most importantly, when government statistics are managed by qualified and non-partisan officials and the relevant numbers can be challenged, debated and contested, we have a democratic basis for guiding our institutions toward better policy decisions. Data of public importance must be publicly accessible, not hidden from view.

Trump’s assault on the integrity of data is not the worst of his ongoing abuses – the public should be more immediately outraged by masked agents disappearing people on the streets and the national guard occupying city centers – but this pattern of actions vis-à-vis official statistics should be extremely alarming. It is a slow boil: if we reach the point where nobody trusts numbers because it’s all “fake data”, it will be too late to resist and too difficult to undo the damage. The opposition must block the appointment of unqualified and clearly biased nominees to lead the BLS and other agencies responsible for data stewardship. We must resist undue interference in data gathering, whether at the level of the US census or at the level of city government. Beyond that, we should be investing in initiatives that strengthen public trust in, and understanding of, the social, economic and environmental data that can guide decisions affecting our communities’ wellbeing.

  • Daniel Malinsky is an assistant professor of biostatistics in the Mailman School of Public Health at Columbia University
