Cathy O’Neil was a numbers geek from the get-go. Math camp as a teen. A math major in college. A job for a few years on Wall Street guiding hedge fund decisions as a quantitative analyst, or quant.
Then came the losses, panic and market collapse of 2008 — and a sudden feeling of shame.
Mathematicians, as O’Neil saw it, shared the blame, affirming the triple-A ratings accorded shaky mortgage-backed securities and facilitating the unhealthy growth of the nation’s housing bubble. “It was a losing-sleep-at-night, sick-at-your-stomach feeling. Like, what have we done?” she recalls. “If we didn’t have mathematicians in on the game, sort of stamping approval, we’d have had a much smaller problem. It disgusted me.”
She didn’t stop loving math. Just the way it was — and still is — being used.
O’Neil poured her concerns into the book “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy,” released last September. It’s a lament over today’s reliance on complex, often flawed algorithms and mathematical models, and here’s a takeaway: It hits closer to home than Wall Street.
That mortgage you’re paying off? You might have been stuck with a higher rate because of a calculation identifying people across your ZIP code as riskier borrowers. Maybe you didn’t get a job for which you were certain you were qualified; you were done in by the way you answered a personality test or by a lagging credit score from which assumptions were made about your trustworthiness and dependability.
Teachers are judged by their students’ test scores in reading and math. Colleges are ranked according to an algorithm drawn up by U.S. News & World Report. Prison sentences are affected by recidivism models that factor in not only prior arrests and convictions but also where defendants live, other brushes with police and the criminal records of friends and family.
That last instance bothers O’Neil the most. Criminal history and associations predict repeat offenses if that history entails violent crimes. But the numbers can be inflated by less serious offenses such as vagrancy, small-time drug sales and use, and aggressive panhandling. The poor and minorities become undue targets.
“It’s deeply biased. And people are losing their liberty, losing their freedom, unfairly,” O’Neil says. “It doesn’t get much more tragic than that.”
Simply put, algorithms are reducing many of life’s functions to a score. Few people are aware of their pervasive use or that many of the models are ill-conceived, and fewer still have the understanding or inclination to question them. Self-policing would be nice. But the intent of such statistical modeling is increased efficiency and cost-effectiveness, and O’Neil writes that practitioners in government and especially corporate America aren’t inclined to mess with the bottom line.
In Kansas City, data mining is perhaps most notably employed in crime deterrence. The Kansas City No Violence Alliance (KC NoVa), whose partnering agencies include the police department and Jackson County prosecutor’s office, digests “documented police contacts” and computer-assisted findings on friendships, social media use and other activities to identify and communicate with high-risk individuals.
It’s not a true algorithm, prosecutor’s office spokesman Mike Mansur says. He also insists that the data focus on violent crime.
The program’s effectiveness can be debated. Homicides in Kansas City hit an 18-year high in 2016 and are on pace to climb further this year. At the same time, gang- and other group-related murders are down at least 10 percent since KC NoVa was launched in 2013.
O’Neil still sees the upside of Big Data, from helping people hit their musical sweet spots on Pandora to identifying the most probable households for child abuse. “I think about algorithms and A.I. (artificial intelligence) as being akin to the early car industry where the vehicles being put out were not road safe. People would drive them away from the dealerships and get into accidents and die,” she says. “We’re still in the era before people realize that algorithms aren’t safe and we need safety standards.”
O’Neil, 44, left her job at hedge fund powerhouse D.E. Shaw in 2009. Living in New York City with her husband, a Columbia University math professor, and their three sons, she blogs on mathbabe.org and launched a risk consulting and algorithmic auditing firm less than a year ago.
She recently discussed “Weapons of Math Destruction” and how Big Data is affecting lives. Excerpts are edited for length:
Q: The book is affirmation for everybody who hated math in school, right?
A: I hope not. That was not my goal. In fact, that’s the opposite of my goal, which was to defend the honor of math. It’s been deployed dishonestly, so I can understand why people are angry and why they start to distrust math. But I’m trying to explain: It’s not math that’s being dishonest with you. It’s marketing, mostly.
Q: Has the book made a difference? Are you and others who are preaching caution about algorithms giving people pause?
A: I think it has made a difference in a general-public skepticism kind of way. But let me put it this way: I haven’t seen any venture capitalists refusing to fund startups with Big Data business models because of the book. They’ve insulated themselves from bad news in so many ways that it’s hard to really address them.
Q: Are you hearing from other WMD victims, wanting to share their stories, since the book came out?
A: Oh yeah, that’s happened a lot. I get a lot of emails. I also get an enormous amount of love from professors, especially in sociology, anthropology and philosophy, (and in) mathematics, computer science and statistics, who have read the book and had their suspicions confirmed. Many of them are using it as a textbook for their students. So I think the ideas are being disseminated. It’s just not changing people’s business practices yet, and I think part of that is Trump becoming president.
Q: Is it difficult for data scientists to accept the downside of their work? Does that impede progress? Or is it more on the corporate side?
A: You’ve identified something really important. Sometimes, it’s more profitable to be unfair than it is to be fair, and thinking very hard about this stuff is not good for the bottom line. Fairness is a cost. It’s an added restraint.
There also are quite a few examples, like the value-added models for rating teachers and recidivism risk algorithms, that really aren’t about profit per se. You’d think they would care about fairness and justice. I think there’s been some movement in that direction; I don’t think it’s a totally lost cause. In the case of teacher value-added models, I’m guessing that is slowing down.
Q: Talk a little more about recidivism risk scores.
A: I write a lot about bad proxies in the book, and arrest records are bad proxies for crime. Lots of crime goes unpunished and doesn’t lead to arrests. You have whites and blacks smoking pot at similar rates but blacks getting arrested five times more often. That’s a strong bias. If you’re an average kid who smokes pot and you’re black, you’re more likely to have a record than if you’re white. And the fact we ignore that speaks volumes.
Q: You point to Amazon and pro sports teams, which embrace predictive analytics, as two entities trying to get it right, that assess and tweak their mathematical models. What does it say that they have a conscience and the criminal justice system doesn’t?
A: It’s interesting that you phrase it that way. It’s not that they’re trying to be nice people. They’re trying to make their systems as efficient and effective as possible, and those incentives are aligned with having better algorithms. Where, if you think about the Department of Justice, the data is all over the place with state and federal prisons and (local) jails. It is kind of a mess with regard to data collection. Even if it weren’t, the incentive there always has been “cover your ass” more than “make the system efficient.” I think that’s a huge difference.
Q: Does this cry for government regulation? And how likely is that when sentiment in Washington may be toward less regulation?
A: In an era where the EPA is getting no love, it’s hard to imagine protecting ourselves from things that are less concrete than chemical pollutants in rivers. At the same time, when enough harm is demonstrated, people demand protection.
Q: Do we hope corporations get a conscience?
A: No, no, no. That’s not going to happen. Think about the tobacco industry. I’m fully expecting that kind of lobbying effort to continue. Sadly, one approach to thinking about protection against horrible algorithms is that we’re going to have to wait until the evidence is as strong as it was that smoking causes lung cancer.
Steve Wieberg, a former reporter for USA Today, is a writer and editor for the Kansas City Public Library.
Join the discussion
The Kansas City Star partners with the Kansas City Public Library to present a book-of-the-moment selection every six to eight weeks. We invite the community to read along. Kaite Mediatore Stover, the library’s director of readers’ services, will lead a discussion of “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy” by Cathy O’Neil at 6:30 p.m. May 15 at the Mid-America Regional Council (MARC), 600 Broadway. If you would like to attend, email Stover at email@example.com.
O’Neil will participate via Skype. The discussion also will feature MARC senior researcher and kceconomy.com blogger Jeff Pinkerton.
From Chapter 10 of “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy” by Cathy O’Neil, published by Crown. Here, she writes about the prospect of Facebook, with its 1.9 billion users, gaming our political system.
“The potential for Facebook to hold sway over our politics extends beyond its placement of news and its Get Out the Vote campaigns. In 2012, researchers experimented on 680,000 Facebook users to see if the updates in their news feeds could affect their mood. It was already clear from laboratory experiments that moods are contagious. Being around a grump is likely to turn you into one, if only briefly. But would such contagions spread online?
“Using linguistic software, Facebook sorted positive (stoked!) and negative (bummed!) updates. They then reduced the volume of downbeat postings in half of the news feeds, while reducing the cheerful quotient in the others. When they studied the users’ subsequent posting behavior, they found evidence that the doctored news feeds had indeed altered their moods. Those who had seen fewer cheerful updates produced more negative posts. A similar pattern emerged on the positive side. … In other words, Facebook’s algorithms can affect how millions of people feel, and those people won’t know that it’s happening. What would occur if they played with people’s emotions on Election Day?”