Artificial Intelligence: Going beyond the data to fix the flaws
9 June 2021 | 14min
Flaws in artificial intelligence (AI) exist due to bad or historical data and a naïve assumption that AI, like humans, can explore the unknown
In order to use the power of AI to solve problems, these problems and how to solve them first need to be understood on a human level
AI is no longer in its infancy and the chance to bring together the best of what people do and the best of what machines do to improve people’s lives is here
Artificial intelligence (AI) can do a superhuman job at a fraction of the cost of a human being. Yet there are flaws and concerns about AI, and in healthcare settings this rightfully causes hesitancy in adopting AI-based tools.
While AI can only be as good as the data used to train it, there is another aspect at play – the people behind it. Vivienne Ming shares why defining the right question and knowing how to solve it must be done on the human level before expecting AI to get it right for us.
AI flaws stem from more than just the data
HT: Considering the expression, “AI and machine learning are only as good as the data you feed it”, what projects have you recently been involved in that highlight this and what are the implications?
Vivienne Ming: While there is a notion that machine learning and AI are only as good as the data you give it, I want to go even further to say, it’s only as good as the people you put behind it.
It’s the people that are making the decisions about the data and about what problem is being solved. Here’s this very modern issue in machine learning, that goes by a fairly fancy title of “underspecification”, and it’s a formal challenge that machine learning has to overcome. It’s almost entirely about people and data, not about algorithms.
Since it is so clear, I’m going to start with the example of hiring to help illustrate this. One of the very first cases where AI was used for hiring was with a company called Gild [1] – their platform combed through all the data you can collect about an applicant, everything that’s public, everything that might have been shared with them – and would use that to help make better and faster hiring decisions. Think of it as precision medicine, except the prescription is who gets the job.
Decades of research, as well as Gild’s own internal user testing, revealed that recruiters focus almost exclusively on 3 elements when reviewing a resume: your name, school and last job [2].
Based on hundreds of thousands of data points we collected for over 100 million people, we found that those 3 factors are incredibly predictive of whether you will get hired, but not predictive of the quality of work you’ll do [3, 4].
The issue is similar in a study that aimed to use predictive analytics to call back patients who would benefit most from follow-up checkups. They had a lot of data on all the patients and a noble goal to bring these people back in before a minor health issue became a serious one.
The presumption, naive but well-intentioned, was that the amount of money being spent on these patients would be a good indicator of how serious their problems are.
So, when we train our algorithm to decide who to bring back in for a follow-up, we focus on those that we’ve ended up spending more money on. This is similar to training an algorithm to identify who to bring back for an interview based on who got the job historically.
Well, guess what? It turns out, it’s the same population in both cases.
The root of biased AI begins with people
Vivienne Ming: The people, at least in the tech industry, who historically got brought back in for an interview and got the job were white men, and the people who historically had more money spent on them were white patients. Neither case is meant to indicate that there was racism at play in doctors’ decisions in the moment, but historically, of course, there was.
Women were not given tech jobs. Non-whites, particularly blacks and Latinos, were not given tech jobs, and so even though people were now using algorithms specifically to find new candidates, what ended up happening is the algorithms wanted the same old people to come back in to interview for the jobs. That is exactly what happened for this public health preventative medicine algorithm, because historically, more money had been spent on white patients.
There’s a huge body of research about why this is. For instance, pain was taken more seriously if a white person said they were in pain. Pain was taken more seriously when a man said they were in pain. For a wide variety of reasons, there is bad historical data on health, on jobs, and on education.
When you pair this with the very naive idea that algorithms are somehow magically unbiased and don’t fall prey to the same mistakes that we make, when in fact they also just learn from history, then algorithms will inevitably be flawed. AI is taking bad data and producing bad diagnostics, bad imaging. But the truth is, it starts with people.
How would an algorithm know that, in history, more money was being spent on one group than another? It couldn’t. It would take us going back and thinking hard: what do we want in the future that might be different from what we had in the past? That’s a fundamentally creative process. It’s one of exploring the unknown, and that is the one thing machine learning cannot do today.
AI is an astonishing system for turning an unbelievably large number of numbers into more numbers, and that is more valuable than it might sound, but it cannot explore the unknown.
Being thoughtful, being human is fundamentally required to overcome bias
Vivienne Ming: We need human beings to be a part of this process in both of these cases, hiring and prophylactic diagnosis. In the hiring system Amazon built based on these same principles, it would not hire women and nothing they could do could remove that flaw from the system [5].
In the case of one algorithm developed to identify patients for follow-ups, it was discovered there was a strong bias to bring in white patients who had no more risk than their black neighbors [6]. The reason for the bias was that the developers had assumed costlier patients were sicker, but failed to recognize that more money was being spent on white patients. As a result, this unintended but horrific error could be directly attributable to the death of thousands of people who weren’t brought in for these follow-up visits.
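As a rough illustration of the proxy problem described above, the following Python sketch simulates two groups of patients with the same distribution of illness but different historical spending, then compares who is selected for follow-up when ranking by cost versus ranking by illness. Everything in it is hypothetical: the group labels, the `spend_ratio` of 0.7, the cost formula, and the function name `simulate` are assumptions chosen for illustration, not the data or method of the algorithm discussed in the interview.

```python
import random


def simulate(n_per_group=5000, spend_ratio=0.7, top_fraction=0.1, seed=0):
    """Return the share of group B among patients selected by cost and by illness.

    All numbers here are made up for illustration only.
    """
    rng = random.Random(seed)
    patients = []
    for group in ("A", "B"):
        for _ in range(n_per_group):
            # Both groups draw illness severity from the same distribution.
            illness = rng.gauss(0.0, 1.0)
            # Assumption: spending tracks illness, but less is spent on
            # group B for the same level of illness.
            scale = 1.0 if group == "A" else spend_ratio
            cost = scale * (10.0 + 2.0 * illness) + rng.gauss(0.0, 0.5)
            patients.append((group, illness, cost))

    k = int(len(patients) * top_fraction)
    # "Sickest" patients according to the cost proxy.
    top_by_cost = sorted(patients, key=lambda p: p[2], reverse=True)[:k]
    # Sickest patients according to the true (usually unobserved) illness.
    top_by_illness = sorted(patients, key=lambda p: p[1], reverse=True)[:k]

    def share_of_b(selected):
        return sum(1 for p in selected if p[0] == "B") / len(selected)

    return share_of_b(top_by_cost), share_of_b(top_by_illness)


share_by_cost, share_by_illness = simulate()
print(f"Group B share when ranking by cost:    {share_by_cost:.3f}")
print(f"Group B share when ranking by illness: {share_by_illness:.3f}")
```

In this toy setup, ranking by cost selects very few patients from group B even though both groups are equally sick, while ranking by the illness score itself selects the two groups about equally. The ranking step is doing exactly what it was asked to do; the flaw is in the human choice of what to rank by.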
So, these sorts of things have consequences even though no one intended for there to be bias in the system. Being thoughtful, being human with that data is what’s fundamentally required, because I don’t think we can use the excuse anymore: “Oh, my goodness, how could we have known?” Well, now we do.
This is not a baby or infant field anymore; we’ve learned a great deal, and now we have the chance to really bring together the best of what people can do and the best of what machines can do in a way that can improve people’s lives.
Optimizing AI to solve questions the way we need it to
HT: How do we make sure that the values that the AI or algorithms are being optimized to have are the same values that we as the patients or doctors expect? In other words, how do we ensure alignment?
Vivienne Ming: There’s a term I use as a technical problem called underdeterminacy, which I’ll explain as a metaphor from my life. I’m a neuroscientist. I happen to use AI to study the brain, and nowadays to study a great many things. But for me, it’s always been an applied art. It is messy, arcane and complex, and there isn’t an absolute right way to do it because it is so complicated and there are billions and billions of data points, and hundreds of thousands, sometimes millions of parameters in these networks.
Imagine the biggest companies in the world and trying to get them all aligned in a precise way to do one thing. A million employees all passing information in a highly controlled way. That’s sort of what a deep neural network is, which is the modern face of AI.
When you’re doing this in a human domain – we’re talking about things like hiring, medicine, education, anything that involves the complexity of psychology or politics – suddenly these problems stemming from the complexity take on a new face, and it becomes very hard to actually articulate what the problem is, even for us humans.
Identifying the problem is critical to finding an answer, but not as simple as it seems
Vivienne Ming: What is the problem behind sexism and hiring? What is the problem of what’s actually going on with COVID-19? I think that the general public has this sense of, “Gosh, this is just a virus like the flu, why has it taken us so long, and we still don’t feel like we truly know what’s going on? Why is this all so complicated?”
As doctors and scientists, we understand these things are phenomenally complicated. The possible ways in which this little virus is interacting with our bodies are almost innumerable.
We have this problem in general in AI. There are so many ways, so many factors that could be at play in hiring, in medicine, and beyond, that AI doesn’t know the difference. It doesn’t know the problem any better than we do. What it’s doing mathematically, is trying to find this one place. There’s a giant map and it’s trying to find one location on that map, which is the best possible solution to a problem.
Here’s where we come back to this issue with the data again. If I ask AI, “look at these X-rays, and tell me whether it looks like this person has COVID-19?” The AI is good at running through all the training data and examples you give it but, when you go out into the real world, suddenly they fail. And they fail in novel ways that you have never seen before.
Ultimately, in any meaningful human problem there are orders of magnitude more wrong solutions than there are right ones. If we as humans don’t know what a right answer looks like, how could we possibly imagine an algorithm will figure it out for itself?
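The failure mode described above, where a model looks right on its training examples and then breaks in the real world, can be sketched with a deliberately tiny Python example. Two hypothetical decision rules agree perfectly on training data in which a genuine signal and an irrelevant shortcut happen to coincide, so the data alone cannot say which rule is the right one. The field names `signal` and `shortcut` and both rules are assumptions made up for illustration; this is not a real diagnostic model.

```python
def accuracy(rule, rows):
    """Fraction of rows where the rule's prediction matches the label."""
    return sum(rule(row) == row["label"] for row in rows) / len(rows)


# Training data: the shortcut (say, a marker that only one hospital's
# scanner leaves on its images) happens to match the label every time.
train = [
    {"signal": s, "shortcut": s, "label": s}
    for s in (0, 1)
    for _ in range(50)
]

# Deployment data: the label still follows the signal, but the shortcut
# is now unrelated to it.
deploy = [
    {"signal": s, "shortcut": c, "label": s}
    for s in (0, 1)
    for c in (0, 1)
    for _ in range(25)
]


def uses_signal(row):
    # Hypothetical rule that relies on the genuine cause.
    return row["signal"]


def uses_shortcut(row):
    # Hypothetical rule that relies on the coincidental marker.
    return row["shortcut"]


for name, rule in (("signal", uses_signal), ("shortcut", uses_shortcut)):
    print(name, "train:", accuracy(rule, train), "deploy:", accuracy(rule, deploy))
```

On the training rows both rules score 1.0. On the deployment rows, where the shortcut no longer tracks the label, the shortcut rule drops to 0.5, which is chance for a balanced two-class problem. Nothing in the training data distinguishes the two rules; only a person who knows what the right answer should depend on can.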
There have been huge advances in fields like reinforcement learning, and what’s called generative adversarial models where two AIs essentially play games against one another. One tries to trick the other. The main model tries to diagnose COVID-19, and the other model tries to trick it, and they can get incredibly good with the data they’ve been exposed to.
Of course, we’ve all heard of things like AlphaGo and AlphaZero from DeepMind that can play Go better than any human can play, or play StarCraft or Halo, the computer games, comparable to the best human players in the world. The thing is though, if you changed even a tiny bit of those games, they’d fall apart, and the real world is full of those complexities and unknown situations.
AI cannot possibly know the right answers if we don’t. It is a tool. But, if we can articulate the right answers correctly, it can do a superhuman job, and it can do it at a fraction of the cost of a real human being.
Aligning AI with making people better
Vivienne Ming: In a domain like medicine, public health and wellness, it’s fundamental that we align AI with making people better. And I mean this in every sense, but in this context, I mean to support nurses, doctors, and patients. It shouldn’t be an oracle that gives you a pronouncement and you just carry out its orders without any idea what’s going on. We should build systems which are, for example, transparent, so you know why it’s suggesting what it’s suggesting.
Effective AI empowers people to make their own decisions
Vivienne Ming: That same sort of concept is something I’m working on right now, which is a project around gender and health that combines genetics, hormones, neuroimaging, and behaviour to try and build an incredibly rich model of what it means to be you. Because it turns out gender and biological sex both influence a great many things: mental health as well as biological, traditional medical health.
For example, women often have more challenges with bipolar disorder than men do, but in fact, the same kinds of precursors and predictors in women can be identified in men, and it actually drives similar predispositions. Realizing that it’s not about whether you identify as a man or as a woman, or whether you’ve got certain chromosomes, we can get at the underlying causal mechanisms.
Then we can go back and empower a patient and say, “Hey, we’ve learned these things about you, about the richness that is your personal identity, and here are some things that you should really pay attention to, and here are some actions you can take that have been shown to be effective with individuals like you.”
Genetics is not a prediction of the future. People with a genetic predisposition, whether it’s to bipolar disorder or obesity, are in fact more responsive in many cases to intervention than people that aren’t. It’s incredibly powerful to use AI in this rich way. To learn about someone, but then, not to just send them a bunch of numbers or a report, but instead use it so that they can learn more about themselves and make their own decisions. For all of this to be really effective, you have to start in a human-centered way.
Where to start in creating an AI solution to solve a problem
HT: In your experience, how do you best go about solving a problem or challenge with AI? What advice would you give to those wanting to do the same?
Vivienne Ming: When we start a new project at Socos Labs, and we work in education and workforce, health and mental health and everything you can imagine involving people, we never start with the data. We never start with the algorithms. I start with two core things.
The first thing I do is read every scientific paper I can find on the subject because we know a lot more than I think we often give ourselves credit for. As neuroscientists, we know so much more about the brain than we knew before, and there is still so much more to learn.
We complement these deep dives into the research literature with equally deep dives into the lived experience of the problem, even before there is data. My favorite example is the Make-A-Wish Foundation. Does granting a wish to a dying child actually increase survival rates? Research suggests that it does, so we want to go even deeper, for instance, what about the nature of the wish?
We have these tantalizing clues that for a certain child, a social wish, such as bringing your best friends with you to Disneyland, has that impact. But for another child, a narrative wish may produce the best outcome. If a child wanted to meet Captain America, instead of a simple meet-and-greet with Chris Evans, our interventions suggested getting together with Captain America to fight a crime for a day, making it a narrative experience for the child. These more impactful wishes have a profound ripple effect, changing other community outcomes like divorce rates and survival rates of these families.
That project didn’t start with data, because quite frankly, the Make-A-Wish Foundation really didn’t have any data to work with. They weren’t the kind of organization to build massive data pipelines; they just wanted to bring a smile in a dark moment. So, that project started with us going out and watching the wish granters ring doorbells, and seeing the heartbreak going on in those families. In watching those human moments, we were looking for not just where to collect data but when to add a positive insight back into the process. Until we knew this, until we’d explored that unknown question, there was no place for data or algorithms.
Then at that point, we can begin to look at things like, what if we get photos of the families, photos that are already being taken by journalists, and we monitor things such as touching? Relationship psychology has demonstrated for years that changes in touching rates between family members are predictive of family stability. So if we’re worried about whether a family might be breaking up or staying together, or if we just want to know what kind of wish improved their odds, we can actually get it out of these photos.
But we never went to the data side until we understood the human side.
Vivienne Ming, PhD
Frequently featured for her research and inventions in The Financial Times, The Atlantic, The Guardian, Quartz and the New York Times, Dr. Vivienne Ming is a theoretical neuroscientist, entrepreneur, and author. She co-founded Socos Labs, her fifth company, an independent institute exploring the future of human potential. In her free time, Vivienne has invented AI systems to help treat her diabetic son, predict manic episodes in bipolar sufferers weeks in advance, and reunite orphan refugees with extended family members.
1. Richtel. (2013). Article available from https://www.nytimes.com/2013/04/28/technology/how-big-data-is-playing-recruiter-for-specialized-workers.html [Accessed May 2021]
2. The Ladders. (2012). Study report available from https://www.bu.edu/com/files/2018/10/TheLadders-EyeTracking-StudyC2.pdf [Accessed May 2021]
3. Van Iddekinge et al. (2019). Pers Psychol 72, 571-598
4. Socos Labs. Company website available from https://socos.org/ [Accessed May 2021]
5. Dastin. (2018). Article available from https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G [Accessed May 2021]
6. Obermeyer et al. (2019). Science 366, 447-453