Episode 90: The Potential for Bias in AI with Clare Corthell
Hi everyone, it's Jon Prial from the Georgian Impact Podcast. A few months ago, I spoke to Clare Corthell, the founder of Luminant Data, about bias in machine learning models, what the implications of that bias can be, and how to avoid it in the first place. That was episode number 74. You should definitely check it out if you haven't already heard it. Now, I know that you're busy, which is why we try to keep our podcast short and sweet. But sometimes we have extra content that we think is compelling enough to share as a bonus episode. This is a good example. Here we pick things up where the conversation left off. We talk about the potential for bias in AI-powered services like Airbnb, how to leverage AI and human capabilities, and how to recognize when the objectives of a system can end up having a negative impact. It's only nine minutes, but it's worth a listen. Here it is.
You know, we talked about the spectrum of where tech just works, and in my mind that could be robo-investing. There aren't a lot of humans there, and you'll know if one robo-investing company is better than another by looking at the results over the year, so I'm not struggling in terms of bias there. It's pretty clear what the results are. But what about the elements of bias in Airbnb, in terms of people having biases around names and who they rent to? What's your sense of how something like that needs to be tackled?

So there's this concept that comes up more and more in the machine learning and fairness community that has to do with assisting human decision-makers and the subtleties that come into decision systems. For example, the most common place where we see decision-assisting systems is in healthcare, and it involves things like, "This person's diabetes has trended upward; maybe we should take a deeper look at what's going on here," and calling attention to certain things. That gets really complicated in healthcare, but I think your example with Airbnb is quite appropriate. One of the things that you can think about in a system like that is: how is that person assessing that scenario? Are they looking at that person's name and a picture and deciding whether they trust them based on that alone? If that's true, how do we change the basis of their decision so that they can make a better one? Because it's possible that Airbnb has a perfect understanding of which people should be rented to. I would guess that they have a better understanding than someone who's renting out their property, right? So how do you take that special knowledge and supplement the decision of that user and say, "Hey, this person has rented this many times. They've never had to default on the deposit because they've always been super clean. They've always left the apartment cleaner than they found it." There have been those types of comments added to rental sites. They're unstructured, but really important to add to a data set. The point is to feed the decisions that people are making with the right information and not let them default to these cultural biases.

Now, we talked earlier about decision-making in hospitals and looking at patients without a lot of data. I think about the old IBM Watson Jeopardy game. Although Watson played the game, the cool part was where they showed you the top three answers, each with a likelihood. Watson would always pick the number one answer and say this one has a 72% likelihood, option two might be at 42%, and often when Watson was wrong, option two was the correct one. So my view is that the really great way for AI to help doctors or nurses or medical professionals make decisions is not to say, "Here's the answer," but rather, "Here's the range of things," and allow your human skills to be augmented and do a better job. Just like chess players: in advanced chess, humans and computers working together can beat either a human or a computer alone. Same kind of thing, right?

Right, right. I like this point too, because it's possible that the computer is, as you say, finding the right answer; it just doesn't have a good sense of which answers are the most likely. It has some menu of things that includes the correct one; it just can't quite get to the single correct answer, and a human might be able to get there really easily. So just narrowing that field and making that decision more at hand makes a lot of sense. The other thing that I think is important to pull into this, which is something that Atul Gawande talks about a lot (he has a book called Complications that I think is about 15 years old), is the cognitive bias that doctors have. Doctors are more likely to diagnose things that they are familiar with or have seen recently. So there's a recency bias in medicine in particular, and technologies like Watson (perhaps; I'm not very familiar with it) could allow doctors to spot more rare conditions, which are harder to diagnose simply because of that human cognitive bias. Bringing a supplemental decision-making system into the mix can supplement that.

You and I are both big fans of Cathy O'Neil and her book Weapons of Math Destruction. She obviously has concerns about biased data, and one of the chapters talks about biased data that was driving stop-and-frisk, which at the time was considered very successful. It's since been stopped in some cities, and according to recent news articles, the alternative, which I guess it's fair to call community-engaged policing, is actually having better results. How do you think about that as a case study as humans begin to think about the data they bring to any system?
I think this is a really, really rich example. Just a side note: I was just at Sundance and saw the world premiere of a film called Crime + Punishment, which is about exactly this. It's a perfect distillation of how a metric can drive very, very poor outcomes. There were quotas in the NYPD around stop-and-frisk for a long time, and part of what the film shows is the impact of those quotas existing. Despite the quotas being officially removed from the system, there are still departments that base their work on quotas, because it's a very cut-and-dried, easy way to manage a group of people: you brought in X things this month; you either hit the threshold or you didn't; you're either on track and doing well in your position or you're not. It's, in a sense, a lazy metric that isn't evolving to fit the outcome that you actually want. I think there are different outcomes that police forces want, but the primary one is to keep the peace. You're thinking about keeping the peace, and yet you're holding these quotas and saying we need to bring in this many people this month for violent crime. If that many people have not committed violent crimes, you will end up bringing in innocent people. That will happen, right? So that metric starts to reinforce an outcome that is unintended and is not in line with the goal of keeping the peace. The film does a great job of illustrating that, but I think for our purposes it's important to talk about the optimization metric, which is essentially what we're talking about when we're thinking about quotas. The goal of almost any system is impossible to encode in strict metric terms, and it needs to evolve with the application or with the group of people. What I think is appealing about community engagement policing is that it is much closer to that outcome of keeping the peace. It measures itself on something much closer to peace and much more distant from the tools of policing: handcuffs, being able to bring someone into a courtroom, and all of these other implements that we use in extreme circumstances. But the goal is really not to use them. The goal is to ensure that society is safe.
Thanks for listening. If you like what we're doing, we'd appreciate you telling other people about the show. Better yet, give us a five-star rating on iTunes or write a review. Doing so will really help ensure that more people can find us. And if you haven't already, please subscribe to the Impact Podcast on iTunes, SoundCloud, or wherever you go to find your podcasts.
A few months ago, Jon Prial spoke with Clare Corthell, the founder of Luminant Data, about bias in machine learning models, what the implications of that bias can be, and how to avoid it in the first place. It was episode 74 and you should definitely check it out if you haven’t already heard it. In this short bonus episode of the Impact Podcast, they pick up where the conversation left off.
You’ll hear about:
- The potential for bias in AI-powered services like Airbnb
- How to leverage AI and human capabilities
- Recognizing how the objectives of a system can have a negative impact
Who Is Clare Corthell?
Clare Corthell is a Data Scientist and Engineer with experience leading teams, building custom products for diverse business needs, building machine learning and natural language processing systems, and defining corporate data strategy. Over the last few years, she has built and managed Luminant Data, consulting on building more transparent predictive systems. Clients include startups, software companies, media businesses, R&D firms, and international nonprofits. Trained as a product designer, she focuses on building intelligent products with iterative, testable prototypes on the path to production. In addition, Clare is the author of The Open Source Data Science Masters, leads initiatives to combat algorithmic harm and improve transparency, and built a ridesharing NGO in Nairobi.