Ep 141: How To Understand and Fix Biased AI
Join the discussion: Ask Nick and Jordan questions about AI
Check out the upcoming Everyday AI Livestream lineup
Connect with Nick Schmidt: LinkedIn Profile
Artificial intelligence (AI) has become an integral part of modern business, offering advanced solutions to complex problems. However, discussions surrounding bias and discrimination in AI models have brought to light the potential ethical and societal implications of these technologies. As AI continues to evolve, it is crucial for businesses and decision-makers to understand and address bias in AI models to ensure fairness and inclusivity.
Biased AI models can produce discriminatory outcomes, which can have significant real-world impacts. Understanding the root causes of bias in AI is essential for identifying and mitigating these issues. Input data, societal biases, and the rapid development of large language models are among the primary contributing factors to bias in AI. It is paramount for businesses to critically evaluate the sources and fairness of input data to prevent the perpetuation of biases in AI systems.
Addressing bias in AI models requires strategic and practical approaches. To ensure fairness, businesses can engage in a burden-shifting process, leveraging legal frameworks that emphasize the responsibility of various stakeholders in addressing bias. Additionally, companies can employ AI software designed to detect and mitigate bias, offering a proactive means of identifying and rectifying discriminatory outcomes.
Furthermore, the dialogue surrounding AI bias highlights the need for best practices and strategies to prevent bias in everyday AI applications. By implementing rigorous testing for bias and continually refining the AI development process, businesses can work towards mitigating the impact of bias in their AI systems.
Ultimately, businesses and decision-makers play a pivotal role in addressing bias in AI models. By prioritizing fairness and inclusivity, and being critical of the potential biases in AI-generated content, organizations can contribute to fostering a more equitable and transparent AI landscape.
As AI continues to advance, the onus falls on businesses and leaders to proactively address bias in AI models and ensure that these technologies reflect the diversity and inclusivity of the communities they serve.
Topics Covered in This Episode
1. Prevalence of Bias in AI Models
2. Detection and Mitigation of Bias in Algorithms
3. Practical Solutions for Addressing Bias in AI
Jordan Wilson [00:00:19]:
Why are AI models so biased? It’s pretty bad. Right? Like, whether we’re talking about ChatGPT or AI image generators or just about any large language model, we see so many biases and prejudices come out in these models. And today, we’re gonna talk about maybe how to understand and fix biased AI. I’m really excited for today’s conversation. It’s gonna be a good one. It’s actually a first for Everyday AI, which I’m excited about. So, hey, welcome to Everyday AI. My name is Jordan Wilson.
Jordan Wilson [00:00:53]:
I’m the host, and this is your daily livestream, podcast, and free daily newsletter helping everyday people like you and me learn and leverage generative AI. So let’s learn today about why AI models are biased. But before we do, a quick reminder: if you’re joining us live, thank you. If you’re listening on the podcast, make sure, as always, to check your show notes. We’re always gonna leave links to some other great resources where you can read more about today’s episode and listen to related episodes as well.
Daily AI news
So, before we get into today’s show, let’s go over, as we do every day, the AI news. Alright. So, is AI coming to your hand? Yeah.
Jordan Wilson [00:01:38]:
Kind of. So startup company Humane is set to debut the highly anticipated Humane AI Pin. It’s a new AI-powered smart device that is set to launch today for a price of $699 and a monthly subscription fee of $24. It is a screenless device that aims to go kind of beyond the typical smartphone by providing a lot of AI features, such as translation services and music streaming. The way this works is you kind of wear it on your shirt, and it literally beams the information to the palm of your hand as the display. I’m not sure if I want it because I have kind of, like, fat weird hands, so I’m not sure if I wanna stare at them all day. That’s just me. Alright.
Jordan Wilson [00:02:23]:
We have chip news. Yay. We love chip news on Everyday AI. So, NVIDIA may be providing chips to China, according to a recent report from the Financial Times. This is newsworthy because there were some recent restrictions on certain chip exports to China, but some leaked documents that the Financial Times is reporting on show that NVIDIA has developed three new chips aimed at growing demand for AI technology in China while at the same time complying with these new US export controls. So, kinda making both parties happy. Alright. Last but not least, in our AI news for today, animated films are about to get a lot cheaper.
Jordan Wilson [00:03:04]:
Thanks to AI. So, recently, the DreamWorks cofounder, Jeffrey Katzenberg, said that AI will help cut the cost of animated films by 90%. So be prepared for a flood of hopefully higher quality and even more animated films coming your way, especially as there are these ongoing feuds with the actors’ unions and the screenwriting guilds and all these things. So probably a lot more animated movies coming our way. And, hey, I’m an adult, and I’d say bring it on. I mean, like, have you guys seen Coco? You know? Thanks, AI. Hopefully, we’ll see a lot more of that. Alright.
Jordan Wilson [00:03:41]:
We didn’t come here to talk about animated films. Actually, probably the exact opposite. We came here to talk about how to understand and fix biased AI. And I’m extremely excited to bring on our guest for today. But before I do, if you’re tuning in, why do you think AI is biased? Get your questions in now. It’s not often we have someone that can talk about biases in AI. So I want you all who are tuning in live to get your questions answered, so make sure you get them in. And also, at the same time now, help me welcome to the show.
About Nick and Solas AI
Jordan Wilson [00:04:13]:
Let’s bring him in. There we go. We have Nick Schmidt, who is the founder and CTO of Solas AI. Nick, thank you for joining the show.
Nick Schmidt [00:04:25]:
Thank you very much, Jordan. I’m looking forward to the conversation.
Jordan Wilson [00:04:28]:
Oh, it’s gonna be a good one. I’m excited. Yes. So, real quick, tell us just a little bit about what Solas AI does.
Nick Schmidt [00:04:37]:
So what Solas AI does, the idea is that algorithms can be biased. That can cause discrimination and unfairness, and we want to find out if that’s happening and then fix it. And so what the software does is it goes into an algorithm and asks: is there evidence of discrimination? And that can be defined in a number of ways. But is there evidence of a problem? If there’s not, that’s great. But if there is, it goes on to the next step of trying to break open the black box of the algorithm and understand what’s driving the model’s predictive quality, as well as what’s driving discrimination. And from there, people can start to make decisions about whether to include certain things in the model, or how to mitigate the problem. Once you have that idea, then you can start building alternative models that are actually less discriminatory. And that’s the big part of it: we ultimately have software that’s designed to search for and find fairer AI.
Jordan Wilson [00:05:46]:
Yeah. And I’ll say this: there’s probably no shortage of work for you all, because AI models are, in my personal experience, extremely biased. So, I mean, let’s start with maybe the why. Why are these models that we use, and without getting too technical, right? Most generative AI systems that we use, whether they’re ChatGPT or Google Bard or Midjourney, they’re all trained. Right? So they’re all trained.
Jordan Wilson [00:06:16]:
So why are these models even biased in the first place?
Nick Schmidt [00:06:20]:
So, unfortunately, there are so many reasons for this, and it can really happen at any point in the modeling process. It can happen with the data that you’re putting into it, which can reflect historical or present discrimination. It can happen in the modeling process, where you’re building a model that’s looking at, say, whether or not you’re likely to default on a loan, and the dataset you’ve got has discrimination built into the outcomes: people who got loans were discriminated against. And so once you take that algorithm and actually go out and put it into production, it can anticipate that someone is going to be discriminated against and discriminate against them in anticipation of that. And then finally, there’s actually using the algorithms.
Algorithm misuse can lead to discrimination
Nick Schmidt [00:07:14]:
There are ways that you can use them that are discriminatory; building an algorithm for one purpose and then using it for another is potentially highly discriminatory. And there was one example that I think is really good, of an algorithm that predicted health care costs. And it’s totally reasonable for an insurance company or someone else to want to understand what someone is likely to spend over the next, you know, year or two years or whatever it is. That way, they can have their financial models be appropriate. But some brilliant people figured out that, hey, we could use this algorithm that predicts future costs to predict health outcomes. And the problem is that after the Tuskegee experiment, which ran up to the 1970s, where African American men with syphilis were intentionally left untreated by doctors in the United States, when that came out in the public, African American visits to primary care physicians dropped by 26%. And that trend has continued.
Nick Schmidt [00:08:33]:
And what that means is that health care spending for African Americans is much lower than that of whites or nonblacks. And so, if you’re using prior spending to predict future health outcomes, you’re ultimately going to be really underestimating how sick black people are. And so you’re going to say that they’re not nearly as sick as they are and not give them the treatment they need.
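This is a case of label bias: the model’s target (spending) is a biased proxy for the thing you actually care about (health need). A minimal simulation, with made-up numbers loosely inspired by the figures Nick cites, shows the mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True underlying health need (higher = sicker), identical across groups.
need = rng.gamma(shape=2.0, scale=1.0, size=n)
group = rng.integers(0, 2, size=n)  # 0 = group A, 1 = group B

# Observed spending tracks need, but group B spends ~26% less at the
# same level of need (e.g. reduced access to, or trust in, care).
spend = need * np.where(group == 1, 0.74, 1.0) + rng.normal(0, 0.1, n)

# A model trained to predict spending will rank group B as "healthier"
# than group A at identical true need: flag the top 10% of spenders
# for extra care, and group B members must be sicker to get flagged.
cutoff = np.quantile(spend, 0.9)
flagged = spend >= cutoff
print(need[flagged & (group == 0)].mean())  # avg true need, flagged A
print(need[flagged & (group == 1)].mean())  # higher: flagged B are sicker
```

The point of the sketch is that nothing in the model is "about" the group at all; the disparity comes entirely from the choice of spending as the label.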
Jordan Wilson [00:09:03]:
You know, that’s a great example that you bring up, because it does seem like, and I’ve even kind of seen this firsthand on more widely available models, there is always discrimination against certain populations of people. Like, as an example: every single week, we put out a recap of AI news, and we use AI image generators. So if we’re asking for a tech CEO, it seems like it almost always gives you, you know, a mid-forties white guy. Right? So maybe, why is there always this bias, or these trends that keep showing up in different models against certain populations?
Nick Schmidt [00:09:48]:
You know, in that example, a lot of the problem is the input data. I mean, if you look at tech CEOs in the US right now, there are a lot of white men in their forties, most likely. And so when you train the model on that data, that’s what you’re going to get out. And what that means is, if you want to change those results, and the important thing is that by changing those results, you potentially change the future. You could make a more equitable future. To change those, you have to really understand what the algorithm’s doing and make interventions in it that will make a fairer output.
Jordan Wilson [00:10:35]:
Yeah. It’s a good point. I mean, how then can AI models find that right balance, right, of, you know, maybe showcasing cultural norms, but at the same time being inclusive, equitable, and truly showing the diversity that exists? You know? And maybe not just thinking of image generators. Right? I know that’s a very small use case. But even, you know, I think if you’re using a large language model to write content and you have it write you a story, I think you’ll probably see a lot of those same trends and story lines. So how do you find, or how can models find, that balance?
Nick Schmidt [00:11:16]:
So there’s actually a legal framework already that I think is really good for this. In the conversation, particularly the academic conversation, around fair AI, people are very binary about it. Either you shouldn’t have to do anything: the data’s the data, the model’s the model, just deal with it. Or you have to make everything completely equitable and fair. And while that would be nice, it’s oftentimes not realistic. And there’s a legal framework within the US that’s actually quite good for defining the boundaries.
Three-step burden-shifting process to address discrimination
Nick Schmidt [00:11:54]:
And the background of it is, it’s called the three-step burden-shifting process. And the idea of it is, as you say: is the algorithm causing discrimination? And if it’s intentional discrimination, then you have to do something about it. But if it’s unintentional discrimination, then you move on to a next step, which is: does the model have a valid business justification? If you are an artist using generative AI, you may have a valid justification. You know? If I’m trying to write a memo, I’m not a very good writer, so I have a valid justification for using generative AI to make my memos actually look good. But take a credit model, for example, which is perhaps a more realistic one. A bank has a justification for building a good credit model, and if they’re just giving out loans randomly, they’re gonna lose money or go bankrupt.
Nick Schmidt [00:12:59]:
So that’s the second stage. And then the third stage, and this really gets into what you were talking about, is that the company has the responsibility to see if it can generate a fair model that still meets their business needs. And so this breaks away from that dichotomy of "you shouldn’t have to do anything" versus "we have to completely throw out the algorithm or make everything entirely equitable," to just: can we do better, and how much can we do better? And I think that framework is really what people should try to apply.
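That third step, searching for a less discriminatory alternative that still meets the business need, can be sketched as a simple selection rule. Everything here is hypothetical (the candidate names, the accuracy floor, the disparity numbers); it is just the shape of the search, not any particular vendor’s method:

```python
def least_discriminatory(candidates, accuracy_floor):
    """Pick the fairest candidate model that still meets the business need.

    candidates: list of (name, accuracy, disparity) tuples, where
    disparity is some chosen unfairness metric (lower is fairer).
    """
    viable = [c for c in candidates if c[1] >= accuracy_floor]
    if not viable:
        raise ValueError("no candidate meets the business justification")
    return min(viable, key=lambda c: c[2])

# Hypothetical candidates from a model search.
candidates = [
    ("baseline",      0.82, 0.30),
    ("drop ZIP code", 0.81, 0.12),
    ("reweighted",    0.78, 0.05),  # fairest, but below the accuracy floor
]
print(least_discriminatory(candidates, accuracy_floor=0.80))
# → ('drop ZIP code', 0.81, 0.12)
```

The accuracy floor encodes the "valid business justification" from step two; the minimization over disparity encodes step three.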
Jordan Wilson [00:13:38]:
Yeah. Some great practical advice there, for sure. So, a quick question here, and I love this. Mabrit, thank you for your question. And if you do have other questions for Nick, make sure to get them in. So Mabrit is asking: I wonder if it only has bias because AI uses essentially everything from the Internet, right? And those sources have bias. And AI has a bias by making assumptions.
Jordan Wilson [00:14:02]:
Super curious how it works. So is that maybe why it is? Because these models are trained on essentially the entire existence of the Internet and more. Right? Is that why these biases play out in the end, because the Internet is an ugly, biased place?
Internet usage leads to biased data collection
Nick Schmidt [00:14:18]:
It is definitely part of it, and is probably the main driver of it. You could even take a step back and say it’s not so much the Internet’s fault as a piece of technology, but it’s our fault as human beings, for what we put on the Internet. And I think that actually is a really important point: we live in biased societies. So regardless of where you’re getting the data from, those biases are likely to get put into the data. So that’s probably the primary source of bias, but it’s not the whole story. You can also have biased treatment because of model design: not using the right model, not using a model that’s equally accurate on multiple groups. You can also have bias come in through the usage of the model, like the example I gave earlier. So it’s really important.
Nick Schmidt [00:15:17]:
While you should focus on the data going in, because that’s probably the main place, you really have to look through the entire pipeline to know where the problem’s coming from.
Jordan Wilson [00:15:29]:
You know, while we’re at it, Dr. Muthana is here with a comment, kind of a question as well, saying he’s looking forward to learning the best strategies and practices to keep AI bias and risk to a bare minimum. And this is a great point. Because, you know, what you all are working on, Nick, is big picture. Right? But for individual users, I mean, obviously it depends on the model that we use, or the large language model or whatever. But what are some maybe best practices for users? Or can users, you know, end users, individuals, do anything to help avoid this bias?
Nick Schmidt [00:16:08]:
Yeah. Absolutely. And I think that the first step is actually asking yourself if your model is fair for everyone. And what that means is: the data that goes into it, is it reasonable data? Is it predictive of the outcome? You know, are you using month of birth to predict whether or not you’re going to default on a loan? People do things like that all the time. They say, oh, the data is predictive, I’ll put it in. But that doesn’t make sense. And so, asking yourself: are the data that are going into it okay, or as good as you can get? Who are you excluding from the model? And are you putting in data that is sort of collectively punishing groups of people?
Nick Schmidt [00:17:01]:
So, for example, ZIP code data. Because there’s so much segregation in the United States, if you put information about where someone lives into a model, you’re going to get biased results. So asking those questions, and building a good model, is the first step. Once you’ve got an idea that you’ve got fairness for everyone, then you can start to think about fairness for a given group.
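The ZIP code point is the classic proxy problem: removing the protected attribute doesn’t help if a correlated feature stays in. A tiny simulation, with an assumed 90% group-to-neighborhood correlation (a made-up number standing in for residential segregation), makes it concrete:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Residential segregation: group membership strongly predicts which
# ZIP cluster you live in (90% of the time, in this toy setup).
group = rng.integers(0, 2, size=n)
zip_cluster = np.where(rng.random(n) < 0.9, group, 1 - group)

# A "group-blind" decision rule that looks only at ZIP cluster
# (a stand-in for a model using ZIP-level features) still recovers
# group membership almost perfectly.
approval = zip_cluster == 0

print(approval[group == 0].mean())  # close to 0.9
print(approval[group == 1].mean())  # close to 0.1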
AI bias, accessibility, and user control insights
Jordan Wilson [00:17:30]:
Yeah. And I think that’s a good point, because there are obviously different use cases for AI, and, you know, bias in AI goes back many decades. Right? Like, AI is not new. But for the average person, generative AI is very new. Right? The accessibility, the affordability, and the use cases are all kind of swelling up together at the same point. But maybe a little follow-up on that, more specifically. Let’s just say, for the average person, right, if they go into ChatGPT, if they go into Midjourney, they don’t necessarily have any control over how the model was trained. But are there any best practices or strategies to avoid bias for just, you know, the everyday person using the most popular large language models? Is there anything we can do to avoid that, or do we have to literally spell it out in the world’s longest prompt, like, hey,
Jordan Wilson [00:18:27]:
I do not want a, b, c, through a hundred.
Nick Schmidt [00:18:31]:
So I think, in the use case of, you know, I’m going in and trying to write a memo that’s not offensive and not, you know, evil, or to get other kinds of information, the thing you have to rely on, which a lot of machine learning researchers and AI people don’t realize, is that you’re still smarter than your algorithm. You know more, and you generalize better. And so if you are getting stuff out of a model, getting stuff out of generative AI, use a critical eye and ask yourself. You know what the world looks like. You know a lot about bias. I mean, we live through this every day.
Nick Schmidt [00:19:19]:
We see this. If you’re a critical thinker, you probably know that. And so, in a simple use case like just someone using ChatGPT, it’s really about not just trusting the output. When you get into the question of building the model or deploying the model, that’s where you can start to really get into kind of the quantitative solutions.
Jordan Wilson [00:19:46]:
You know, I’m curious, Nick. And if you’re just joining us midway through, thank you. Welcome. We have Nick Schmidt, the founder and CTO from Solas AI, talking biases in models. My question is, maybe we kinda skip to the end, but let’s also talk about where things are going, because it seems like model development, large language models, generative AI is being developed at a breakneck pace. Right? Like, even early on, there were, you know, thousands of CEOs and very recognizable people that signed a letter, and everyone said pause AI development for six months. And instead, we tripled down. Right? You know, big news.
Jordan Wilson [00:20:31]:
OpenAI just released their newest update to their model, GPT-4 Turbo. It’s supposed to be, you know, better, faster, bigger, better at decision making, all these things. So if the big companies are just developing faster and faster, and, I mean, I’m sure they’re doing things to address their own model biases, but how can you actually fix this? Right? When the companies probably have so much pressure and feel they just have to get out more products, bigger models, more parameters, how can you, you know, possibly fix these when everyone’s just sprinting with their head down?
Nick Schmidt [00:21:10]:
Well, to make a shameless marketing plug: you know, buy or license Solas AI software. More seriously, I mean, that’s true, but more seriously, what I think is really important is that we’ve designed software, and we’re certainly not the only company out there that has designed software, that can be used quickly to mitigate these risks. And so I think that, certainly in the past, mitigating these risks was very difficult and time consuming. We’re actually using AI to fix AI, and it’s much faster as a result. And so you can still innovate and make fairer models. And I would say that you can actually make better models if you’re innovating and considering fairness.
Jordan Wilson [00:22:09]:
So that was actually on my mind, Nick, so I’m glad you brought that up. Right? It’s kind of ironic, I guess, that you’re using AI to fix AI. I guess my question is, why aren’t the big companies doing this? Would it slow their process down? Would it slow their development down? You know, why aren’t they, you know, maybe creating a bigger internal focus on fixing this problem before it ever goes out in the wild? Because once it does go out, it takes companies like yours many months or multiple years to really start to even make a dent. Right? Because everyone’s using these models. So maybe, why aren’t these big companies just fixing the problem internally first?
Algorithm fairness through regulations
Nick Schmidt [00:22:59]:
So I think that they’re trying, often. But there’s also a big split between companies that are, and have been, doing something for a very long time and ones that are just getting started. So, actually, in the financial sector, the big banks have been working on this, and they’ve been working with us, in my consulting role, for 25 years, trying to make sure that algorithms are fair. And so there really is already an established practice within financial services and in employment. The tech companies kinda came in and said, we’re smarter than you guys. We’ll figure it out on our own. And, you know, it’s kinda worked out how it’s worked out. And I think that what’s going to happen is there’s going to be a lot of regulation that is going to actually move companies more and more in the direction of what financial services companies are doing.
Nick Schmidt [00:24:05]:
And so, yes, this is a problem. And why isn’t anything being done about it? Well, a little bit is, but they don’t realize that there’s already a good solution out there. And I think that’ll change.
Jordan Wilson [00:24:20]:
Yeah. So, another great question from our audience, and thank you, everyone, for getting these questions in. So Monica is asking: can model outputs change over time, with users asking for different or more diverse outputs? What a great question. Nick, is that a thing? Can the outputs change?
Nick Schmidt [00:24:38]:
So, if I understand the question correctly, it depends on the type of model. Most of the work I do is in financial services and health care and other industries where the models are not dynamic. So they’re not continually updating. You build a model, you put it into production for a while, and then you rebuild it later on. And that kind of breaks that process of reinforced feedback on an immediate scale. So in the work that I do, no, it generally doesn’t happen. But with some of the large language models, my understanding is that there is a lot of reinforcement and back and forth that happens.
Jordan Wilson [00:25:27]:
Yeah. That’s a great point. And, also, you know, not to put Nick on the spot on all these things. I think most of us use the ChatGPTs of the world and the Anthropic Claudes and, you know, Midjourney, and that’s not necessarily where you focus all of your time, because you’re working on helping model fairness across the board and not just solely large language models. But it’s a good question that Monica brings up, because I’m sure users en masse are, you know, having that conversation with ChatGPT or with Google Bard, saying, like, no, this is biased, please reflect a more diverse output. And I believe over time that it would help them train their models to make them more diverse and less biased.
Algorithmic decisioning and human biases
Nick Schmidt [00:26:16]:
Yeah. Although there’s actually a different way to frame the question that Monica has, that I think is important and really good, and it also just shows the real benefit of moving to algorithmic decisioning. Which is that when you start modeling, you’re usually starting with modeling a human process. So when companies first start using underwriting models, they typically will model what their human underwriters have done. And if there are biases in the decisions that human underwriters make, then those biases immediately get transferred into the algorithm. But what can happen is, over time, because the algorithm is not intrinsically biased, it can see that those biases are not predictive. You know, it’s giving a loan to an African American, and they’re repaying it even though it, you know, assigned a higher probability of default to them or whatever. So over time, what can happen is that those biases can actually be pulled out of the system.
Nick Schmidt [00:27:24]:
And so, in that way, over time, the use of an algorithm can make things fairer.
How to address biases in AI models?
Jordan Wilson [00:27:32]:
So, Nick, we’ve talked about a lot. We’ve talked a little bit about how models work. We’ve talked about some ongoing issues with biases and stereotypes that show up over and over in these models. So maybe as we wrap up today’s episode, let’s just go big picture here. Like, if you had to say it in a very, you know, direct way: how can we fix biased AI? Because we’ve attacked it from all different angles, but what’s the takeaway? Aside from, yeah, like, you know, we can use Solas AI or products like this. But aside from that, how can we fix biased AI?
Nick Schmidt [00:28:16]:
Make a start on it. Don’t get overwhelmed with all of the options. Think about what your model is doing, make a decision, and test it. See whether or not it’s biased. Choose a metric and go with it, and then try to fix it. And that’s a first step. And once you have that process in place, you can start refining it.
Nick Schmidt [00:28:49]:
The point is: don’t let the inability to completely define the problem keep you from doing something about the problem.
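"Choose a metric and go with it" can start with something as simple as the adverse impact ratio, the selection-rate comparison behind the EEOC’s four-fifths rule. A minimal, framework-free sketch (the data and group labels are hypothetical):

```python
def adverse_impact_ratio(outcomes, groups, favored="A"):
    """Selection rate of each group divided by the favored group's rate.

    The EEOC's four-fifths rule treats a ratio below 0.8 as evidence
    of potential adverse impact.
    """
    rates = {}
    for g in sorted(set(groups)):
        selected = [o for o, gg in zip(outcomes, groups) if gg == g]
        rates[g] = sum(selected) / len(selected)
    return {g: rate / rates[favored] for g, rate in rates.items()}

# Toy data: 1 = approved, 0 = denied.
outcomes = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
groups   = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
print(adverse_impact_ratio(outcomes, groups))  # {'A': 1.0, 'B': 0.25}
```

Here group B’s approval rate (20%) is a quarter of group A’s (80%), far below the 0.8 threshold, so this toy model would warrant the kind of investigation and remediation Nick describes.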
Jordan Wilson [00:29:01]:
Such great advice. And I think, hey, even for our listeners and our viewers, even if you’re not working on a model, what we talked about today is so important, because it does take users. It takes consumers, the people paying, you know, to continue to have this conversation, to let the big tech companies know: yes, this is a problem. Yes, this is something that we care about. We care about fairness in our models.
Jordan Wilson [00:29:29]:
And thank you, Nick, so much for coming on the show and helping us dive into this issue. We very much appreciate your time.
Nick Schmidt [00:29:37]:
Thank you very much, Jordan.
Jordan Wilson [00:29:38]:
And, hey, as a reminder, we did cover a lot, as we always do. So make sure you go to youreverydayai.com and sign up for the free daily newsletter. We’re gonna have a lot more information from Nick and what we talked about today, as well as just about everything else that’s always going on in the generative AI world and more. So make sure you do that, and make sure you join us tomorrow and every day for more Everyday AI. Thanks, y’all.