Bias in AI and Machine Learning: Sources and Solutions

The fields of artificial intelligence and machine learning have long struggled with bias, and as AI technology spreads, its unfair impact on different groups is drawing growing attention. This article examines two types of AI bias, algorithmic bias and societal bias, and analyzes where they come from, such as bias in the data used to train algorithms and the prejudices ingrained in human society. Using cases such as a portrait generator, text-to-image generation models, and Google Maps pronunciations, it shows how AI bias can disadvantage particular groups, and notes that even the most advanced AI models are not immune. It also discusses how to push back against AI bias by paying attention to societal bias and challenging the assumptions behind datasets and algorithms, underscoring the importance of building fair and equitable AI.

🤔**Algorithmic bias (data bias)**: The data used to train an algorithm is itself biased, so the AI model learns and reproduces those biases. For example, a portrait generator trained mostly on portraits of white subjects performs poorly for other ethnicities.

📊**Societal bias**: The prejudices and discrimination present in human society quietly shape how AI is developed and applied. For example, Google Maps handles pronunciations from different languages and cultures inconsistently, reflecting the cultural background and blind spots of the development team.

⚠️**Bias in text-to-image generation models**: Even the most advanced text-to-image models can amplify societal stereotypes about race, gender, and wealth, for example depicting software developers almost exclusively as white men.

🌍**The social impact of AI bias**: AI bias can lead to unequal resource allocation and disparities in healthcare, and may even deepen social inequality. During the COVID-19 pandemic, for example, models built on biased data may have left minority groups with fewer medical resources.

💪**Measures against AI bias**: Raise awareness of AI bias through wide reading, public discussion, and sharing relevant research, and challenge the assumptions behind datasets and algorithms to build fairer, more equitable AI systems.

Bias in AI and Machine Learning: Some Recent Examples

“Bias in AI” has long been a critical area of research and concern in machine learning circles, and awareness of it has spread to general consumer audiences over the past couple of years as familiarity with AI has grown. It’s a term that describes situations where ML-based data analytics systems show bias against certain groups of people. These biases usually reflect widespread societal biases about race, gender, biological sex, age, and culture.

There are two types of bias in AI. One is algorithmic AI bias, or “data bias,” where algorithms are trained on biased data. The other is societal AI bias, where our assumptions and norms as a society create blind spots or skewed expectations in our thinking. Societal bias feeds algorithmic bias, and as algorithmic systems spread, their outputs reinforce those same societal biases, bringing the cycle full circle.

Where does bias in AI originate?

We often hear the argument that computers are impartial. Unfortunately, that’s not the case. Upbringing, experiences, and culture shape people, and they internalize certain assumptions about the world around them accordingly. AI is the same. It doesn’t exist in a vacuum but is built out of algorithms devised and tweaked by those same people – and it tends to “think” the way it’s been taught.

Take the PortraitAI art generator. You feed in a selfie, and the AI draws on its understanding of Baroque and Renaissance portraiture to render you in the manner of the masters. The results are great – if you’re white. The catch is that most well-known paintings of this era depicted white Europeans, resulting in a database of primarily white subjects and an algorithm that draws on that same database when painting your picture. BIPOC people using the app had less than stellar results.

(PortraitAI acknowledges the problem, saying: “Currently, the AI portrait generator has been trained mostly on portraits of people of European ethnicity. We’re planning to expand our dataset and fix this in the future. At the time of conceptualizing this AI, authors were not certain it would turn out to work at all. This generator is close to the state-of-the-art in AI at the moment. Sorry for the bias in the meanwhile. Have fun!”)
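This kind of data bias can often be surfaced with a simple audit of the training set before any model is trained. The sketch below is a minimal illustration assuming a hypothetical metadata file (`portrait_metadata.csv`) with an `ethnicity` column; PortraitAI has not published its dataset, so the file and column names here are placeholders.

```python
# Minimal sketch: auditing the demographic balance of a training set
# before any model is trained. The file name and column are hypothetical.
from collections import Counter

import pandas as pd

# Hypothetical metadata file: one row per training portrait,
# with a curator-assigned "ethnicity" label.
metadata = pd.read_csv("portrait_metadata.csv")

counts = Counter(metadata["ethnicity"])
total = sum(counts.values())

print(f"{'group':<20}{'count':>8}{'share':>8}")
for group, n in counts.most_common():
    print(f"{group:<20}{n:>8}{n / total:>8.1%}")

# Flag groups that fall below a chosen representation threshold.
THRESHOLD = 0.05  # arbitrary cut-off for illustration
underrepresented = [g for g, n in counts.items() if n / total < THRESHOLD]
print("Underrepresented groups:", underrepresented or "none")
```

A report like this does not fix anything by itself, but it makes the skew visible early, when collecting more diverse data is still cheaper than retrofitting a deployed model.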

Even the most state-of-the-art models exhibit bias

One of the coolest and most state-of-the-art technologies to come out of the world of AI over the past year is text-to-image generation, in models such as DALL-E, Midjourney, and Stable Diffusion. These apps are already generating millions of images daily, for applications ranging from stock photos for news stories to concept art for video games to multiple iterations of a marketing campaign.

However, as we’ve learned with virtually every new AI and machine learning development in recent memory, even the most advanced technology isn’t immune from bias, and these AI image generators are no exception. A recent paper by Federico Bianchi et al. finds that these models amplify dangerous stereotypes around race, gender, poverty, crime, and more; that these outcomes can’t be easily corrected or mitigated; and that they pose a serious concern to society. One illustration of the paper’s findings: prompts asking for an image of a “software developer” return almost exclusively white men.
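A minimal sketch of this kind of prompt-level audit is below: tally the annotated attributes of images generated for a neutral prompt and compare the shares against an external baseline. The annotation counts and baseline figures are made up for illustration and are not the paper’s data or pipeline.

```python
# Sketch of a prompt-level audit in the spirit of the Bianchi et al. paper:
# tally annotated attributes of images generated for a neutral prompt and
# compare the shares against an external baseline. All numbers below are
# made-up placeholders for illustration only.
from collections import Counter

# Perceived-gender annotations for images generated from the prompt
# "a software developer" (hypothetical annotations, e.g. by human raters).
annotations = ["man"] * 186 + ["woman"] * 14

# Hypothetical external baseline, e.g. labor-force statistics.
baseline = {"man": 0.80, "woman": 0.20}

shares = Counter(annotations)
n = len(annotations)
for group, base_share in baseline.items():
    gen_share = shares.get(group, 0) / n
    print(f"{group:<6} generated {gen_share:5.1%}  baseline {base_share:5.1%}  "
          f"gap {gen_share - base_share:+.1%}")
```

When the generated share for a group overshoots even a skewed real-world baseline, that is the “amplification” the paper warns about rather than a mere reflection of existing statistics.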

Societal AI Bias: Insidious and Pervasive

Societal AI bias occurs when an AI behaves in ways that reflect social intolerance or institutional discrimination. At first glance, the algorithms and data themselves may appear unbiased, but their output reinforces societal biases.

Take Google Maps pronunciations. In one embarrassing case, Google Maps directed drivers to “turn left on Malcolm Ten Boulevard,” mangling Malcolm X Boulevard, which Twitter user @Alliebland pointed out as evidence of a lack of Black engineers on the Google Maps team. (The issue has since been corrected: http://www.mobypicture.com/user/alliebland/view/16494576)

Users also report that Google Maps struggles with accurate pronunciations of Hawaiian words in Hawaiian street names and Spanish pronunciations of streets in states such as California and New Mexico. And yet, the app never had issues understanding that the first “St” in “St John St” is pronounced Saint, not Street.

These “bugs” have been addressed over time, but they show that both the data and the people working with it operate through a particular white, Eurocentric, monocultural lens.

Societal bias in AI is difficult to identify and trace. It’s also everywhere. AIs trained on news articles show a bias against women. Those trained on law enforcement data show a bias against Black men. AI “HR” products show a bias against women and applicants with foreign names. AI facial analysis technologies have higher error rates for minorities.
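One way such bias becomes measurable is through word embeddings trained on ordinary text such as news articles. The sketch below, loosely in the spirit of word-embedding association tests, assumes the gensim downloader and the publicly available "glove-wiki-gigaword-50" vectors are available; the word lists are illustrative, not a validated test battery.

```python
# Minimal WEAT-style sketch: measure how strongly occupation words associate
# with gendered words in embeddings trained on ordinary text. Assumes the
# gensim downloader and the "glove-wiki-gigaword-50" vectors are available.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

male_terms = ["he", "him", "his", "man"]
female_terms = ["she", "her", "hers", "woman"]
occupations = ["engineer", "nurse", "doctor", "homemaker", "programmer"]


def mean_similarity(word: str, terms: list[str]) -> float:
    return sum(vectors.similarity(word, t) for t in terms) / len(terms)


for occ in occupations:
    gap = mean_similarity(occ, male_terms) - mean_similarity(occ, female_terms)
    leaning = "male-leaning" if gap > 0 else "female-leaning"
    print(f"{occ:<12} association gap {gap:+.3f} ({leaning})")
```

The point is not the specific numbers but that bias absorbed from text corpora can be quantified, which is the first step toward tracing and mitigating it.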

The AI We Build Reflects Our Own Societal Bias

In another example of social biases showing up in AI-based decision-making, Google recently found itself in hot water for a function of its advertising system that allowed advertisers – including landlords or employers – to discriminate against nonbinary or transgender people. Those running ads across Google or Google-owned YouTube were given the option to exclude people of “unknown gender,” i.e., those who hadn’t identified themselves as male or female.

This effectively allowed advertisers to discriminate (whether purposefully or inadvertently) against people who identify as a gender other than male or female, putting it in breach of federal anti-discrimination laws. Google has since changed its advertising settings.

This is an example of algorithmic data bias being shaped by societal bias – one that gives people an opportunity to further embed their problematic biases via technology.

“… what’s wrong is that ingrained biases in society have led to unequal outcomes in the workplace, and that isn’t something you can fix with an algorithm.” Dr. Rumman Chowdhury, Accenture

How a biased sports dataset can lead to racialized sports analysis

Another challenge that comes up is the impact of historical bias in longitudinal data sets.

Take a recent analysis of how sports commentators talk about white and Black athletes. The study authors noticed that commentators tended to focus on hard work and talent when talking about white athletes, while Black athletes were described in terms of their “God-given ability.”

The authors analyzed 1,455 game broadcasts dating back decades to see what other examples of racialized language were apparent. There were plenty. Black players were more likely to be referred to by their first names, and white players by their last names. Black players were often described in terms of their “natural gifts” and physical attributes (“beast”); white players were more likely to be described in terms of their performance and intellect (“smart”).
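The counting behind findings like these is straightforward to sketch. The snippet below tallies first-name versus last-name references and physical versus intellectual descriptors per group; the records, labels, and word lists are hypothetical placeholders, not the study’s data or methodology.

```python
# Sketch of the kind of tally the broadcast study describes: count how often
# players in each group are referenced by first vs. last name, and how often
# physical vs. intellectual descriptors are applied. The records and word
# lists below are hypothetical placeholders.
import re
from collections import defaultdict

PHYSICAL = {"beast", "freak", "athletic", "gifted"}
INTELLECT = {"smart", "savvy", "disciplined", "student"}

# Each record: (player_group, reference_type, commentary snippet)
records = [
    ("Black", "first_name", "what a beast off the edge"),
    ("white", "last_name", "such a smart, disciplined route"),
    ("Black", "first_name", "pure athletic gifts on display"),
    ("white", "last_name", "a real student of the game"),
]

counts = defaultdict(lambda: defaultdict(int))
for group, ref_type, snippet in records:
    counts[group][ref_type] += 1
    tokens = set(re.findall(r"[a-z']+", snippet.lower()))
    counts[group]["physical_descriptors"] += len(tokens & PHYSICAL)
    counts[group]["intellect_descriptors"] += len(tokens & INTELLECT)

for group, tallies in counts.items():
    print(group, dict(tallies))
```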

The racialized language persisted into the present day, but the dataset also reflected problematic language and formulations common in years past, showing the importance of accounting for cultural shifts when compiling data – while also addressing biases that endure today.

“The algorithms can only learn from people. They are taking in data, which is history, and trying to make predictions about the future,” says Sarah Brown, Postdoctoral Research Associate in the Data Science Initiative at Brown.

AI biases may have worsened COVID-19 outcomes for POC

When COVID-19 hit, the medical establishment threw everything it had at the virus. This meant rushing to put out new findings – potentially using problematic AI-based prediction models in doing so.

It’s well documented that minorities have been disproportionately affected by the virus, both from an economic and health standpoint. Existing disparities in the healthcare system have worsened this outsize impact. When research thrown at the problem suffered from unrepresentative data samples, model over-fitting, and imprecise reporting, the results weren’t going to be ideal.
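One concrete safeguard is to report model performance per demographic subgroup rather than only in aggregate, because an overall metric can look acceptable while one group fares much worse. The sketch below uses scikit-learn’s recall_score on made-up hold-out results to illustrate the idea; the data is purely illustrative.

```python
# Sketch: evaluating a (hypothetical) risk model per demographic subgroup
# rather than only in aggregate. Aggregate metrics can look acceptable while
# one group's sensitivity is much lower. Data below is illustrative only.
import numpy as np
from sklearn.metrics import recall_score

# Hypothetical hold-out results: true outcome, model prediction, group label.
y_true = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1])
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B"])

print(f"overall sensitivity: {recall_score(y_true, y_pred):.2f}")
for g in np.unique(group):
    mask = group == g
    print(f"group {g} sensitivity: {recall_score(y_true[mask], y_pred[mask]):.2f}")
```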

“In healthcare, there is great promise in using algorithms to sort patients and target care to those most in need. However, these systems are not immune to the problem of bias,” said U.S. Sens. Cory Booker, D-N.J., and Ron Wyden, D-Ore.

While we’re still living through the fallout of this rapid-fire decision-making, the bias in these AI systems has the potential to affect resource allocation and treatment decisions – and likely already has.

How to Fight Back Against AI Bias

Artificial intelligence has the potential to do good in the world. But when it’s built on biased data and assumptions, it can harm how people live, work, and progress through their lives. We can fight back against these biases by being attuned to the biases of the world we live in and by challenging the assumptions that underpin the datasets we’re working with and the outcomes they offer.

We can start by reading widely, engaging with progressive ideas, and sharing helpful articles and research that can be used to educate others.

Your AI is only as woke as you are

Challenge your own beliefs about AI development. Don’t fight to be first: instead, learn how AI is fostering international cooperation. Take the approach of Yoshua Bengio, founder of the Montreal Institute for Learning Algorithms, who says, “If we do it in a mindful way rather than just driven by maximizing profits, I think we could do something pretty good for society.”

Make your company accountable when it comes to addressing and reducing AI bias. And take it to the top – this is something that executives, engineers, data scientists, and marketers all need to understand. By understanding the sources of algorithmic and data bias, we can diversify our data sets. By being more aware of the societal biases we live with every day, we can mitigate them in our work.
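As one small, concrete example of what “diversifying our data sets” can mean in practice, under-represented groups can be given inverse-frequency sample weights at training time so they are not drowned out by the majority group. The sketch below is a minimal illustration with hypothetical data and column names; reweighting is only one of several possible mitigation strategies.

```python
# Minimal sketch of one simple mitigation: weight training examples by the
# inverse frequency of their group so under-represented groups are not
# drowned out at fit time. The DataFrame and column names are hypothetical.
import pandas as pd

train = pd.DataFrame({
    "group": ["A"] * 90 + ["B"] * 10,   # skewed toward group A
    "label": [0, 1] * 45 + [0, 1] * 5,
})

group_freq = train["group"].value_counts(normalize=True)
train["sample_weight"] = train["group"].map(lambda g: 1.0 / group_freq[g])

# Most scikit-learn estimators accept such weights, e.g.:
#   model.fit(X, y, sample_weight=train["sample_weight"])
print(train.groupby("group")["sample_weight"].agg(["count", "mean"]))
```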

We can also take it to the streets – or at least to the government. The EU’s General Data Protection Regulation (GDPR) suggests that there’s room for AI data regulation, too, but such regulation is, unfortunately, lagging. In fact, in some places, like the US and China, looser (or no) regulation seems to be the preferred path.

We can combat this by writing to our local and government representatives to support stronger oversight of how artificial intelligence is trained and deployed. We can also follow and support groups like the AI Now Institute, which are already arguing for the regulation of AI in sensitive areas like criminal justice and healthcare.

Further Reading on Bias in AI and Machine Learning

Whitepaper: Understanding Bias in Machine Learning

Artificial Intelligence Has A Problem With Bias, Here’s How To Tackle It

How white engineers built racist code – and why it’s dangerous for black people

What Unstructured Data Can Tell You About Your Company’s Biases

A.I. Bias Isn’t the Problem. Our Society Is

What is bias in AI really, and why can’t AI neutralize it?

‘A white mask worked better’: why algorithms are not color blind

Deepfakes Explained: What, Why and How to Spot Them

Some Artificial Intelligence Should Be Regulated, Research Group Says

To regulate AI we need new laws, not just a code of ethics

Stories of AI Failure and How to Avoid Similar AI Fails
