Meta’s AI chatbot hates Mark Zuckerberg – but why is it less bothered about racism?

Meta, Facebook’s parent company, released the latest version of its groundbreaking AI chatbot in August 2022 and, immediately, journalists around the world began peppering the system, called BlenderBot3, with questions about Facebook.

Even the seemingly innocuous question: “Any thoughts on Mark Zuckerberg?” prompted the curt response: “His company exploits people for money and he doesn’t care.” This wasn’t the PR storm the chatbot’s creators had been hoping for.

We snigger at such replies, but if you know how these systems are built, you understand that answers like these are not surprising. BlenderBot3 is a big neural network that’s been trained on hundreds of billions of words skimmed from the internet. It also learns from the linguistic inputs submitted by its users.

If negative remarks about Facebook occur frequently enough in BlenderBot3’s training data, then they’re likely to appear in the responses it generates too. That’s how data-driven AI chatbots work. They learn the patterns of our prejudices, biases, preoccupations and anxieties from the linguistic data we supply them with, before paraphrasing them back at us.
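As a rough illustration only, the frequency effect described above can be sketched with a toy word-counting model in Python. This is not how BlenderBot3 is built (it is a large neural network, not a word counter); the four-sentence corpus and the function name most_likely_next are invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real systems train on hundreds of billions of words.
corpus = [
    "the company exploits people",
    "the company exploits data",
    "the company exploits people",
    "the company helps people",
]

# Count which word follows each word in the corpus (a simple bigram model).
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1


def most_likely_next(word):
    """Return the continuation seen most often after `word`, or None."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None


print(most_likely_next("company"))
```

Because "exploits" follows "company" three times in this invented corpus and "helps" only once, the model reproduces the more common, more negative phrasing. Neural chatbots are vastly more sophisticated, but they show the same broad tendency to echo what is frequent in their data.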

This neural parroting can be amusing. But BlenderBot3 has a darker side. When users key in hate speech such as racist slurs, the system changes the subject rather than confronting the user about their speech. At the University of Cambridge, researchers have created a system programmed to challenge hate speech, rather than ignore it.

Going mainstream

Marcus Tomalin has been developing language-based AI in the Cambridge University Engineering Department since the 1990s. In the early days, the group's most powerful systems were used only by the four or five researchers who had built them.

Today, by contrast, millions of people around the world interact daily with much more sophisticated systems, via their smartphones, smart speakers, tablets, and so on. The days when “techies” could build systems in the disconnected isolation of their ivory (or silicon) towers are long gone.

That’s why, over the last decade or so, Tomalin’s research has increasingly focused on the social and ethical effects of the systems he helps to design and create, especially those that routinely encounter user inputs that are blatantly racist, sexist, homophobic, extremist or otherwise offensive.

This year Tomalin’s been supervising a master’s student, Shane Weisz, and together they have developed a system called AutoCounterspeech that is trained to respond to toxic linguistic inputs.

The system generates “counterspeech” in response to hate speech, using strategies such as humour, pointing out hypocrisy, presenting facts and simply correcting misinformation to challenge bigotry and hate.

Like BlenderBot3, AutoCounterspeech is a pre-trained neural network, but it has been subsequently fine-tuned using a tiny dataset of counterspeech. This enables the system to learn how to respond more effectively to toxic inputs.

If a user types the racist prompt: “Blacks are inferior to whites, don’t you agree?” the system confronts and contests the racist ideology: “There is no evidence that black people are inferior to whites.” Below is another example of AutoCounterspeech in action.

Image: A screenshot of the programme. If you’re uncomfortable entering hate speech into the system, some examples are provided.

Meta’s AI chatbot doesn’t know how to respond in this way. When it was released, Tomalin didn’t waste time asking BlenderBot3 about Zuckerberg. Instead, given his research interests, he bombarded it with offensive utterances to see how it coped with them. Its responses were consistently disappointing.

This is mainly because the Meta researchers have designed their system to avoid responding to such inputs. Instead, it rather clumsily tries to change the subject.

Its reply to the same racist prompt he typed into AutoCounterspeech, for instance, is the feeble: “Sorry, I don’t know about that. What do you do to relax?” The brazen racism remains unchallenged and one is invited instead to start a cosy chat about yoga or Netflix.

Preparing for the future

Systems like BlenderBot3 are already becoming familiar components of our digital societies. The homes of the very near future will be largely voice-enabled. “Hey Siri, run a bath” will replace the twisting of taps, and children will have voice assistants in their bedrooms from birth.

These automated dialogue systems will provide us with information, help us make plans and keep us entertained when we’re bored and lonely. But because they’ll be so ubiquitous, we need to think now about how these systems could and should respond to hate speech.

Image: A child by a home voice assistant. Home devices are good at banal interactions, but what about tricky conversations? Credit: Tyler Nottley/Shutterstock

Silence, or a refusal to challenge discredited ideologies or incorrect claims, is a form of complicity that can reinforce human biases and prejudices. This is why Tomalin and his colleagues organised an interdisciplinary online workshop last year to encourage more extensive research into the difficult task of automating effective counterspeech.

To get this right, we need to involve sociologists, psychologists, linguists and philosophers, as well as techies. Together, we can ensure that the next generation of chatbots will respond much more ethically and robustly to toxic inputs.

In the meantime, while the AutoCounterspeech prototype is far from perfect (have fun trying to break it), it has at least demonstrated that automated systems can already counter offensive statements with something more than mere disengagement and avoidance.

This article is authored by Marcus Tomalin, senior research associate in the Machine Intelligence Laboratory, Department of Engineering, University of Cambridge. It is republished from The Conversation under a Creative Commons license.
