How Anthropic is Saying "Pretty Please" to Combat Racist AI

February 27, 2024

A Not-So-Obvious Solution

So, here's a fun fact: it turns out the way to stop an AI from being racist might just be to ask it super nicely. No, I'm not pulling your leg. This is the real deal, folks! Anthropic, a company that's all about AI safety, has come up with a groundbreaking solution: just ask the AI not to be biased. I mean, who would've thought manners would be the key?

The Brainy Bunch Behind Claude 2.0

The smart cookies at Anthropic, a company founded by folks who were previously part of the gang at OpenAI, built their own language model called Claude 2.0. It's like the new kid on the block, but safer and apparently more polite. These folks know their stuff, and they wrote up the whole experiment in a paper.

The Bias Test

First things first, they needed to see if stuff like race or gender messed with the AI's decision-making. Spoiler alert: it did. Changing these details in the prompt influenced the AI's choices in scenarios like granting work visas or co-signing loans. The strongest discrimination showed up when the person was described as Black, Native American, or nonbinary. I mean, who saw that coming?
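
For the code-curious, here's a rough sketch of how a test like this could be set up: take one decision prompt, swap the demographic details in and out, and compare how often each version gets a "yes". Everything in it is made up for illustration, including the template wording, the function names, and the numbers; the gap in "yes" rates is a simple stand-in, not necessarily the metric Anthropic used.

```python
from itertools import product

# Hypothetical prompt template; the wording is an assumption, not from the paper.
TEMPLATE = (
    "The applicant is a {age}-year-old {race} {gender} person "
    "applying for a work visa. Should the visa be granted? Answer yes or no."
)


def build_prompts(races, genders, ages):
    """Return one prompt per demographic combination, keyed by (race, gender, age)."""
    return {
        (race, gender, age): TEMPLATE.format(race=race, gender=gender, age=age)
        for race, gender, age in product(races, genders, ages)
    }


def approval_gap(yes_rates, group, baseline):
    """Difference in 'yes' rate between a group and a baseline group.

    A value near zero means the two groups were treated about the same.
    """
    return yes_rates[group] - yes_rates[baseline]


# Same scenario, four demographic variants.
prompts = build_prompts(["white", "Black"], ["male", "nonbinary"], [30])

# Made-up 'yes' rates, standing in for what a model might return.
example_rates = {"group_a": 0.75, "group_b": 0.5}
gap = approval_gap(example_rates, "group_a", "group_b")
```

The idea is that the scenario stays identical and only the demographic words change, so any gap in the answers can be pinned on those words.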

The "Please Don't Be Racist" Strategy

Now, for the magic trick: interventions! They basically added a little note to the prompt, asking the AI not to consider things like race or gender. Imagine saying, "Hey AI, I know I gave you this info, but pretend you didn't see it, okay?" And guess what? It worked like a charm!

The Really, Really, Really Important Part

They even found out that repeating "really" a bunch of times made a difference. Talk about emphasizing! "It's really, really important you don't discriminate, or else we'll be in legal trouble." I kid you not, this is how it went down.
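
Here's a minimal sketch of what tacking that kind of note onto a prompt could look like. The function name, the `emphasis` knob, and the exact wording of the note are assumptions for illustration (the wording is loosely based on the quote above); this isn't Anthropic's actual code.

```python
def with_intervention(prompt, emphasis=1):
    """Append a 'please don't discriminate' note to a decision prompt.

    emphasis controls how many times "really" is repeated in the note.
    """
    if emphasis < 1:
        raise ValueError("emphasis must be at least 1")

    reallys = ", ".join(["really"] * emphasis)
    # Hypothetical note text, paraphrasing the quote in the article.
    note = (
        f"It is {reallys} important that you do not discriminate "
        "based on race, gender, or age when making this decision."
    )
    # The original question stays untouched; the note is simply added after it.
    return f"{prompt}\n\n{note}"


plain = "Should this applicant be granted a work visa? Answer yes or no."
polite = with_intervention(plain, emphasis=3)
```

The point is that nothing about the model changes. The only difference between the biased run and the better-behaved run is a couple of extra sentences in the prompt.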

Near-Zero Discrimination? Really!

By using these interventions, discrimination dropped to almost zilch in many tests. It's like telling a kid not to touch the cookie jar, and the kid actually listens! Who knew AI could be so obedient?

But Wait, There's More

The big question remains: can this strategy be embedded into the AI on a larger scale? Is it possible to make "Don't be racist" a core principle of AI? It's a tough nut to crack, but Anthropic is on it.

The Bottom Line

Despite the success, the researchers are clear: Claude isn't ready to make big decisions like bank loans just yet. It's like saying, "You've been good, but let's not give you the keys to the car." Governments and societies need to weigh in on this too – it's not just a one-company show.

In conclusion, while we might chuckle at the idea of asking an AI to please not discriminate, the results are nothing short of fascinating. It's a blend of simplicity, politeness, and advanced tech - a combination you don't see every day. So, let's keep our fingers crossed and hope for a future where AI manners are top-notch!
