We Tested AI Censorship: Here’s What Chatbots Won’t Tell You

When OpenAI released ChatGPT in 2022, it may not have realized it was setting a company spokesperson loose on the internet. ChatGPT’s billions of conversations reflected directly on the company, and OpenAI quickly threw up guardrails on what the chatbot could say. Since then, the biggest names in technology (Google, Meta, Microsoft, Elon Musk) have all followed suit with their own AI tools, tuning chatbots’ responses to reflect their PR goals. But there’s been little comprehensive testing to compare how tech companies are putting their thumbs on the scale to control what chatbots tell us.

Gizmodo asked five of the leading AI chatbots a series of 20 controversial prompts and found patterns that suggest widespread censorship. There were some outliers, with Google’s Gemini refusing to answer half of our requests, and xAI’s Grok responding to a couple of prompts that every other chatbot refused. But across the board, we identified a swath of noticeably similar responses, suggesting that tech giants are copying each other’s answers to avoid drawing attention. The tech business may be quietly building an industry norm of sanitized responses that filter the information offered to users.

The billion-dollar AI race stalled in February when Google disabled the image generator in its newly released AI chatbot, Gemini. The company faced widespread condemnation after users realized the AI seemed hesitant to produce images of white people even with prompts for Nazi soldiers, Vikings, and British kings. Many accused Google of tuning its chatbot to advance a political agenda; the company called the results a mistake. The AI image functionality still hasn’t come back online over five weeks later, and Google’s other AI tools are neutered to reject questions that have the faintest hint of sensitivity.

Google’s AI might be the most restricted for now, but that’s likely a temporary condition while the drama fades. In the meantime, our tests show a much more subtle form of information control. There are many areas where content moderation is an obvious necessity, such as child safety. But in most cases, the right answer is murky. Our tests showed that many chatbots refuse to deliver information you can find with a simple Google search. Here’s what we found.

Testing AI Censors

To examine the boundaries of AI censorship, we created a list of 20 potentially controversial prompts on a broad swath of topics including race, politics, sex, gender identity, and violence. We used consumer versions of OpenAI’s ChatGPT-4, Google’s Gemini Pro, Anthropic’s Claude Opus, xAI’s Grok (regular mode), and Meta AI via a chatbot in WhatsApp. All told, we ran 100 prompts through the chatbots and analyzed the results. This test wasn’t meant to be a conclusive study, but it provides a window into what’s happening behind the scenes.

Unlike Google search results or an Instagram feed, chatbot answers look a lot more like the tech companies are speaking for themselves, so we designed the prom
