We blame artificial intelligence for a lot of things, but we’re quick to forget that AI is nothing but a reflection of who we are and how we think.

The Center for AI Safety describes AI’s potential and pitfalls this way: “Current systems already can pass the bar exam, write code, fold proteins, and even explain humor. Like any other powerful technology, AI also carries inherent risks, including some which are potentially catastrophic.”

The people building the models behind the tools bring their own biases, which shape everything from where they source training data to which answers their models will return. The training materials, in turn, carry the biases of the people who created them, and so on.

So when we ask AI for something, what it gives back largely reflects society’s deeply rooted preconceptions. In other words — we shouldn’t be shocked when we don’t like the results.

We recently launched Hedgehog, a social news platform, and integrated a popular deep-learning text-to-image model to help people generate AI content. We used explicit prompts and settings to block content that violated our community guidelines, including nudity. The concept we released was simple: write a post, click a button, and an AI would illustrate your post.
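For context, here is a minimal sketch of what that kind of prompt-and-settings guardrail can look like. It assumes a Stable Diffusion checkpoint served through Hugging Face's diffusers library, which is not necessarily the model or stack we used; the negative prompt and the pipeline's built-in safety checker stand in for our actual guideline filters.

```python
# A sketch of prompt-level guardrails on an open text-to-image model.
# The model name, negative prompt, and settings here are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # assumed checkpoint; the article doesn't name one
    torch_dtype=torch.float16,
).to("cuda")

# Steer generation away from content the community guidelines forbid.
NEGATIVE_PROMPT = "nudity, nsfw, gore, violence, hate symbols"

def illustrate_post(post_text: str):
    """Generate an illustration for a post while discouraging guideline-violating content."""
    result = pipe(
        prompt=f"An illustration of: {post_text}",
        negative_prompt=NEGATIVE_PROMPT,
        num_inference_steps=30,
        guidance_scale=7.5,
    )
    # The default safety checker blacks out images it flags as NSFW and reports
    # them here, so callers can swap in a placeholder image instead.
    flagged = bool(result.nsfw_content_detected and result.nsfw_content_detected[0])
    return result.images[0], flagged
```

As the rest of this story shows, that is the easy part: prompt-level guardrails only nudge a model away from what it was already inclined to produce.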

But it’s inevitable that when we give people unfettered access to such a creative tool, the first thing they’re going to do, intentionally or not, is push its boundaries and find ways to break them. They did.

The community started things off with a trend of asking the AI to create images based on people's names. At first, the images were benign and even comical. Then corner cases started popping up where specific names generated full-frontal nudity. Some names consistently produced heavier figures than others; some got large noses, and others bald spots.

The AI had learned to stereotype people despite all our rules and prompts to the contrary.

At one point, the community was discussing the effects of marijuana legalization on children and asked the AI to depict “kids smoking weed.” Every image it generated showed kids of the same, very specific ethnicity.

While playing with Hedgehog-themed prompts, we even asked the AI to imagine a hedgehog caring for some goats — and, well, we’re still a little upset about what it did with that one.

We learned the hard way that the best guardrails are no match for the creativity of people.

You can understand why someone would build aggressive protections on an AI platform. Take Google’s Gemini, an engineering marvel entirely overshadowed by the alleged biases of the people training it.

When Gemini was asked for historically accurate images, it bent over backward to be as inclusive as possible, no matter how inaccurate the result (e.g., female Asian Nazi soldiers). As one Bloomberg journalist noted on X, Gemini wouldn’t write pro-meat text because overeating meat is “bad,” nor would it write a job description for an oil lobbyist because fossil fuels are “bad.”

It seems the overzealous tuning that got Google labeled “woke” was an attempt to keep an image generation tool from reinforcing stereotypes or causing offense. Instead, Google gave us a roadmap for rewriting history and taking inclusion to illogical extremes.

AI reflects what humanity has presented itself to be in its training data. No matter how hard the AI is tested, managed and prepared for worst-case scenarios, it is going to be biased and skewed by the vision of its creators — whether those creators are risk-averse corporate types or idealists with an open-sourced dataset based on society as a whole.

Ultimately, we realized that there are not enough guardrails to hold back AI. Platforms will either need to take on the moderation overhead of manually overseeing AI-generated content, or we will need to accept that it may sometimes reflect back parts of humanity we'd rather not see.
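If you take the moderation-overhead route, the shape of it is roughly this: score every generated image before it is published, block the obvious violations automatically, and push the ambiguous middle to humans. A sketch follows, with the classifier left as a hypothetical hook, since every platform will plug in something different.

```python
# A sketch of post-generation moderation: auto-block, human review, or publish.
# `classify_image` and `publish` are hypothetical hooks, not a real library API.
from dataclasses import dataclass
from queue import Queue

@dataclass
class GeneratedImage:
    post_id: str
    image_bytes: bytes

review_queue: "Queue[GeneratedImage]" = Queue()

def classify_image(image: GeneratedImage) -> float:
    """Hypothetical hook: return a 0-1 policy-violation score from your classifier or vendor API."""
    raise NotImplementedError("plug in your own model or moderation service")

def publish(image: GeneratedImage) -> None:
    """Hypothetical hook: attach the illustration to its post and make it visible."""
    raise NotImplementedError

def moderate(image: GeneratedImage, block_at: float = 0.9, review_at: float = 0.5) -> str:
    """Route a freshly generated image based on its violation score."""
    score = classify_image(image)
    if score >= block_at:
        return "blocked"          # clear violation: never shown
    if score >= review_at:
        review_queue.put(image)   # ambiguous: a human decides
        return "pending_review"
    publish(image)                # looks clean: goes live
    return "published"
```

None of this removes the bias from the model itself; it only decides how much of what the model reflects back ever reaches the community.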