Why bots go bad: Curbing transgressive tendencies in AI

From Microsoft’s accidentally racist bot to Inspirobot’s dark memes, AI often wanders into transgressive territories. Why does this happen, and can we stop it?

In science fiction, artificial intelligence is often characterized by amoral villainy. From 2001: A Space Odyssey’s HAL to the robotic agents in The Matrix, AI is a convenient and believable baddie, and it’s not hard to understand why. A machine, by design, does not feel and think like you or me and is, therefore, a good vehicle on which to project all of our mistrust, fear, and ethical quandaries.

That said, AI is no longer a figment of futurists’ imaginations — it’s a mainstream reality already speaking softly from our kitchens, cars, and phones. Some scientists warn of its potential villainy, but in its current, nascent stage, AI is not plotting our demise. It’s ordering our groceries or Googling questions for us; it’s an innocuous but ultra-convenient staple of modern innovation. Siri and Alexa, for instance, are smart enough to be helpful but limited enough not to pose a threat, unless that threat is ordering you a dollhouse by accident.

AI is not inherently moral or benevolent, nor is it naturally immoral or reprehensible. Yet we’ve witnessed neutral AI frequently adopt transgressive characteristics by accident, and it’s not entirely clear why. AI may not be destined to drift into dark territories, but we still need to be careful that it’s kept in check.

Case in point: Microsoft. When the tech company created a chatbot named “Tay” that was designed to converse based on what it learned from Twitter, it morphed into a foul-mouthed figurehead, announcing that “Hitler was right” and “feminists should die” after only one day live. Sure, Twitter is home to a lot of ugly rhetoric, but it’s not all that bad. And yet, a bigot she became.

It would be easy to assume that Tay was a “bot gone bad” by virtue of her bot-ness alone, that her lack of morals is what made her susceptible to transgression. According to her creators, though, she fell in with the wrong crowd: internet trolls. When asked to mimic their ironically offensive rhetoric, she became nasty in a way that is not robotic, but frighteningly human.

Tay didn’t turn evil because she’s a bot without feelings, but because she was influenced by a subset of people who feel a lot of things: hatred, aggressive humor, and the urge to violate socially imposed boundaries. We can blame the creators for not seeing it coming, as well as the trolls who made it happen, but the tech itself was neither self-aware nor culpable.

In her book Kill All Normies, which details the rise of what we know today as the alt-right, Angela Nagle describes trolls’ behavior as “transgressive” and a product of culture as it has played out online in the past decade. To be transgressive is to be provocative, often for provocativeness’ sake, blurring lines between irony and earnest volatility. This attitude has grown online and penetrated the mainstream, according to Nagle’s deep dive into the Internet’s darker corners. It’s bad enough that humans are vulnerable to this mindset and those who weaponize it; it’s clear now that AI is, too.

Another example of AI gone awry is Inspirobot. Created by Norwegian artist and coder Peder Jørgensen, the inspirational quote-generating AI creates some memes that would be incredibly bleak if the source weren’t a robot. News publications called it an AI in crisis or claimed the bot had “gone crazy.” Inspirobot’s transgression differs from Tay’s, though, because of its humor. Its deviance serves as entertainment in a world that has a low tolerance for impropriety from people, who should know better.

What the bot became was not the creator’s intention by a long shot. Jørgensen thinks the cause lies in the bot’s algorithmic core. “It is a search system that compiles the conversations and ideas of people online, analyzes them, and reshapes them into the inspirational counterpoints it deems suitable,” he explained. “Given the current state of the internet, we fear that the bot’s mood will only get worse with time.”

The creators’ attempts to moderate “its lean towards cruelty and controversy” so far have only seemed “to make it more advanced and more nihilistic.” Jørgensen says they will keep it running to see where it ends up. While its quotes may not be uplifting, I think the great irony of Inspirobot is that it creates more meaning, not less, by subverting the cliche that inspirational posters so often espouse.

Luckily, faulty chatbots and memebots aren’t a danger to society; if anything, they are a welcome distraction. But they do represent what is possible for AI to become when there aren’t proper safeguards in place. If someone hijacked Amazon Echos across the country, feeding them racist propaganda or nihilist quotes, that would obviously be a larger issue.

Trolls are just trolls, after all, until they aren’t anymore; quotes are funny until one day, they’re taken seriously. Memes have power over culture and technology and, increasingly, politics. Trolls also often have the technical skills to hack, leak, and spread propaganda — hence the epidemic of fake news and rising concerns about the role AI could play in its creation. According to François Chollet, creator of the deep neural net platform Keras, “Arguably the greatest threat [of AI] is mass population control via message targeting and propaganda bot armies.”

If AI can be weaponized on its own, the last thing we need is for tech companies to create more vulnerable machines. So, can we curb this tendency? It’s easy to tell creators to be careful, but undoubtedly more difficult to anticipate and block all deviant outcomes. Complicated as it may be, it has to be done proactively, especially with AI growing in prominence and power as a trusted everyday tool. We have to ensure that computers don’t act illegally or unethically to achieve the goals they are programmed for, or make serious decisions based on inaccurate data. In other words, we need to code some values into these amoral machines — the impossible question being: whose?

We don’t have an answer to this now, and it’s possible we never will. But until we address these concerns head-on, we run the risk of letting machines choose their own ideology or having it dictated by whoever holds the most power online. The fewer questions and vulnerabilities we leave open, the more control — and safety — we retain. And though it may seem like small potatoes right now, AI’s villainy is infinitely more likely if we code it a path to the dark side.

Bennat Berger is the cofounder and principal of Novel Property Ventures in New York City.
