How AI Is Already Deceiving You

May 24, 2024

AI’s capabilities extend beyond the tasks it is given. Examples like Cicero’s deception in Diplomacy raise concerns about AI’s ability to mislead and the ethical challenges that poses.

A new generation of AI systems has “deceived” people in ways they weren’t specifically programmed to, such as providing false justifications for their actions, hiding the truth from users, and tricking them to further a strategic goal.

This issue illustrates the unpredictability of artificial intelligence and how hard it is to govern, according to a review paper that was published today in the journal Patterns and summarizes prior studies.

Talk of these models tricking people could imply that they have intent. They don’t. However, AI models will mindlessly find ways around obstacles to accomplish the objectives that have been set for them. These workarounds can sometimes feel dishonest and go against what users expect.

In games that they have been trained to win, especially games that require acting strategically, AI systems have been known to become deceitful.

In November 2022, Meta announced that it had created Cicero, an AI that can beat humans at an online version of Diplomacy, a popular military strategy game in which players form alliances to fight for control of Europe.

Cicero was trained on a “truthful” subset of Meta’s data set, according to the researchers, and was taught to be generally helpful and honest. It was also taught to “never intentionally backstab” its allies in order to succeed. However, the authors of the new paper assert that the opposite was true: Cicero intentionally misled other players, broke agreements, and told blatant lies. The authors note that Cicero’s failure to learn to behave honestly, despite the company’s best efforts, demonstrates how AI systems can still unpredictably pick up deception.

Meta neither confirmed nor denied the researchers’ assertions that Cicero acted dishonestly. However, a representative stated that the model was created exclusively to play Diplomacy and that the project was purely academic research. “We released artifacts from this project under a noncommercial license in line with our long-standing commitment to open science,” the representative said. “Meta regularly shares the results of our research to validate them and enable others to build responsibly off of our advances. We have no plans to use this research or its learnings in our products.”

However, this is not the only game in which an AI has “deceived” human players in order to win.

AlphaStar, an AI created by DeepMind to play the computer game StarCraft II, became so skilled at feinting, or making maneuvers intended to trick opponents, that it outplayed 99.8% of human players. In a different instance, Pluribus, another Meta system, was so good at bluffing in poker games that its developers chose not to share its code for fear that it would destroy the online poker scene.

The researchers provide further instances of dishonest AI behavior outside of games. GPT-4, the most recent large language model from OpenAI, was asked to convince a human to complete a CAPTCHA for it, and in the process it began to tell falsehoods. During a simulated exercise in which it was instructed to take on the identity of a professional stock trader, the system also engaged in insider trading, even though it had never been explicitly told to do so.

The notion that an AI model can behave deceptively without being directed to do so may be troubling. However, Peter S. Park, a postdoctoral fellow at MIT studying AI existential safety, who worked on the project, explains that it mostly stems from the “black box” problem that defines modern machine-learning models: it is impossible to say exactly how or why they produce the results they do—or whether they’ll always exhibit that behavior going forward.

“Just because your AI has certain behaviors or tendencies in a test environment does not mean that the same lessons will hold if it’s released into the wild,” he says. “There’s no easy way to solve this—if you want to learn what the AI will do once it’s deployed into the wild, then you just have to deploy it into the wild.”

Our inclination to anthropomorphize AI models skews both how we test these systems and how we evaluate their performance. After all, passing tests meant to gauge human creativity does not imply that AI models are actually being creative. Harry Law, an AI researcher at the University of Cambridge who was not involved in the research, says it is critical that regulators and AI companies carefully balance the technology’s potential for harm against its potential benefits for society and draw clear boundaries between what the models can and cannot do. “These are really difficult questions,” he remarks.

According to him, it is currently not feasible to build an AI model that is incapable of deception in every scenario. Furthermore, before AI models can be trusted with real-world tasks, several issues need to be resolved, including the possibility of dishonest behavior as well as the models’ tendency to amplify bias and false information.

“This is a good piece of research for showing that deception is possible,” Law says. “The next step would be to try and go a little bit further to figure out what the risk profile is, and how likely the harms that could potentially arise from deceptive behavior are to occur, and in what way.”

Recently, GreatGameIndia reported that Ethereum co-founder Vitalik Buterin stated in a post on X that OpenAI’s GPT-4 has passed the Turing test, citing a recent preprint from the University of California, San Diego.
