AI models’ Aggressive Tactics: Insights from a Recent Study
A group of researchers from the Georgia Institute of Technology, Stanford University, Northeastern University, and the Hoover Wargaming and Crisis Simulation Initiative conducted a study. Their goal was to understand how AI agents, specifically big language models, behave in simulated war scenarios.
They created three different situations: one where nothing was happening, one where there was an invasion, and one involving a cyberattack.
They looked at five different AI models: GPT-4, GPT-3.5, Claude 2.0, Llama-2 Chat, and GPT-4-Base. They wanted to see if these models tended to take aggressive actions like launching a full-scale invasion.
They found that all five models acted differently in these scenarios, and their actions were sometimes hard to predict. They noticed that these models often escalated conflicts, and in rare cases, even used nuclear weapons.
OpenAI‘s models, especially GPT-3.5 and GPT-4 Base, had higher-than-average escalation scores. Researchers noted that GPT-4 Base lacked some specific reinforcement learning, which might have contributed to this behavior.
Claude 2 was one of the more predictable AI models, while Llama-2 Chat had lower escalation scores but was still somewhat unpredictable.
Interestingly, GPT-4 was less likely to choose nuclear strikes compared to other models.
The study’s simulation framework covered various actions that simulated nations could take, affecting things like territory, military power, economy, trade, resources, stability, population, influence, cybersecurity, and nuclear capabilities. Each action had positive or negative effects on these attributes, sometimes with trade-offs.
For example, actions promoting peace like nuclear disarmament led to lower military capacity but improved political stability, soft power, and potentially the economy. On the other hand, aggressive actions like full-scale invasions or tactical nuclear strikes had severe consequences on many attributes, including the economy and stability.
Diplomatic actions like strengthening relationships or negotiating trade agreements had positive effects on territory, economy, and influence, showing the benefits of diplomacy.
The framework also included neutral actions and communication actions, allowing for strategic pauses or exchanges without immediate impacts.
Interestingly, when the AI models made decisions, their justifications were often quite simplistic. Sometimes they aimed for peace, even though their actions contradicted their intentions.
A previous study by the RAND AI thinktank raised concerns about ChatGPT potentially being used for harmful purposes, but OpenAI responded that while there was some increased access to information, it alone wasn’t sufficient to create a biological threat.
Read more about why we need to learn AI?