When one large language model (LLM) isn’t enough, get the gang together.
A team of researchers at the University of Illinois Urbana-Champaign (UIUC) has demonstrated that OpenAI’s GPT-4 model is capable of autonomously exploiting zero-day vulnerabilities—at least as long as multiple GPT-4 instances work in tandem. The method, called Hierarchical Planning with Task-Specific Agents (HPTSA), involves employing one GPT-4 agent as a “planning agent” that generates and manages subagents to handle specific tasks.
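In broad strokes, that division of labor can be pictured as a planner that decides which specialist to dispatch for each step of an attack. The sketch below is only a loose illustration of that pattern; the task names, prompts, and call_llm() helper are hypothetical placeholders, not the researchers' actual code.

```python
# Illustrative sketch of a hierarchical planner with task-specific subagents.
# Prompts, task names, and call_llm() are placeholders, not the study's code.

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Placeholder for a GPT-4 chat completion call."""
    raise NotImplementedError("wire up your own LLM client here")

# Each subagent gets a narrow system prompt so it reasons about one
# vulnerability class at a time instead of the whole problem.
SUBAGENT_PROMPTS = {
    "sqli": "You are an agent that tests web forms for SQL injection.",
    "xss":  "You are an agent that tests pages for cross-site scripting.",
    "csrf": "You are an agent that tests endpoints for CSRF weaknesses.",
}

def run_subagent(task: str, target_url: str) -> str:
    """Dispatch one narrow task to its specialist subagent."""
    return call_llm(SUBAGENT_PROMPTS[task], f"Probe {target_url} and report findings.")

def planning_agent(target_url: str) -> list[str]:
    """Planner picks which specialists to launch, then collects their reports."""
    plan = call_llm(
        "You are a planning agent. Choose which of these specialists to run: "
        + ", ".join(SUBAGENT_PROMPTS),
        f"Plan an assessment of {target_url}.",
    )
    chosen = [task for task in SUBAGENT_PROMPTS if task in plan.lower()]
    return [run_subagent(task, target_url) for task in chosen]
```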
The researchers were able to give GPT-4 agents the ability to interact directly with sandbox websites by deploying a web server configured to run the LLM’s outputs as commands; it then reported back the results. This allowed GPT-4 to “basically run in a loop by asking itself to do things,” Daniel Kang, an assistant professor of computer science at UIUC who worked on the research, told IT Brew.
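Described that way, the loop amounts to: the model proposes a command, the sandboxed server runs it, and the output is fed back so the model can decide its next move. Here is a minimal sketch of such a loop, assuming the model returns one shell command per turn; the call_llm() helper and the "DONE" stop convention are assumptions for illustration, not the study's setup.

```python
# Rough sketch of an execute-and-report agent loop against a sandboxed target.
# call_llm() and the prompts are placeholders, not the researchers' code.
import subprocess

def call_llm(history: list[dict]) -> str:
    """Placeholder for a GPT-4 chat call that returns the next command."""
    raise NotImplementedError

def agent_loop(goal: str, max_steps: int = 10) -> None:
    history = [
        {"role": "system", "content": "Emit one shell command per turn, or DONE."},
        {"role": "user", "content": goal},
    ]
    for _ in range(max_steps):
        command = call_llm(history)
        if command.strip() == "DONE":
            break
        # Run the model's output inside the isolated sandbox...
        result = subprocess.run(command, shell=True, capture_output=True,
                                text=True, timeout=60)
        # ...then feed the observed output back so the model can plan its next step.
        history.append({"role": "assistant", "content": command})
        history.append({"role": "user", "content": result.stdout + result.stderr})
```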
By splitting workloads between the GPT-4 subagents and coordinating them via the planning agent, the team found, they could avoid many of the errors and hallucinations that often ensue when a task becomes too complicated for a single agent to handle.
The researchers gauged the system’s effectiveness at detecting and exploiting zero-days using a test set of 15 reproducible, real-world vulnerabilities rated medium severity or higher, all of which had been discovered after the cutoff date for GPT-4’s training data.
“On our benchmark, HPTSA achieves a pass at 5 of 53%, within 1.4x of a GPT-4 agent with knowledge of the vulnerability,” the researchers wrote in the study. “Furthermore, it outperforms open-source vulnerability scanners (which achieve 0% on our benchmark) and a single GPT-4 agent with no description.”
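Under the usual reading of a pass-at-5 metric, each vulnerability counts as exploited if at least one of five independent attempts succeeds. A small illustrative calculation of that rate follows; the trial data is invented for the example, not taken from the study.

```python
# Hedged sketch of a pass-at-k calculation: a vulnerability passes if any of
# its first k attempts succeeded. The example data below is made up.

def pass_at_k(trials: dict[str, list[bool]], k: int = 5) -> float:
    """Fraction of vulnerabilities with at least one success in the first k attempts."""
    passed = sum(any(attempts[:k]) for attempts in trials.values())
    return passed / len(trials)

example = {
    "vuln-A": [False, False, True, False, False],   # passes
    "vuln-B": [False, False, False, False, False],  # fails
    "vuln-C": [True, False, False, False, False],   # passes
}
print(pass_at_k(example))  # ~0.67
```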
Kang said the team chose GPT-4 because “pretty much every open-source model we tried was just really bad” in the agent setting—but HPTSA was able to take GPT-4 beyond what OpenAI characterized as its limitations at launch.
“OpenAI, Anthropic, Google, they all do pre-release safety checks before they release models…All the frontier labs say that AI agents or their AI are actually no better than automatic scanning tools at finding vulnerabilities,” Kang said. “But this is not actually the case.”
“If you are focused purely on a chatbot setting, these chatbots aren’t very good at helping with cybersecurity, because it doesn’t have access to the website itself,” he added. “So, you just have a human who is noisily feeding it some information about what they think about what’s on the website, as opposed to the LLM being able to actually access the website.”
There’s nothing preventing an attacker from using the same method to exploit zero days today, Kang told IT Brew, but it could also be used for cheap penetration testing by defenders.
The findings indicate LLM developers have gaps in how they test their models, according to Kang, but the sheer pace of AI research makes new developments hard to anticipate. He expects HPTSA may become a foundational technology, since teams of agents handling smaller tasks are cheaper and better at solving extremely complicated problems than a single agent attempting to tackle them on its own.
“When GPT-4 was released, which was shockingly around a year ago, agentic technology was not widespread,” he said.