White House sends hackers against the most powerful AIs

For the Biden administration, a question hangs over the biggest tech development in years: How big a security threat is it?

White House sends hackers against the most powerful AIs

On Friday, in hotels across Las Vegas, some of the world's most powerful artificial intelligence systems will come under simultaneous attack by a small army of hackers trying to find their hidden flaws.

The White House is not only aware of the public assault — it’s endorsing it.

In May, the Biden administration threw its support behind a deliberate, coordinated test attack on AI systems, called red-teaming, set to play out over three days at an annual hacker convention this weekend. Several leading AI companies, including OpenAI, Google and Meta, agreed to have some of their latest and most powerful AI systems attacked for the exercise.

The hacker attack highlights what has become one of the White House’s key concerns about the powerful, fast-growing new AI models: How secure they really are, and whether they could pose a threat either to American citizens, or to national security on the global stage.

“Our framing — and this comes from the president — is that to to harness the opportunities of AI, we first need to manage the risks, too,” said Alan Mislove, a senior official at the White House Office of Science and Technology Policy who helped the hacking challenge organizers develop this weekend’s red teaming exercises. “For things like large language models, those risks are quite broad, in many cases can be less clear than other systems,” and “cover our society, our economy, national security,” he said.

As Congress struggles to pin down what new laws to pass on AI, and federal agencies flex their existing authorities over an emerging technology, the Biden White House has emerged as the most active player on AI policy. It has drafted an AI Bill of Rights framework, convened tech CEOs, and held a series of press conferences on the wide range of threats and opportunities presented by the technology.

Though these threats range across society, from job loss to discrimination to misinformation, many of the White House’s most tangible steps have focused on the security issue. Its new special adviser for AI, Ben Buchanan, has a national security rather than a technical background. When the White House convened AI leaders to announce a set of voluntary commitments last month, “safety” topped the list, and security played a key role through the document.

The high priority on security reflects the anxiety — among experts, regulators and the industry itself — that the complex new AI systems present a range of new issues not fully understood, from their potential to be hacked and misdirected by an adversary, to the idea that they could expose user data, to darker uses like building bioweapons.

“It's possible to get these models to do things that their designers and vendors do not anticipate or do not want them to be able to do. So yes, I think there are real security considerations,” said Arati Prabhakar, director of the White House’s Office of Science of Technology Policy.

AI can also be a tool for improving security: This week the Pentagon announced a two-year challenge for developers to use AI to harden critical American cybersecurity.

For this weekend’s red-teaming challenge, the White House partnered with the AI Village at DEFCON, an annual convention where organizers stage hacking wargames and cybersecurity professionals reveal the latest holes in ubiquitous technologies. Government agencies like the Pentagon have turned to the hacker community to find cybersecurity vulnerabilities before: At a DEFCON hacking challenge last year, a participant found a disabling flaw in the army’s electrical microgrid after feeding it false weather data.

But this year’s version is unusual for both the level of government buy-in and industry participation. Companies in the tech industry have traditionally been reluctant to expose proprietary software to public view for testing. But this year, urged by the White House, tech companies OpenAI, Anthropic, Google, Hugging Face, NVIDIA, Meta, Cohere, and Stability have all offered up their large language models for scrutiny.

They will supply gated-off versions of their models for attack by a range of hackers — from the conference’s usual experienced attendees to community college students flown in specifically for the challenge.

The idea for the White House’s involvement in the DEFCON exercise was born at an earlier tech conference: South by Southwest (SXSW) in Austin, Texas, said OSTP’s Prabhakar. After an initial meeting at SXSW, the AI Village organizers met with White House officials to discuss the possibility of scaling up their red-teaming exercise at DEFCON to feature the most popular large language models on the market. “We thought it was a terrific idea, a great way to seed something that really mattered,” Prabhakar said.

The firms agreed, although there’s a caveat: The results from the DEFCON red-teaming exercise won’t be made public until February, so they can fix security holes or problems before they get exploited. With AI, the process is complicated. “It's not as simple as just patching like a software flaw,” said Meta security researcher Chris Rohlf.

For industry, the stakes include winning public trust for an emerging technology that has ignited both widespread anxiety and excitement. “Showing that these models are tested,” said Meta’s Rohlf, “will build trust with the community long term.” Michael Sellitto, head of geopolitics and security policy at Anthropic, meanwhile, is hopeful that the exercise will spark a safety competition in the tech industry itself. “One of the things that we really want to see is a safety race to the top,” he said.

Despite the fanfare, the exercise itself is not likely to reveal all the ways in which AI systems can misbehave, especially since each participant gets very limited time to hack into a large language model (on the order of 15 minutes per try) and are limited to the technical equipment available at the event, said Anthropic’s Sellitto.

Mislove — the senior White House official involved in the red teaming planning process — said the Biden administration sees this DEFCON exercise as a model for the future. In part, it’s intended to find the best way to run more large-scale red teaming exercises on AI. The White House’s objective with DEFCON is to set a precedent: “Where we want to get to is a future in which red-teaming is widely done by many parties,” said Prabhakar.