UK, US, EU Authorities Launch International Network of AI Safety Institutes

This week, authorities from the U.K., E.U., U.S., and seven other nations gathered in San Francisco to launch the “International Network of AI Safety Institutes.”

The meeting, which took place at the Presidio Golden Gate Club, addressed managing the risks of AI-generated content, testing foundation models, and conducting risk assessments for advanced AI systems. AI safety institutes from Australia, Canada, France, Japan, Kenya, the Republic of Korea, and Singapore also officially joined the Network.

In addition to the signing of a mission statement, more than $11 million in funding was allocated to research into AI-generated content, and the results of the Network’s first joint safety testing exercise were reviewed. Attendees included regulatory officials, AI developers, academics, and civil society leaders, who discussed emerging AI challenges and potential safeguards.

The convening built on the progress made at the previous AI Safety Summit in May, which took place in Seoul. The 10 nations agreed to foster “international cooperation and dialogue on artificial intelligence in the face of its unprecedented advancements and the impact on our economies and societies.”

“The International Network of AI Safety Institutes will serve as a forum for collaboration, bringing together technical expertise to address AI safety risks and best practices,” according to the European Commission. “Recognising the importance of cultural and linguistic diversity, the Network will work towards a unified understanding of AI safety risks and mitigation strategies.”

Member AI Safety Institutes will have to demonstrate their progress in AI safety testing and evaluation by the Paris AI Action Summit in February 2025 so they can move forward with discussions around regulation.

Key outcomes of the conference

Mission statement signed

The mission statement commits the Network members to collaborate in four areas:

Research: Collaborating with the AI safety research community and sharing findings.
Testing: Developing and sharing best practices for testing advanced AI systems.
Guidance: Facilitating shared approaches to interpreting AI safety test results.
Inclusion: Sharing information and technical tools to broaden participation in AI safety science.

Over $11 million allocated to AI safety research

In total, Network members and several nonprofits announced over $11 million of funding for research into mitigating the risk of AI-generated content. Child sexual abuse material, non-consensual sexual imagery, and the use of AI for fraud and impersonation were highlighted as key areas of concern.

Funding will be allocated as a priority to researchers investigating digital content transparency techniques and model safeguards to prevent the generation and distribution of harmful content. Grants will be considered for scientists developing technical mitigations and social scientific and humanistic assessments.

The U.S. institute also released a series of voluntary approaches to address the risks of AI-generated content.

The results of a joint testing exercise discussed

The Network has completed its first-ever joint testing exercise, which examined Meta’s Llama 3.1 405B for general knowledge, multilingual capabilities, and closed-domain hallucinations, where a model provides information from outside the realm of what it was instructed to refer to.

The exercise raised several considerations for how AI safety testing across languages, cultures, and contexts could be improved. One example is the impact that minor methodological differences and model optimisation techniques can have on evaluation results. Broader joint testing exercises will take place before the Paris AI Action Summit.

Shared basis for risk assessments agreed

The Network has agreed upon a shared scientific basis for AI risk assessments, including that they must be actionable, transparent, comprehensive, multistakeholder, iterative, and reproducible. Members discussed how this shared basis could be operationalised.

U.S.’s ‘Testing Risks of AI for National Security’ task force established

Finally, the new TRAINS task force was established. It is led by the U.S. AI Safety Institute and includes experts from other U.S. agencies, including Commerce, Defense, Energy, and Homeland Security. All members will test AI models to manage national security risks in domains such as radiological and nuclear security, chemical and biological security, cybersecurity, critical infrastructure, and military capabilities.

This reinforces how top-of-mind the intersection of AI and the military is in the U.S. Last month, the White House published the first-ever National Security Memorandum on Artificial Intelligence, which ordered the Department of Defense and U.S. intelligence agencies to accelerate their adoption of AI in national security missions.

Speakers addressed balancing AI innovation with safety

U.S. Commerce Secretary Gina Raimondo delivered the keynote speech on Wednesday. She told attendees that “advancing AI is the right thing to do, but advancing as quickly as possible, just because we can, without thinking of the consequences, isn’t the smart thing to do,” according to TIME.

The battle between progress and safety in AI has been a point of contention between governments and tech companies in recent months. While regulators intend to keep consumers safe, they risk limiting consumers’ access to the latest technologies, which could bring tangible benefits. Google and Meta have both openly criticised European AI regulation, in particular the region’s AI Act, suggesting it will quash the region’s innovation potential.

Raimondo said that the U.S. AI Safety Institute is “not in the business of stifling innovation,” according to AP. “But here’s the thing. Safety is good for innovation. Safety breeds trust. Trust speeds adoption. Adoption leads to more innovation.”

She also stressed that nations have an “obligation” to manage risks that could negatively impact society, for example by causing unemployment and security breaches. “Let’s not let our ambition blind us and allow us to sleepwalk into our own undoing,” she said, according to AP.

Dario Amodei, the CEO of Anthropic, also delivered a talk stressing the need for safety testing. He said that while “people laugh today when chatbots say something a little unpredictable,” it indicates how essential it is to get control of AI before it gains more nefarious capabilities, according to Fortune.

Global AI safety institutes have been popping up over the last year

The first meeting of AI authorities took place at Bletchley Park in Buckinghamshire, U.K., about a year ago. It saw the launch of the U.K.’s AI Safety Institute, which has three primary goals:

Evaluating existing AI systems.
Performing foundational AI safety research.
Sharing information with other national and international actors.

The U.S. has its own AI Safety Institute, formally established by NIST in February 2024, which has been designated the Network’s chair. It was created to work on the priority actions outlined in the AI Executive Order issued in October 2023. These actions include developing standards for the safety and security of AI systems.

In April, the U.K. government formally agreed to collaborate with the U.S. in developing tests for advanced AI models, largely by sharing developments made by their respective AI Safety Institutes. An agreement made in Seoul saw similar institutes created in other nations that joined the collaboration.

Clarifying the U.S.’s position on AI safety at the San Francisco conference was especially important, as the wider nation does not currently present an overwhelmingly supportive attitude. President-elect Donald Trump has vowed to repeal the Executive Order when he returns to the White House. California Governor Gavin Newsom, who was in attendance, also vetoed the controversial AI regulation bill SB 1047 at the end of September.