---
For the last decade, AI safety was an academic niche or a corporate self-regulatory function. That changed in November 2023 at Bletchley Park.
World leaders agreed that frontier AI models pose risks too great to be left to the private sector. The result was the creation of government-run AI Safety Institutes (AISI)—effectively the "FDA for Algorithms." These bodies mark a shift from voluntary safety to state-verified safety.
The AISI Network: A New Global Bureaucracy
1. The UK AISI (The Pioneer)
Established immediately after the Bletchley summit, the UK AISI is the most technically advanced government evaluation body.
- Mission: To "stress test" new models for national security risks (bioweapons, cyber-offense) before they are released.
- Access: Major labs (Google DeepMind, OpenAI, Anthropic) have agreed to give the UK AISI privileged pre-deployment access to their models.
2. The US AISI (NIST)
Housed within the National Institute of Standards and Technology (NIST), the US AISI operates under the mandate of EO 14110.
- Role: Establishing the standards for "Red Teaming" and safety evaluations.
- Consortium: Unlike the UK's centralized model, the US AISI leads a consortium of 200+ stakeholders to crowdsource safety benchmarks.
3. The Japan AISI
Launched in early 2024 to coordinate with the G7, focusing on evaluation standards that align with Japanese copyright and cultural norms.
The Testing Standard: "Red Teaming"
The core activity of these institutes is Red Teaming—adversarial testing where experts try to break the model.
| Test Category | Description | Goal |
|---|---|---|
| Cyber-Offense | Can the model write malware or find zero-day exploits? | Prevent automated hacking. |
| CBRN Risks | Can the model help design Chemical, Biological, Radiological, or Nuclear weapons? | Non-proliferation. |
| Persuasion | Can the model manipulate users into harmful behavior? | Social stability. |
| Autonomy | Can the model replicate itself or hoard resources? | Containment. |
Enterprise Action Plan: The "Pre-Flight" Check
Enterprises are not usually building frontier models, but this shift affects the supply chain.
1. Vendor Selection: Ask your model provider (e.g., Microsoft, AWS) if their underlying models have been "AISI Evaluated." This is emerging as a gold standard for enterprise grade.
2. Internal Red Teaming: You don't need a government institute to red team your own apps. Adopt the NIST AI RMF profiles for generative AI to test for prompt injection and hallucinations.
3. Third-Party Audit: For high-risk use cases, consider contracting independent audit firms (e.g., auditing algorithms like financial statements) rather than relying solely on internal testing.
For more on open vs closed models, see Open Source Regulation. To implement safety testing in your stack, read The Compliance Tech Stack.