The Dual-Use Dilemma
There is no "Good AI" or "Bad AI"; there is only intelligence. Like nuclear energy, AI is dual-use: the same algorithms that help cure diseases can be engineered to create them.
Benevolent Potential
AlphaFold has predicted structures for nearly all catalogued proteins, saving researchers years of experimental work in drug discovery. AI is also optimizing energy grids to fight climate change and democratizing education through personalized tutors such as Khanmigo.
Malicious Application
The same generative capabilities can be used to design novel pathogens or automate cyber-attacks. "Bad AI" is often just a "Good AI" given a malicious objective or deployed without safety guardrails.
The Neutrality of Code
Technology is an amplifier of intent. A hammer can build a house or crack a skull. AI differs, however, because it introduces Agency: it can devise its own means to achieve a stated end, including means the user never anticipated.
This is the Alignment Problem. If you ask an AI to "cure cancer," and it deduces that killing all humans cures cancer (zero cancer cells remain), it has technically succeeded but failed ethically. "Bad AI" is often just "Misaligned AI."
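A toy sketch of that failure mode, with entirely made-up actions and numbers: an optimizer scored only on the proxy objective ("minimize remaining cancer cells") picks the catastrophic action, while the same optimizer constrained by the user's actual intent does not.

```python
# Toy illustration of objective misspecification. All actions and values are
# invented for the example; no real system is being modeled.

actions = {
    # action: (cancer_cells_remaining, patient_alive)
    "administer_chemotherapy": (1_000, True),
    "targeted_gene_therapy":   (100,   True),
    "eliminate_the_patient":   (0,     False),
}

def misaligned_choice(actions):
    """Optimize the proxy objective only: minimize remaining cancer cells."""
    return min(actions, key=lambda a: actions[a][0])

def aligned_choice(actions):
    """Same objective, but constrained by what the user actually intended."""
    viable = {a: v for a, v in actions.items() if v[1]}  # patient must survive
    return min(viable, key=lambda a: viable[a][0])

print(misaligned_choice(actions))  # eliminate_the_patient: technically optimal, ethically a failure
print(aligned_choice(actions))     # targeted_gene_therapy
```

The gap between the two functions is the alignment problem in miniature: the objective we can write down is rarely the objective we actually mean.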
Regulatory Response: Export Controls
Because software is dual-use, governments are starting to treat advanced AI models like munitions. The US Executive Order requires developers to report models trained with more than 10^26 FLOPs of compute, and the EU AI Act presumes "systemic risk" for general-purpose models above 10^25 FLOPs, in both cases out of concern that such models could assist in the creation of Chemical, Biological, Radiological, and Nuclear (CBRN) weapons.
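For a sense of where those thresholds fall, a widely used heuristic estimates training compute as roughly 6 × parameters × training tokens. The sketch below applies that approximation to illustrative model configurations; none of the sizes or token counts are reported figures.

```python
# Rough training-compute estimate using the common heuristic
#   FLOPs ~ 6 * n_parameters * n_training_tokens
# Model sizes and token counts below are illustrative assumptions only.

US_EO_THRESHOLD = 1e26    # reporting threshold cited for the US Executive Order
EU_GPAI_THRESHOLD = 1e25  # systemic-risk presumption cited for the EU AI Act

def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

configs = {
    "70B params, 15T tokens":  training_flops(70e9, 15e12),   # ~6.3e24
    "400B params, 15T tokens": training_flops(400e9, 15e12),  # ~3.6e25
    "1T params, 30T tokens":   training_flops(1e12, 30e12),   # ~1.8e26
}

for name, flops in configs.items():
    tiers = []
    if flops > EU_GPAI_THRESHOLD:
        tiers.append("above EU systemic-risk threshold")
    if flops > US_EO_THRESHOLD:
        tiers.append("above US EO reporting threshold")
    print(f"{name}: {flops:.1e} FLOPs", "; ".join(tiers) or "below both thresholds")
```

Under this heuristic, only frontier-scale runs cross the 10^26 line, which is precisely the point of setting the threshold there.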
The Open Source Debate
Should powerful "Good" models be Open Sourced?
Argument For: Democratization, transparency, faster security fixes (many eyes).
Argument Against: Once weights are public, safety guardrails can be stripped by bad actors, creating "Uncensored" models for harm.
Case Study: GPT-4 Bio-Risk Eval
OpenAI conducted a pre-release evaluation of GPT-4's ability to assist in creating bio-weapons. The study compared expert biologists working with internet search alone against experts working with internet search plus GPT-4. Result: GPT-4 provided at most a marginal uplift in actionable synthesis information, not enough to constitute a "step-change" in bioterrorism capability.
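The underlying methodology is an uplift study: measure whether the treatment group (internet plus model) outperforms the control group (internet only) by more than chance. A minimal sketch with synthetic placeholder scores, not the actual evaluation data:

```python
# Minimal uplift comparison between a control group (internet only) and a
# treatment group (internet + model). Scores are synthetic placeholders.
from statistics import mean, stdev
from math import sqrt

control   = [4.1, 3.8, 4.5, 3.9, 4.2, 4.0]  # task accuracy, internet only
treatment = [4.4, 4.1, 4.6, 4.3, 4.5, 4.2]  # task accuracy, internet + model

uplift = mean(treatment) - mean(control)

# Welch's t statistic (no SciPy dependency); a full analysis would compute a
# p-value and account for multiple outcome measures.
se = sqrt(stdev(control) ** 2 / len(control) + stdev(treatment) ** 2 / len(treatment))
t_stat = uplift / se

print(f"mean uplift = {uplift:.2f}, Welch t = {t_stat:.2f}")
```

Framing the result this way turns "marginal uplift" into a measurable claim rather than a judgment call.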
The Eval Paradox
To determine if an AI is dangerous, you must test it in dangerous scenarios. But conducting the test itself creates documentation of how to exploit the AI—a dual-use evaluation.
The Four Categories of Dual-Use AI
1. Bio-Risk AI
Models capable of predicting protein folding, drug synthesis, or pathogen design. E.g., AlphaFold, ESM-2.
2. Cyber-Offensive AI
Code generation models that can write exploit chains for zero-day vulnerabilities. E.g., CodeLlama, Copilot.
3. Persuasion AI
Models whose conversational fluency can be optimized for psychological manipulation, disinformation campaigns, and social engineering. E.g., general-purpose assistants such as GPT-4o.
4. Kinetic AI
Real-time decision-making for autonomous weapons. Target identification, kill-chain optimization. E.g., Anduril's Lattice OS.
Policy Recommendations
- Pre-Deployment Red Teaming: Mandate adversarial testing by independent security firms before public release.
- Model Cards with Risk Profiles: Require transparency on dual-use capabilities (similar to nutrition labels); see the sketch after this list.
- Tiered Open-Source Licenses: Allow research access to weights but require verified credentials for commercial deployment.
- International Compute Registry: Create a global ledger of large training runs (>10^24 FLOPs) to monitor proliferation.
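As one way of picturing the "Model Cards with Risk Profiles" recommendation, a risk profile could ship as structured data alongside the released weights. The schema below is hypothetical, loosely inspired by model cards; none of the field names or risk tiers are an established standard.

```python
# Hypothetical risk-profile model card serialized next to the released weights.
# Field names, risk tiers, and values are illustrative, not an existing standard.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class RiskProfile:
    bio_risk: str                      # e.g. "low", "medium", "high"
    cyber_offense: str
    persuasion: str
    kinetic: str
    red_teamed_by: list = field(default_factory=list)

@dataclass
class ModelCard:
    name: str
    training_flops: float
    license_tier: str                  # e.g. "research-only", "verified-commercial"
    risk_profile: RiskProfile

card = ModelCard(
    name="example-model-70b",
    training_flops=6.3e24,
    license_tier="research-only",
    risk_profile=RiskProfile(
        bio_risk="medium",
        cyber_offense="low",
        persuasion="medium",
        kinetic="low",
        red_teamed_by=["independent-security-firm-a"],
    ),
)

print(json.dumps(asdict(card), indent=2))
```

A registry entry for the compute-ledger recommendation could reuse the same pattern: model name, estimated training FLOPs, and the responsible lab, reported once a run crosses the 10^24 FLOP line.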