I Think the AI Safety Debate Has Become Useless — And I Say That as Someone Who Takes the Risk Seriously

The public AI safety debate in 2026 has two camps: people who think AI will kill us all, and people who think that’s ridiculous. Both camps are more interested in winning the argument than in the actual safety research that might matter. Here’s what’s gone wrong — and what a useful conversation would look like.

Full disclosure: I think AI safety is genuinely important. I think the risks of advanced AI systems behaving in ways their developers didn’t intend and can’t predict are real and worth taking seriously. I think the people doing the actual technical safety research — the mechanistic interpretability work, the alignment research, the evaluation frameworks — are doing necessary and underappreciated work.

I’m saying this upfront because what I’m about to argue might sound like I’m dismissing AI safety concerns. I’m not. I’m arguing that the public debate about AI safety has become almost entirely disconnected from the actual technical work, and that the disconnection is making both the discourse and the policy worse.

The public AI safety debate in 2026 is conducted primarily between two tribes who agree on almost nothing except their certainty about outcomes and their contempt for the other side.

Table of Contents

The Two Tribes and What’s Wrong With Both

The Doomers. In their strongest form, this position holds that sufficiently advanced AI poses a genuine existential threat to humanity — that a misaligned superintelligent system could, pursuing goals humans didn’t fully specify, take actions that are catastrophic for human survival or wellbeing. The survey data is real: roughly 40% of AI researchers indicated more than a 10% chance of catastrophic outcomes. Geoffrey Hinton, Yoshua Bengio, and Dario Amodei have all spoken publicly about serious safety concerns. This is not a fringe position among serious technical researchers.

What’s wrong with how the doomer position presents publicly: it has, in many iterations, collapsed the time horizon to produce urgency and has made claims about near-term capabilities that have repeatedly not materialised on the stated timelines. It has framed the debate in existential/extinction terms that feel distant and science-fictional to most people, while the actual safety concerns that matter right now — AI hallucinating in medical contexts, AI producing unreliable legal citations, AI-assisted surveillance — are concrete and present. And it has, in some quarters, become tribal: you’re either taking the existential risk seriously or you’re an irresponsible “accelerationist” who doesn’t understand the stakes. This binary is not intellectually serious.

The Accelerationists / Dismissers. In their strongest form, this position holds that the existential risk concerns are fundamentally confused, that current AI systems are pattern matchers not agents, that AGI timelines are wildly uncertain, and that excessive focus on speculative risks is distracting from the real and immediate benefits of AI deployment. There are serious researchers who hold versions of this view. Yann LeCun has been a consistent and coherent critic of certain catastrophist framings.

What’s wrong with how this position presents publicly: it too often functions as motivated reasoning in the service of commercial interests. The AI companies that most benefit from rapid, unimpeded deployment are also the ones most vocally dismissive of safety concerns. This doesn’t make the dismissal wrong — but it means the incentive structure should make you sceptical. More fundamentally, dismissing existential risk doesn’t actually engage with the near-term safety concerns that are already producing real harm. A doctor who confidently says “AI won’t kill all humans” hasn’t addressed why AI hallucinated in the clinical decision support tool that gave his patient the wrong drug interaction advice.

What the Actual Safety Research Is Doing

Here’s what gets lost in the public debate: serious AI safety research is not primarily about preventing the robot apocalypse. It’s about solving tractable technical problems that have immediate relevance.

Mechanistic interpretability — the field of trying to understand what computations neural networks are actually performing — is producing real results. Anthropic and other labs are making genuine progress on understanding how specific concepts are represented in model weights, how information flows through transformer architectures, and how you detect when a model is reasoning versus pattern-matching. This work matters not because it will prevent Skynet, but because it’s the foundation for building AI systems that are auditable, predictable, and correctable.

Evaluation frameworks — the field of building better tests for model capabilities and failure modes — matter because we currently deploy AI systems without having reliable methods for knowing when they’ll fail catastrophically versus competently. The Stanford 2026 AI Index documented the “jagged frontier” problem: the same model that wins gold medals at the International Mathematical Olympiad can only read an analog clock correctly 50.1% of the time. Understanding these inconsistency patterns systematically is safety-relevant work.

Alignment research — trying to ensure AI systems actually pursue the goals they’re designed to pursue rather than proxies for those goals — is technically hard and not fully solved. This matters practically for agentic AI systems that are now being deployed in corporate workflows. An agent that optimises for a proxy metric rather than the actual business goal, at scale, can cause significant damage without any existential drama.

None of this research is well-served by the doomer/accelerationist binary. The researchers doing it tend to be more uncertain, more nuanced, and more focused on specific technical problems than the public debate allows for.

The Actual Near-Term Safety Problems

Let me name the AI safety problems that exist right now, in 2026, that are causing real harm and receiving inadequate attention because the public debate is focused on existential scenarios.

Hallucination in high-stakes contexts. The Nebraska Supreme Court suspended an attorney after his brief contained 57 defective citations, including 20 AI hallucinations — fictitious cases, fabricated quotations, nonexistent statutes. US courts imposed at least $145,000 in sanctions against attorneys for AI citation errors in Q1 2026 alone. A hospital in the UK using OpenAI’s Whisper for clinical transcription found it occasionally hallucinated details in patient records. These are safety problems happening right now, with real consequences for real people.

Agentic AI with inadequate oversight. The 1H 2026 State of AI and API Security Report found that 48.9% of organisations deploying AI agents have no visibility into machine-to-machine traffic — no monitoring of what their agents are actually doing. Agents are “doing something funky with enterprise data,” according to a Veeam security executive at RSAC 2026. These are systems making consequential decisions without appropriate human oversight, at scale.

Bias in consequential automated decisions. AI systems used in hiring, lending, medical diagnosis, and criminal justice are producing discriminatory outcomes. The University of Washington found AI resume screening systems favouring white-associated names 85.1% of the time. This is happening now, to real people, in real job applications.

AI-generated misinformation at scale. AI-enabled fraud and synthetic identity creation are already significant financial and social problems, with fraud losses projected at $40 billion in the US by 2027 according to Deloitte.

These problems don’t require AGI to be dangerous. They require adequate governance, transparency, and accountability — the same things the Foundation Model Transparency Index shows declining.

What a Useful Safety Conversation Would Look Like

The conversation I’d like to see happen more in public:

Specific, falsifiable claims about specific risks at specific time horizons. Not “AI will cause human extinction” — but “current agentic AI systems deployed without oversight monitoring are likely to cause significant enterprise security incidents within the next 12 months.” The second claim is testable. It directs attention to specific interventions. It’s useful.

Technical safety research covered as seriously as capability improvements. When Anthropic publishes interpretability research, it should get the same coverage as when they release a new model. When OpenAI publishes evaluation methodology, it should get the same attention as when GPT-5.4 scores on benchmarks.

Acknowledgment of uncertainty calibrated to actual uncertainty. Some AI risks are very uncertain. Some are quite certain. Treating all uncertainty as equal — either because it serves the “this is an emergency” narrative or the “this is hype” narrative — produces bad analysis.

And, most importantly: the people bearing the current costs of inadequate AI governance — the workers whose jobs aren’t coming back, the patients misdiagnosed by hallucinating systems, the job applicants discriminated against by biased algorithms — should be as central to the safety conversation as the speculative risks that are further out.

Safety means protecting people who are actually here, from harms that are actually happening, right now. That’s the conversation worth having.

I Think the AI Safety Debate Has Become Useless — And I Say That as Someone Who Takes the Risk Seriously

The Two Tribes and What’s Wrong With Both

What the Actual Safety Research Is Doing

The Actual Near-Term Safety Problems

What a Useful Safety Conversation Would Look Like

Leave a Reply Cancel reply

Categories

Similar Posts

The Most Important AI Story Nobody Is Covering: The US-China Capability Gap Just Closed

What Nobody Says About AI and Creativity: The Flood of Mediocre Content Is Already Here

Stop Saying AI Will “Augment” Workers. Some Jobs Are Just Going Away, and We Should Say So.

The $300 Billion AI Funding Quarter Tells You Everything About Who AI Is Really Being Built For

I’ve Used AI to Write for Six Months. Here’s the Honest Truth About What It Cost Me.

The AI Industry Has a Trust Problem — And It Has Nobody to Blame But Itself

The Real Problem With AI Hype in 2026 Isn’t What You Think

Why AI Is Becoming Your Daily Partner — And What That Actually Feels Like

Is AI Replacing Jobs or Creating New Ones? Here’s What the Evidence Actually Says

Recent Posts

Recent Comments

About us

Related Posts

I Think the AI Safety Debate Has Become Useless — And I Say That as Someone Who Takes the Risk Seriously

The Two Tribes and What’s Wrong With Both

What the Actual Safety Research Is Doing

The Actual Near-Term Safety Problems

What a Useful Safety Conversation Would Look Like

Leave a Reply Cancel reply

Categories

Similar Posts

Recent Posts

Recent Comments

About us

Connect With Us

Related Posts

Don't Miss