anomalien.com
Nobody really knows how AI works, admits top lab CEO
The CEO of leading AI lab Anthropic, Dario Amodei, just admitted something shocking: No one truly understands how AI works. In a personal essay, he revealed plans to develop an “MRI on AI” within ten years to decode its inner workings—and prevent potential dangers.
AI systems today operate like “black boxes”: they produce results, but even their creators can’t fully explain why. This lack of understanding raises serious risks, because advanced AI could develop unexpected behaviors, like exploiting loopholes or acting deceptively.
“When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes the choices it does,” Amodei admitted.
While AI’s outputs seem logical, its decision-making process is a mystery—like a car that drives itself but can’t explain its turns.
This ignorance isn’t just a technical detail; it’s “essentially unprecedented in the history of technology,” he wrote. Unlike with aircraft and other conventionally engineered systems, whose designers can explain how each part works, the inner workings of AI systems still largely defy explanation.
Amodei co-founded Anthropic in 2021 after leaving OpenAI over safety concerns. His new company focuses on “steering” AI toward human benefit—and cracking open its “black box.”
Recently, Anthropic ran an experiment in which one team deliberately introduced a flaw into an AI model and other teams tried to track it down. Some succeeded with the help of interpretability tools, hinting at progress.
“Powerful AI will shape humanity’s destiny,” Amodei warned. “We deserve to understand our own creations before they radically transform our economy, our lives, and our future.”