The Promise and Peril of DeepSeek
spectator.org

There’s an old story that, sometime in the 1960s, NASA spent millions to create a pen that worked in space. American engineers worked for years on a high-tech marvel that could write in zero gravity. Facing the same problem, the Russians used a pencil. It’s a myth, unfortunately. There is a “Space Pen,” but it was developed without government funding; pencil shavings turn out to be unacceptable fire hazards. But the story is too perfect to die: a tidy critique of technocracy and groupthink that, alas, never actually happened.

In January, AI found its pencil.

Some technical background is necessary here. Any large language model, like ChatGPT or Grok, must be trained on human-written text. The AI assigns numeric values to different words; by combining these numbers, it predicts what words will come next. When its predictions are wrong — when they aren’t human-like — the AI adjusts its numbers and tries again. This process requires extraordinary amounts of time and computation. To accelerate it, developers use specialized hardware called Graphics Processing Units, or GPUs, that are optimized for these kinds of calculations. (They’re prized by cryptocurrency miners for the same reason.) Unfortunately, the best GPUs are ruinously expensive — and an AI lab might need thousands of them. That’s a high hurdle, and it’s one reason only a handful of companies have invested in top-tier AI models.

Why Is DeepSeek a Disruptor?

All of that might be about to change. Two weeks ago, the Chinese AI company DeepSeek released a new model named R1, which is roughly as powerful as the best existing systems. But DeepSeek claims it trained the model for only about $5 million — still a heavy sum, but an order of magnitude less than rivals like OpenAI have spent.

DeepSeek’s success seems to rest on several innovations. All computers store numbers as patterns of ones and zeros, or bits.
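A rough sketch of the idea, in Python, using only the standard struct module and an arbitrary example value: storing the same number in fewer bits costs less memory but loses accuracy.

```python
# Toy illustration of the precision-versus-size tradeoff. The struct
# module can store a float in 64, 32, or 16 bits; the 8-bit formats
# used in AI training are analogous but live in specialized ML
# libraries, not the standard library.
import struct

value = 3.141592653589793  # pi, held as a full 64-bit Python float

for fmt, bits in [("d", 64), ("f", 32), ("e", 16)]:
    packed = struct.pack(fmt, value)          # store with fewer bytes...
    restored = struct.unpack(fmt, packed)[0]  # ...then read it back
    print(f"{bits:>2} bits: {len(packed)} bytes, error {abs(restored - value):.1e}")
```

Each halving of the storage roughly halves the memory bill, while the round-trip error grows — the same bargain R1 strikes, just pushed further down to eight bits.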
Computer bits work like scientific significant digits: the more bits used to represent a number, the more accurately it can be stored. Most modern AIs use 16 or 32 bits per number; R1 sometimes uses only eight. That makes every calculation less precise, but also cheaper and faster — meaning DeepSeek can quickly build large models without top-shelf GPUs.

As well, R1 uses what’s called a mixture-of-experts (MoE) model. ChatGPT is built on a single model that tries to “know” everything: it can write a poem, suggest programming code, or diagnose medical symptoms. Companies without OpenAI’s resources have instead built more targeted AIs; an engineering firm might train its system to design bridges, without teaching it to write a sonnet. An MoE approach builds on this idea, training dozens of smaller models as “experts” in different areas. When a user asks R1 a question, it can hand the problem off to whichever experts match it most closely. The result might not be as flexible as ChatGPT, but the individual experts can be easier (and cheaper) to train than one all-knowing model.

It’s hard to know how to react to all this. It’s clearly bad news for companies like Nvidia, maker of those high-end GPUs, whose market value fell by more than $600 billion after R1’s release. More broadly, it marks the first real challenge to American dominance in AI, with some experts warning that the US must quickly expand funding for further work. On the other hand, it’s not obvious that a slight edge in AI offers a commanding strategic advantage — and surely R1’s success shows that “shovel more money at the problem” is not always a winning approach.

It’s also possible that R1’s budget is as phony as that Russian space pencil. For its part, OpenAI claims that DeepSeek effectively stole its training work — roughly, that the Chinese lab taught its AI by checking its answers against the American model. If true, R1 isn’t a leap forward; it’s a Temu ChatGPT. There’s a certain delicious irony here.
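To return to the mixture-of-experts idea for a moment: real MoE models route each token through a learned gating network, but the division of labor can be caricatured in a few lines of Python. Everything below (the expert names, the keyword lists, the routing rule) is invented purely for illustration.

```python
# Cartoon of mixture-of-experts routing: several small "experts,"
# plus a router that hands each question to the closest match.
# Real systems learn this routing per token; this toy version just
# counts keyword overlap for a whole question.
EXPERTS = {
    "code": {"python", "function", "bug", "compile"},
    "medicine": {"symptom", "diagnosis", "fever", "dose"},
    "poetry": {"sonnet", "rhyme", "stanza", "meter"},
}

def route(question: str) -> str:
    """Hand the question to whichever expert matches it most closely."""
    words = set(question.lower().split())
    scores = {name: len(words & vocab) for name, vocab in EXPERTS.items()}
    return max(scores, key=scores.get)

print(route("why does my python function not compile"))
```

Only the chosen expert does any work, which is the source of the savings: each specialist is small and cheap to train, and most of them sit idle on any given question.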
For years, generative AI companies have ruthlessly harvested online text and art, waving off claims of copyright violation. OpenAI’s success depends on this data — but the original authors and artists will see none of the profits. If the tech giant now feels that its own data has been unfairly harvested, well, that’s sauce for the goose. Or, as one wit on X put it: “i cant believe ChatGPT lost its job to AI.”

Maybe all of this is a reason not to subsidize AI heavily. Investment rests on the idea that the best model will give its inventor an advantage over rivals, economically or militarily. If true, early AI innovators might gain a lead that snowballs into larger and larger gaps. But if gaps in AI performance can be quickly closed — whether by fair means or foul — then there’s less incentive to get there first. The result might be slower, more contemplative development — and a world in which no one holds an eternal advantage.

READ MORE from Brian Dellinger:

Making Friends: AI and Companionship

An Eye on AI: Five New Things to Watch in October

The Internet Has a New Problem: Deepfaked Pornography

The post The Promise and Peril of DeepSeek appeared first on The American Spectator | USA News and Politics.