“Safety first” was the mantra that set Anthropic apart from its big AI competitors. The company’s original pledge went like this: if Anthropic, the maker of Claude, couldn’t guarantee that a new model would meet its stringent safety standards, it would stop training that model, even if its competitors forged ahead.

But at the very moment when Anthropic would seem to need its “safety first” pledge the most, namely during its standoff with the Pentagon, the company has revealed a revised policy that adds a critical loophole. As first reported by Time, version 3.0 of Anthropic’s Responsible Scaling Policy backtracks on the company’s earlier promise, allowing it to continue training potentially hazardous models that its rivals are actively working on. The new policy also includes mandates for greater transparency about AI safety, along with a vow to “delay” development of dangerous models if the company considers itself comfortably ahead of its competitors.

This is all happening at a crucial moment for Anthropic, which faces a Friday deadline to acquiesce to the Pentagon’s demand for wide-ranging access to its models for military use. During a meeting at the Pentagon with Anthropic CEO Dario Amodei on Tuesday, Defense Secretary Pete Hegseth reportedly threatened to use the Defense Production Act to force Anthropic to hand over its models, which the military wants to use for “any lawful purpose,” according to The New York Times. Anthropic is said to be holding fast, demanding a promise from Hegseth that the Pentagon not use its models for “autonomous weapons” or to spy on Americans.

But some view Anthropic’s revised Responsible Scaling Policy as an escape hatch from its current pickle, allowing the company to give in to the Pentagon’s demands while staying square with its own safety policies. It’s worth noting that the Elon Musk-controlled xAI and OpenAI have already reached agreements with the Pentagon.

For its part, Anthropic is pushing back on the idea that the revised RSP weakens its safety policies, arguing instead that the old “red lines” were outdated given the government’s hands-off stance on AI. By giving itself the latitude to continue training unstable models that others are actively developing, the company says, it can act as a steadying force rather than cede the AI industry’s lead to more reckless companies.

Maybe so, and one hopes it’s right. Still, it’s unsettling to watch the AI company that has always stood for safety rewrite its own rulebook.