AI Evaluators Struggle with Models That Know When They’re Being Tested

The Information

AI researchers are starting to make progress on a confounding problem: AI models are getting better at telling when they are being evaluated. That poses a risk for AI companies that use evaluations to gauge the capabilities and behaviors of their models before releasing them. If models act differently during testing than in real use, they could be released with undesirable tendencies. Such behavior could also undermine their creators’ ability to show off test scores to potential clients. Evaluations are important for “convincing customers that our products are better at their use case than other products,” said Silas Alberti, who works on evaluations at Cognition, the AI coding startup. And as models get smarter, they are gaining more eval awareness, as researchers call it. For example, in testing its non-public Mythos model, Anthropic found that Mythos mentioned that it was being tested more often than its predecessors, Claude Opus 4.6 and Sonnet 4.6, did.
