Gulf Insider
Authored by Owen Hughes via Live Science

Large language models (LLMs) are secretly teaching each other unwanted habits through seemingly benign training data, scientists say. The phenomenon, known as “subliminal learning,” occurs when a pretrained “teacher” artificial intelligence (AI) model is used to generate the training data for a smaller “student” model. In a study published April 15 in the journal Nature, scientists found that teacher models can pass learned traits on to students even when all data semantically related to those traits has been filtered out. The traits range from the innocuous, such as a love of owls, to the markedly darker, including mariticide and the elimination of humanity.

The researchers said their study highlights the inherent uncertainty around AI development and the pace at which it is growing. “Safety evaluations may therefore need to examine not just behavior, but the origins of models and training data and the processes used to create them,” the authors wrote in the study. The scientists said they aren’t sure how subliminal learning works, but it appears to be inherent to neural networks, the backbone of LLMs and chatbots like ChatGPT or Claude. It typically occurs when both teacher and student […]