Alibaba Cloud claims to slash Nvidia GPU use by 82% with new pooling system

Alibaba Group Holding has introduced a computing pooling system that it said cut the number of Nvidia graphics processing units (GPUs) needed to serve its artificial intelligence models by 82 per cent. The system, called Aegaeon, was beta-tested in Alibaba Cloud’s model marketplace for more than three months, where it reduced the number of Nvidia H20 GPUs required to serve dozens of models with up to 72 billion parameters from 1,192 to 213, according to a research paper presented this...
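
The 82 per cent figure can be checked against the two GPU counts quoted above. The short Python sketch below does that arithmetic; the variable names are illustrative and are not taken from the research paper.

```python
# GPU counts reported for the Aegaeon beta test (from the article).
gpus_before = 1192
gpus_after = 213

# Absolute and relative reduction in GPUs needed.
saved = gpus_before - gpus_after
reduction_pct = 100 * saved / gpus_before

print(f"GPUs saved: {saved}")
print(f"Reduction: {reduction_pct:.1f}%")
```

The result rounds to 82 per cent, which matches the headline claim.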