By Marion Webber

Cerebras Systems, the pioneer in accelerating generative AI, has announced a record-breaking performance for DeepSeek-R1-Distill-Llama-70B inference, achieving more than 1,500 tokens per second – 57 times faster than GPU-based solutions. This speed

The post Cerebras launches world’s fastest DeepSeek R1 Distill Llama 70B inference appeared first on IoT Now News – How to run an IoT enabled business.

Read more here: www.m2mnow.biz/feed/