Amazon.com, Inc. (NASDAQ:AMZN) announced Friday a collaboration between its cloud division, AWS, and AI hardware company Cerebras Systems.
The partnership aims to deliver what the two companies describe as the world's fastest AI inference for large language models (LLMs).
Unmatched Speed Through Disaggregation
The new solution integrates AWS Trainium chips with Cerebras CS-3 systems. Using a technique called “inference disaggregation,” the system splits each request into two phases: Trainium handles “prefill” (processing the input prompt), while the CS-3 handles “decode” (generating output tokens).
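The split described above can be illustrated with a toy sketch. This is a hypothetical illustration only, not the AWS/Cerebras API: one stand-in function plays the prefill role (processing the whole prompt in a single pass and producing state for the generator), and a second plays the decode role (producing output tokens one at a time from that state). The dummy next-token rule is purely for demonstration.

```python
# Hypothetical sketch of disaggregated inference. In the real system,
# prefill and decode would run on separate hardware (Trainium vs. CS-3);
# here they are just two functions sharing a state object.

def prefill(prompt_tokens):
    """Prefill phase stand-in: consume the entire prompt at once and
    return the state (analogous to a KV cache) handed to the decoder."""
    return {"cache": list(prompt_tokens)}

def decode(state, max_new_tokens):
    """Decode phase stand-in: generate tokens one step at a time,
    extending the shared state after each step."""
    out = []
    for _ in range(max_new_tokens):
        nxt = sum(state["cache"]) % 100  # dummy next-token rule, illustration only
        out.append(nxt)
        state["cache"].append(nxt)
    return out

# Prefill runs once over the input; decode then iterates per output token.
state = prefill([3, 1, 4])
print(decode(state, 3))  # → [8, 16, 32]
```

The point of the separation is that the two phases have different performance profiles: prefill is compute-heavy and parallel over the prompt, while decode is sequential and latency-sensitive, so each can be mapped to hardware suited to it.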
David Brown, Vice President at AWS, stated, “The result will be inference that’s an order of magnitude faster and higher performance than what’s available today.”
Exclusive Access via Amazon Bedrock
The technology will be deployed within AWS data centers. Customers will be able to access these speeds through Amazon Bedrock in the coming months.
AWS is the first cloud provider to offer Cerebras’ specialized hardware for disaggregated inference. Later this year, AWS will add support for Amazon Nova and other open-source models using this infrastructure.
“Partnering with AWS… will bring the fastest inference to a global customer base,” noted Cerebras CEO Andrew Feldman. Cerebras also supplies large-scale computing capacity to OpenAI.
Built on the AWS Nitro System, the setup ensures enterprise-grade security and isolation. Cerebras Systems competes with major AI chipmakers such as NVIDIA Corp (NASDAQ:NVDA) and Advanced Micro Devices (NASDAQ:AMD).
AMZN Price Action: Amazon.com shares were down 0.98% at $207.48 at the time of publication on Friday, according to Benzinga Pro data.