Nvidia launches Groq 3 LPU racks for faster, more efficient AI inference, shipping late 2026.
Nvidia has launched the Groq 3 Language Processing Unit (LPU) and associated LPX server racks, integrating Groq’s technology into its Vera Rubin platform to boost AI inference speed and efficiency.
The system, featuring 256 LPUs per rack, delivers up to 1,500 tokens per second with 35 times higher throughput per watt, targeting trillion-parameter models and agentic AI workloads.
Designed to complement Nvidia’s Rubin GPUs and Vera CPUs, the platform aims to reduce latency and power use while enabling higher revenue per million tokens.
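To put the throughput and revenue figures in context, here is a back-of-the-envelope sketch. The 1,500 tokens-per-second rate per rack comes from this article; the price per million tokens is a purely hypothetical placeholder, since the article does not state one.

```python
# Back-of-the-envelope economics for one Groq 3 LPX rack.
# TOKENS_PER_SEC is the per-rack figure cited in the article;
# PRICE_PER_M_TOKENS is a hypothetical assumption for illustration only.

TOKENS_PER_SEC = 1_500          # per rack, from the article
SECONDS_PER_DAY = 86_400
PRICE_PER_M_TOKENS = 2.00       # hypothetical USD per million tokens

tokens_per_day = TOKENS_PER_SEC * SECONDS_PER_DAY
revenue_per_day = tokens_per_day / 1_000_000 * PRICE_PER_M_TOKENS

print(f"{tokens_per_day:,} tokens/day")   # 129,600,000 tokens/day
print(f"${revenue_per_day:.2f}/day")      # $259.20/day
```

The point of the sketch is simply that at a fixed price per million tokens, sustained throughput translates linearly into revenue, which is why per-watt throughput gains matter to hyperscalers.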
The Groq 3 LPX racks are expected to ship in late 2026, with Nvidia also introducing the open-source Dynamo 1.0 software platform to streamline large-scale AI inference.
The move marks a strategic shift toward specialized inference hardware amid growing competition and rising demand from hyperscalers and AI service providers.