Neowin News Feed for: Inference

Google's new method makes LLMs faster and more powerful, and cheaper too

Paul Hill — Fri, 12 Sep 2025 14:12:01 +0000

Google Research has developed a new method that could make running large language models cheaper and faster.

Nvidia announces its most powerful GPU, the Blackwell Ultra, built for training agentic AI

Karthik Mudaliar — Mon, 19 May 2025 09:28:02 +0000

Nvidia has announced its latest GPU, the Blackwell Ultra, made to train the next generation of AI models.

Cerebras launches the world's fastest AI inference, 20X performance compared to NVIDIA

Pradeep Viswanathan — Tue, 27 Aug 2024 19:20:01 +0000

Cerebras Systems launched Cerebras Inference, the world's fastest AI inference solution. It's 20x faster than NVIDIA's solutions and offers 100x higher price-performance.

Google is making on-device machine learning easier on Android later this year

Usama Jawad — Fri, 09 Jul 2021 17:38:52 +0000

Google has announced the Android ML Platform. Coming this year, it will make on-device inference easier by offering a consistent API and deeper integration with the OS without too many dependencies.

Mipsology partners with OKI IDS to bring FPGA-accelerated ML applications to Japan

Ather Fawaz — Tue, 10 Nov 2020 05:00:01 +0000

Japanese software development firm OKI IDS will be combining its expertise with Mipsology's FPGA acceleration technology to bring high-quality machine learning applications to Japan.