
AI and ML processing power has become an important metric for modern hardware, whether CPUs, GPUs, or NPUs. Beyond raw compute, AI workloads also consume large amounts of memory depending on a model's parameter count and precision. At single precision (float32), for example, each parameter takes four bytes, so the weights alone occupy roughly four times the parameter count in bytes. Thus, even a 32GB RTX 5090 can be saturated just by inferencing an eight-billion-parameter model.
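That arithmetic can be sketched in a few lines. This is a rough, illustrative estimate of the memory needed for the weights alone (real inference also needs headroom for activations and the KV cache, which this ignores):

```python
# Rough VRAM estimate for holding model weights alone (illustrative;
# real inference also needs memory for activations and the KV cache).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory (GB) needed just to store the weights."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# An 8-billion-parameter model at single precision:
print(f"{weight_memory_gb(8e9, 'float32'):.0f} GB")  # 32 GB -- fills an RTX 5090
print(f"{weight_memory_gb(8e9, 'float16'):.0f} GB")  # 16 GB at half precision
```

This is also why quantized (int8/int4) models are popular for consumer GPUs: halving the precision halves the weight footprint.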
Graphics cards like the AMD RX 7900 XTX, Nvidia RTX 4090, and RTX 5090 rely on their 24-32 GB VRAM buffers. While those are typically desktop parts, there have also been attempts to outdo such specs in a mobile form factor. AMD, for example, released its Ryzen AI Max and Max+ APUs at CES earlier this year, and they can allocate up to 96GB of memory to the GPU. Thanks to that, AMD claims the Max+ 395 can easily beat Nvidia's RTX 4090 in such memory-demanding workloads.
Meanwhile, Phison has been working for a while on technology that helps hardware avoid running out of available memory buffer. Its aiDAPTIV+ suite enables dynamic caching by expanding the available HBM (high-bandwidth memory) or GDDR with NAND flash-based "aiDAPTIVCache," using Phison's own AI100 SSDs for the caching. Caching on NAND is economical, since NAND is typically far cheaper than HBM or even GDDR.
AMD tried something similar when it launched the Radeon Pro SSG at SIGGRAPH 2017, pairing the GPU with 2TB of on-board NVMe storage as an extended frame buffer.
Phison's aiDAPTIV+ tech is already available in Maingear's AI PRO desktop workstations, and today at GTC 2025 the companies are launching a new concept laptop. aiDAPTIV+ is also gaining some enhancements, including version 3.0 of the aiDAPTIVLink middleware, which handles data transfer between the SSD's NAND and the GPU.
According to Phison, aiDAPTIVLink 3.0 delivers faster Time to First Token (TTFT) recall responses and supports larger LLM (large language model) prompt contexts thanks to longer token lengths.
The chart below shows Phison's aiDAPTIV+ easily outperforming Maingear's quad Nvidia RTX 6000 Ada setup without aiDAPTIV+ once the model size exceeds 13 billion parameters. For reference, each RTX 6000 Ada packs 48 GB of GDDR6 memory.

Maingear has explained how dynamic caching works on its workstations using Phison's aiDAPTIV+:
PRO AI SHODAN dynamically slices your 70B training model, serving the current slices to the GPU for high-speed training while storing the rest of the model in DRAM and the specialized Phison AI100 SSDs. Each NVIDIA RTX 6000 Ada runs at full performance during training with minimal downtime.
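The slicing scheme Maingear describes can be thought of as a tiered cache: only the layers needed right now sit in fast GPU memory, while the rest wait in slower tiers. The sketch below is a simplified illustration of that idea, not Phison's actual implementation; the class and variable names are invented, and plain Python objects stand in for VRAM, DRAM, and SSD:

```python
# Illustrative sketch of the tiered-caching idea described above -- NOT
# Phison's actual implementation. Model layers ("slices") live in a slow
# tier (a dict standing in for DRAM/SSD) and are promoted into a small
# fast tier (standing in for VRAM) just before they are needed.
from collections import OrderedDict

class TieredSliceCache:
    def __init__(self, vram_capacity: int, cold_store: dict):
        self.vram_capacity = vram_capacity  # how many slices fit in "VRAM"
        self.cold_store = cold_store        # stand-in for the DRAM/SSD tier
        self.vram = OrderedDict()           # LRU cache of hot slices

    def fetch(self, slice_id: str):
        if slice_id in self.vram:           # hot: already resident in "VRAM"
            self.vram.move_to_end(slice_id)
            return self.vram[slice_id]
        data = self.cold_store[slice_id]    # cold: pull from the slower tier
        if len(self.vram) >= self.vram_capacity:
            self.vram.popitem(last=False)   # evict the least-recently-used slice
        self.vram[slice_id] = data
        return data

# A 4-layer "model" where only 2 layers fit in the fast tier at once:
cache = TieredSliceCache(2, {f"layer{i}": f"weights{i}" for i in range(4)})
for i in range(4):
    cache.fetch(f"layer{i}")                # layers stream through the fast tier
print(list(cache.vram))                     # ['layer2', 'layer3']
```

The point of the real product is that the eviction tier is cheap NAND rather than scarce GDDR or HBM, so total addressable model size is bounded by SSD capacity instead of VRAM.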
Phison describes Maingear's upcoming concept AI laptop as the "industry's first" such device, supporting inference on LLMs of up to 8 billion parameters. Those interested can register on Maingear's website. No pricing info was announced; the new technologies will be available starting April 2025.