Description

As AI agents run for hundreds of turns with ninety-five percent KV-cache hit rates, the bottleneck shifts from compute to storage I/O. DualPath, from Peking University, Tsinghua, and DeepSeek, exploits the idle storage NICs on decode engines to load the KV-cache via RDMA, achieving nearly twice the throughput on the same hardware. We break down the architecture, walk up the hardware ladder from Raspberry Pi clusters to DGX Spark rigs, and show that the minimum viable DualPath setup is eight Sparks with two switches, for about thirty thousand dollars.