Description

This June 2025 paper introduces Non-Penetrative Tensor Partitioning (NPTP), a method for speeding up collaborative inference of Deep Neural Networks (DNNs) on Internet of Things (IoT) devices. IoT devices have limited resources and strict latency requirements, so large images are divided and processed across multiple devices, and the data those devices must exchange becomes a communication bottleneck. Existing methods rely on penetrative partitioning, which forces substantial data sharing between devices. NPTP instead partitions tensors non-penetratively and uses a Multilevel Partitioning Algorithm (MPA) to reduce inter-device communication. In the paper's experiments, NPTP outperforms state-of-the-art collaborative inference algorithms such as CoEdge, with speedups most pronounced for larger DNN models and image sizes, while keeping device memory usage efficient. The paper also formulates the computational and communication overhead and details the algorithm design for optimal tensor partitioning.
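
To make the communication overhead concrete, here is a minimal Python sketch of why splitting an image across devices forces data sharing in a convolutional layer. It is a generic illustration of row-wise splitting for a stride-1, unpadded convolution, not the paper's NPTP or MPA formulation; the function names `input_row_ranges` and `overlapped_rows` are hypothetical.

```python
def input_row_ranges(height, kernel, n_devices):
    """Input rows each device needs when the OUTPUT rows of a stride-1,
    unpadded convolution are split as evenly as possible across devices.

    Returns a list of half-open (start, end) input row ranges.
    Illustrative assumption: a plain row-wise split, not the paper's scheme.
    """
    out_rows = height - kernel + 1
    if out_rows < n_devices:
        raise ValueError("more devices than output rows")
    base, extra = divmod(out_rows, n_devices)
    ranges = []
    start = 0
    for i in range(n_devices):
        count = base + (1 if i < extra else 0)  # output rows for device i
        # Producing `count` output rows needs `count + kernel - 1` input rows.
        ranges.append((start, start + count + kernel - 1))
        start += count
    return ranges


def overlapped_rows(ranges):
    """Total number of input rows needed by two neighbouring devices."""
    return sum(
        max(0, prev_end - next_start)
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:])
    )


if __name__ == "__main__":
    ranges = input_row_ranges(height=224, kernel=3, n_devices=4)
    print(ranges)
    print(overlapped_rows(ranges))
```

Under these assumptions, every boundary between two devices duplicates `kernel - 1` input rows, and that overlap recurs at each convolutional layer. This is the kind of inter-device data sharing the paper aims to reduce.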

Source:

https://arxiv.org/pdf/2501.04489