MobiSys'21 | nn-Meter: Towards Accurate Latency Prediction of Deep-Learning Model Inference on Diverse Edge Devices
Introduction
This paper won the Best Paper Award at MobiSys 2021. It proposes nn-Meter, a system that efficiently and accurately predicts the inference latency of DNN models on diverse edge devices. The key idea is to divide the whole model into kernels, the units of execution on a device, and then predict latency at the kernel level.
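To make the kernel-level idea concrete, here is a minimal sketch, not nn-Meter's actual code or API. It assumes a purely sequential model, a hypothetical set of fusion rules that decides which adjacent operators run as one kernel, and a hypothetical per-kernel latency table standing in for the learned kernel predictors. It also assumes the model latency is obtained by summing the per-kernel predictions. All names and numbers are placeholders for illustration, not measurements.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical fusion rules: an operator pair (a, b) in this set is assumed
# to be fused into a single kernel when b directly follows a.
FUSIBLE: Set[Tuple[str, str]] = {("conv", "bn"), ("bn", "relu")}

# Hypothetical per-kernel latencies in milliseconds (placeholder values that
# stand in for the output of per-kernel predictors).
KERNEL_LATENCY_MS: Dict[str, float] = {
    "conv-bn-relu": 2.0,
    "pool": 0.5,
    "fc": 1.0,
}


def split_into_kernels(ops: List[str], fusible: Set[Tuple[str, str]]) -> List[str]:
    """Greedily merge adjacent operators into kernels using the fusion rules."""
    kernels: List[List[str]] = []
    for op in ops:
        # Extend the current kernel if its last operator can fuse with this one.
        if kernels and (kernels[-1][-1], op) in fusible:
            kernels[-1].append(op)
        else:
            kernels.append([op])
    return ["-".join(k) for k in kernels]


def predict_model_latency(ops: List[str]) -> float:
    """Sum the predicted latency of every kernel in the model."""
    return sum(KERNEL_LATENCY_MS[k] for k in split_into_kernels(ops, FUSIBLE))


model = ["conv", "bn", "relu", "pool", "fc"]
kernels = split_into_kernels(model, FUSIBLE)
total_ms = predict_model_latency(model)
print(kernels, total_ms)
```

The point of the sketch is the division of labor: which operators form a kernel depends on the device's runtime, while the latency of each kernel is predicted independently and then aggregated.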
Background
As more and more deep neural networks emerge, a system for predicting inference performance must generalize well enough to adapt to new models.