Weile Luo

MobiSys'21 | nn-Meter: Towards Accurate Latency Prediction of Deep-Learning Model Inference on Diverse Edge Devices

Introduction

This paper won the Best Paper Award at MobiSys 2021. It proposes nn-Meter, a system that efficiently and accurately predicts the inference latency of DNN models on diverse edge devices. The key idea is to divide the whole model into kernels (the execution units on a device) and then predict latency at the kernel level.
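
The kernel-level idea can be illustrated with a minimal sketch. It assumes the model-level estimate is the sum of per-kernel estimates; the kernel names, configuration fields, and toy predictor formulas below are hypothetical placeholders, not nn-Meter's actual API or learned predictors.

```python
from typing import Callable, Dict, List, Tuple

# A kernel is described by a (kind, config) pair. Both are hypothetical.
Kernel = Tuple[str, Dict[str, int]]
Predictor = Callable[[Dict[str, int]], float]


def predict_model_latency(kernels: List[Kernel],
                          predictors: Dict[str, Predictor]) -> float:
    """Sum the predicted latency (ms) of every kernel in the model."""
    return sum(predictors[kind](config) for kind, config in kernels)


# Toy per-kernel predictors (made-up formulas, for illustration only).
predictors: Dict[str, Predictor] = {
    "conv-bn-relu": lambda c: 0.002 * c["cin"] * c["cout"] / 64,
    "fc": lambda c: 0.001 * c["cin"],
}

# A model that has already been split into two kernels.
kernels: List[Kernel] = [
    ("conv-bn-relu", {"cin": 32, "cout": 64}),
    ("fc", {"cin": 1000}),
]

total_ms = predict_model_latency(kernels, predictors)
print(f"predicted latency: {total_ms:.3f} ms")
```

In a real system each predictor would be trained on measurements from the target device, so supporting a new device means retraining the per-kernel predictors rather than changing this summation.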

Background

As more and more deep neural networks emerge, a system that predicts inference performance must generalize well enough to adapt to new models.

ATC'21 | Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training

CODE & VIDEO

Abstract

Habitat is a Python library that predicts DNN training performance on other GPUs with the help of a GPU that the user already has.

Current Approaches and Their Limitations

The approaches to DL performance analysis today include:

  1. Directly measuring the training job on the GPU
  2. Using benchmarks

However, these approaches have limitations:

  1. You need to have the GPU in the first place
  2. They are not as helpful for a custom DNN on a specific GPU

Another approach is to use heuristics, which assume that a DNN training workload exhausts all the computational resources on a GPU. This is not true in general.
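
To make the heuristic approach concrete, here is a minimal sketch of one possible form: scaling a measured iteration time by the ratio of peak throughput between two GPUs. The function name and the throughput figures are hypothetical, and the source does not say which heuristic it has in mind. The sketch is only accurate if the workload saturates the GPU, which is the assumption criticized above.

```python
def scale_by_peak_flops(measured_ms: float,
                        src_peak_tflops: float,
                        dst_peak_tflops: float) -> float:
    """Estimate run time on a target GPU from a measurement on a source GPU.

    Assumes the workload is purely compute-bound and fully uses the GPU,
    so run time is inversely proportional to peak throughput.
    """
    if src_peak_tflops <= 0 or dst_peak_tflops <= 0:
        raise ValueError("peak throughput must be positive")
    return measured_ms * src_peak_tflops / dst_peak_tflops


# Hypothetical numbers: 100 ms measured on a 10 TFLOPS GPU,
# estimated for a 20 TFLOPS GPU.
estimate_ms = scale_by_peak_flops(100.0, 10.0, 20.0)
print(f"estimated iteration time: {estimate_ms} ms")
```

Operations that are memory-bound or too small to fill the GPU will not speed up by this ratio, which is why such an estimate can be far off.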

Observations

Hence, Habitat is based on the following observations: