This paper shows that a reasonably accurate prediction of co-location fairness can be obtained from two inputs: the execution statistics of each workload running standalone, and the fairness observed when that workload is co-run with three representative microbenchmarks.
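A minimal sketch of this kind of prediction, assuming a simple linear-regression model over the two groups of inputs; the feature set, model choice, and synthetic data below are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch: predict co-location fairness from standalone execution
# statistics plus fairness measured against three reference microbenchmarks.
# Features, model, and data are assumptions for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Per-workload features: [standalone throughput, standalone memory bandwidth,
#                         fairness vs. microbenchmark A, B, C]
X = rng.uniform(0.0, 1.0, size=(200, 5))
# Target: fairness (normalized progress) when co-located with a real workload.
y = 0.3 * X[:, 2] + 0.3 * X[:, 3] + 0.2 * X[:, 4] - 0.1 * X[:, 0] + 0.5

model = LinearRegression().fit(X[:150], y[:150])
pred = model.predict(X[150:])
print("mean absolute error:", np.mean(np.abs(pred - y[150:])))
```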
This paper introduces AntMan, a system that accommodates the fluctuating resource demands of deep learning training jobs.
This paper introduces Gandiva, a new cluster scheduling framework that exploits domain-specific knowledge to improve the latency and efficiency of training deep learning models on a GPU cluster.
This paper designs multiple experiments to explore the rules governing GPU kernel-level scheduling.
The authors propose BRP-NAS, an efficient hardware-aware NAS enabled by an accurate performance (latency and accuracy) predictor based on a graph convolutional network (GCN). BRP-NAS uses binary relations between models and an iterative data selection strategy to improve sample selection. In addition, they release LatBench, a latency dataset of NAS-Bench-201 models measured on a broad range of devices.
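A hedged sketch of the binary-relation idea: rather than regressing absolute accuracy or latency, the predictor scores two candidate architectures and predicts which one is better. The graph encoder below is a single hand-rolled graph-convolution step over an adjacency matrix and one-hot operation features; the layer sizes and toy inputs are assumptions, not BRP-NAS's actual configuration.

```python
import torch
import torch.nn as nn

class GCNEncoder(nn.Module):
    def __init__(self, num_ops, hidden=32):
        super().__init__()
        self.lin = nn.Linear(num_ops, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, adj, feats):
        # adj: (N, N) normalized adjacency; feats: (N, num_ops) one-hot ops.
        h = torch.relu(adj @ self.lin(feats))   # one graph-convolution step
        g = h.mean(dim=0)                        # mean-pool node embeddings
        return self.out(g)                       # scalar score for the graph

class BinaryRelationPredictor(nn.Module):
    def __init__(self, num_ops):
        super().__init__()
        self.encoder = GCNEncoder(num_ops)

    def forward(self, graph_a, graph_b):
        # Probability that architecture A outperforms architecture B.
        score_a = self.encoder(*graph_a)
        score_b = self.encoder(*graph_b)
        return torch.sigmoid(score_a - score_b)

# Toy usage with random 4-node cells and 5 operation types (made-up inputs).
adj = torch.eye(4)
feats_a = torch.nn.functional.one_hot(torch.tensor([0, 1, 2, 3]), 5).float()
feats_b = torch.nn.functional.one_hot(torch.tensor([1, 1, 4, 3]), 5).float()
model = BinaryRelationPredictor(num_ops=5)
print(model((adj, feats_a), (adj, feats_b)))  # P(A better than B)
```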
This paper introduces InferLine, a system for provisioning and managing each stage of prediction pipelines to meet end-to-end tail latency constraints while minimizing cost.
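A hedged sketch of the provisioning idea: give each pipeline stage just enough replicas that the estimated end-to-end latency meets the SLO, scaling the current bottleneck stage first so cost stays low. The latency and cost models here are placeholder assumptions for illustration, not InferLine's actual profiling-based planner.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    service_time_ms: float   # profiled per-query service time (assumed)
    cost_per_replica: float
    replicas: int = 1

def stage_latency(stage, arrival_rate_qps):
    # Crude placeholder: latency blows up as utilization approaches 1.
    capacity = stage.replicas * 1000.0 / stage.service_time_ms
    util = min(arrival_rate_qps / capacity, 0.99)
    return stage.service_time_ms / (1.0 - util)

def provision(stages, arrival_rate_qps, slo_ms):
    # Greedily add replicas to the stage contributing the most latency
    # until the end-to-end estimate fits within the latency constraint.
    while sum(stage_latency(s, arrival_rate_qps) for s in stages) > slo_ms:
        bottleneck = max(stages, key=lambda s: stage_latency(s, arrival_rate_qps))
        bottleneck.replicas += 1
    return stages

pipeline = [Stage("preprocess", 2.0, 0.1), Stage("model", 20.0, 1.0),
            Stage("postprocess", 1.0, 0.1)]
provision(pipeline, arrival_rate_qps=100.0, slo_ms=80.0)
for s in pipeline:
    print(s.name, s.replicas)
print("total cost:", sum(s.replicas * s.cost_per_replica for s in pipeline))
```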