This paper configured multiple experiments to explore the rules of the GPU kernel-level scheduling.
The authors proposed BRP-NAS, an efficient hardware-aware NAS enabled by an accurate performance (latency and accuracy) predictor based on graph convolutional network (GCN). The BRP-NAS uses the binary relations of models and an iterative data selection strategy to improve the sample selection. In addition, they also released LatBench - a latency dataset of NAS-Bench-201 models running on abroad range of devices.
Only Chinese version is available
CODE & VIDEO
Abstract Habitat is a Python library that can predict performance on GPUs with the help of a GPU that the user already has.
Current Apporaches and their Limitation The approaches for the DL performance analysis today include
Directly measuring the training job on the GPU Using the benchmark. However, there are the limitations of these approaches: You need to have the GPU in the first place They are not as helpful in a custom DNN on a specific GPU.