This repository grew out of my curiosity about whether ShelfNet is an efficient CNN architecture for computer vision tasks other than semantic segmentation, specifically human pose estimation. The answer is a clear yes: ShelfNet reaches 74.6 mAP at 127 FPS on the MS COCO Keypoints dataset, a 3.5x speedup over HRNet at similar accuracy.
This repository includes:
- Source code of ShelfNet modified from the authors' repository
- Code to prepare the MS COCO keypoints dataset
- Training and evaluation code for MS COCO keypoints modified from the HRNet authors' repository
- Pre-trained weights for ShelfNet50
If you use this code in your projects, please consider citing this repository (BibTeX below).