Neural Network Compression Framework (NNCF)
This repository contains a PyTorch*-based framework and samples for neural network compression.
The framework is organized as a Python* package that can be built and used in a standalone mode. The framework architecture is unified to make it easy to add different compression methods.
The samples demonstrate the usage of compression algorithms for three different use cases on public models and datasets: Image Classification, Object Detection and Semantic Segmentation. Compression results achievable with the NNCF-powered samples are reported here.
Support for various compression algorithms, applied during model fine-tuning to achieve the best compression parameters and accuracy:
Automatic, configurable model graph transformation to obtain the compressed model. The source model is wrapped in a custom class and additional compression-specific layers are inserted into the graph.
Common interface for compression methods
GPU-accelerated layers for faster compressed model fine-tuning
Distributed training support
Configuration file examples for each supported compression algorithm.
Export of compressed models to ONNX* checkpoints ready for use with the OpenVINO™ toolkit.
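
The graph transformation feature listed above (wrapping the source model and inserting compression-specific layers) can be illustrated with a minimal plain-PyTorch sketch. This is not NNCF's implementation or API: `FakeQuantize`, `CompressedModelWrapper`, and the per-tensor min/max quantization scheme are hypothetical names and choices made purely for illustration.

```python
import torch
import torch.nn as nn


class FakeQuantize(nn.Module):
    """Hypothetical compression-specific layer that simulates uniform quantization."""

    def __init__(self, num_bits: int = 8):
        super().__init__()
        self.levels = 2 ** num_bits - 1  # number of quantization steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-tensor range taken from the current input (an assumption of this sketch).
        lo, hi = x.min().detach(), x.max().detach()
        scale = (hi - lo).clamp(min=1e-8) / self.levels
        q = torch.round((x - lo) / scale) * scale + lo
        # Straight-through estimator: the forward pass uses q,
        # the backward pass treats quantization as the identity.
        return x + (q - x).detach()


class CompressedModelWrapper(nn.Module):
    """Hypothetical wrapper: holds the source model and inserts FakeQuantize layers."""

    def __init__(self, model: nn.Module, num_bits: int = 8):
        super().__init__()
        self.model = model
        self._insert_quantizers(self.model, num_bits)

    @staticmethod
    def _insert_quantizers(module: nn.Module, num_bits: int) -> None:
        # Snapshot the children so replacing them does not disturb iteration.
        for name, child in list(module.named_children()):
            if isinstance(child, (nn.Conv2d, nn.Linear)):
                # Quantize the input of every convolution and linear layer.
                setattr(module, name, nn.Sequential(FakeQuantize(num_bits), child))
            else:
                CompressedModelWrapper._insert_quantizers(child, num_bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.model(x)


if __name__ == "__main__":
    source_model = nn.Sequential(
        nn.Conv2d(3, 4, kernel_size=3),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(4 * 6 * 6, 10),
    )
    compressed_model = CompressedModelWrapper(source_model, num_bits=8)
    print(compressed_model)  # shows the inserted FakeQuantize layers
    print(compressed_model(torch.randn(2, 3, 8, 8)).shape)
```

Because the wrapper is itself an `nn.Module`, it can be fine-tuned with an ordinary PyTorch training loop. The straight-through estimator keeps gradients flowing through the rounding step, which is what makes fine-tuning a quantized model possible in this sketch.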