
OpenVINO


The OpenVINO (Open Visual Inference and Neural network Optimization) Toolkit is an open-source project, originally developed by Intel, for optimizing deep learning models and deploying them on Intel hardware through its inference engine, the OpenVINO Runtime.

Model Optimizer

OpenVINO's Model Optimizer converts deep learning models developed in various frameworks (e.g., TensorFlow, PyTorch, Caffe) into an Intermediate Representation (IR) that can be executed by the OpenVINO Runtime. The produced IR model is optimized for the selected target device, improving inference speed while preserving the model's accuracy. The IR model can be further optimized with OpenVINO's Post-training Optimization Tool. For more information about the Model Optimizer, refer to the official documentation.

Figure 1: Model Optimizer Workflow (source)
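As an illustration, the sketch below converts an ONNX model to IR and runs a sanity-check inference with the OpenVINO Runtime. It assumes OpenVINO 2022.1+ (the `openvino-dev` package); the `model.onnx` path and the input shape are placeholders, not values from this page.

```python
# Minimal sketch: convert an ONNX model to OpenVINO IR and run inference.
# Assumes OpenVINO >= 2022.1 ("pip install openvino-dev"); the model path
# and input shape below are placeholders.
import numpy as np
from openvino.runtime import Core, serialize
from openvino.tools.mo import convert_model

# Convert the framework model to an in-memory OpenVINO model (IR form).
ov_model = convert_model("model.onnx")

# Serialize the IR to disk as the usual .xml (topology) / .bin (weights) pair.
serialize(ov_model, "model.xml")

# Compile the IR for a target device and run one inference as a sanity check.
core = Core()
compiled = core.compile_model(ov_model, device_name="CPU")
infer_request = compiled.create_infer_request()
dummy_input = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder shape
result = infer_request.infer([dummy_input])
```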

Post-training Optimization Tool

OpenVINO's Post-training Optimization Tool (POT) provides two quantization methods for optimizing a model's performance. Since quantization is performed post-training, it does not require a training dataset, only a representative calibration set (e.g., 300 samples). Furthermore, the model must first be converted to OpenVINO's IR format. Once these requirements are met, a floating-point model (FP32 or FP16) can be quantized to 8-bit integer precision using either the Default Quantization algorithm or the Accuracy-aware Quantization algorithm. The former is recommended as a first step, since it yields satisfactory performance in the majority of cases and requires only an unannotated calibration dataset; the latter keeps the accuracy drop within a specified range and therefore requires an annotated calibration dataset. Figure 2 shows the optimization workflow.

Figure 2: Post-training Model Optimization Workflow (source)
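The sketch below illustrates Default Quantization with POT's Python API (`openvino.tools.pot`), assuming OpenVINO 2022.1+. The IR paths, the `CalibrationLoader` helper, and the random arrays are placeholders standing in for a real calibration set of roughly 300 samples.

```python
# Minimal sketch of Default Quantization with the Post-training Optimization
# Tool's Python API. IR paths, the CalibrationLoader helper, and the random
# calibration samples are placeholders for illustration.
import numpy as np
from openvino.tools.pot import (DataLoader, IEEngine, create_pipeline,
                                load_model, save_model)

class CalibrationLoader(DataLoader):
    """Feeds unannotated calibration samples to the quantization pipeline."""
    def __init__(self, samples):
        self._samples = samples

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        # Default Quantization needs no labels, so the annotation is None.
        return self._samples[index], None

# FP32/FP16 IR produced by the Model Optimizer (paths are placeholders).
model_config = {"model_name": "model", "model": "model.xml", "weights": "model.bin"}
engine_config = {"device": "CPU"}
algorithms = [{
    "name": "DefaultQuantization",
    "params": {"target_device": "ANY", "preset": "performance",
               "stat_subset_size": 300},  # ~300 calibration samples
}]

# Placeholder calibration data; use real preprocessed inputs in practice.
samples = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(300)]

model = load_model(model_config)
engine = IEEngine(config=engine_config, data_loader=CalibrationLoader(samples))
pipeline = create_pipeline(algorithms, engine)
quantized_model = pipeline.run(model)
save_model(quantized_model, save_path="quantized_ir")
```

Since Default Quantization only collects activation statistics over the calibration set, the loader above returns no annotations; switching to Accuracy-aware Quantization would additionally require labels and a metric.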