Research Triton server as a potential integration to support multiple model backends/frameworks #15

yondonfu opened this issue Sep 20, 2021 · 2 comments


@yondonfu

At the moment, the livepeer_dnn filter only supports a TensorFlow backend, which means that only TensorFlow models can be used. There are downsides to supporting only TensorFlow; for example, TensorFlow itself consumes a lot of GPU VRAM at runtime. We can address these downsides by supporting other deep learning backends/frameworks. Rather than implementing a standalone integration for each desired backend, we could research whether something like Triton server could be used to support a variety of different backends.

The goal of this research would be to determine the following:

  • The pros/cons of using Triton server
  • How Triton server could be integrated into ffmpeg
@cyberj0g

Findings so far:

  1. No maintained local build script (CMake, Makefile), which we would need if linking Triton directly into FFmpeg using the C API. Docker is the recommended way to build and deploy.
  2. Viable FFmpeg integration options are:
    a. link the library and use the C API (see the sketch after this list)
    b. the gRPC protocol. There's no official C client, but Protobuf bindings can be generated with protobuf-c.
  3. The Triton Inference Server Docker image is 13.3 GB. Its dependencies are nvidia-docker and an NVIDIA driver compatible with the container's CUDA version.
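
To make option 2a more concrete, here is a minimal sketch of bootstrapping Triton in-process from C via the tritonserver.h C API. The `/models` repository path is a placeholder, error handling is reduced to a helper, and the actual inference submission (`TRITONSERVER_InferenceRequestNew` plus response callbacks) is omitted, so treat this as an illustration of the linking option rather than a working FFmpeg integration.

```c
/*
 * Option 2a sketch: libtritonserver linked into an FFmpeg filter,
 * starting an in-process server from a local model repository.
 * "/models" is a placeholder path.
 */
#include <stdbool.h>
#include <stdio.h>
#include "tritonserver.h"

static int check(TRITONSERVER_Error* err, const char* what)
{
    if (err != NULL) {
        fprintf(stderr, "%s: %s\n", what, TRITONSERVER_ErrorMessage(err));
        TRITONSERVER_ErrorDelete(err);
        return -1;
    }
    return 0;
}

int main(void)
{
    TRITONSERVER_ServerOptions* options = NULL;
    TRITONSERVER_Server* server = NULL;

    /* Point the server at a repository that can hold TensorFlow, ONNX,
     * TensorRT, ... models side by side. */
    if (check(TRITONSERVER_ServerOptionsNew(&options), "options new") ||
        check(TRITONSERVER_ServerOptionsSetModelRepositoryPath(options, "/models"),
              "set model repository"))
        return 1;

    /* Create the in-process server instance; no HTTP/gRPC frontend is involved. */
    if (check(TRITONSERVER_ServerNew(&server, options), "server new"))
        return 1;
    TRITONSERVER_ServerOptionsDelete(options);

    /* Readiness check before the filter starts submitting frames. */
    bool live = false;
    if (check(TRITONSERVER_ServerIsLive(server, &live), "is live"))
        return 1;
    printf("Triton in-process server live: %s\n", live ? "yes" : "no");

    /* Actual inference would go through TRITONSERVER_InferenceRequestNew()
     * and asynchronous response callbacks. */
    TRITONSERVER_ServerDelete(server);
    return 0;
}
```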

@cyberj0g

After further exploration, a viable option for accessing the model from FFmpeg C code seems to be the HTTP REST API combined with memory sharing. Triton server supports RAM/VRAM sharing, which is also managed through the HTTP REST API. Since no tensor data is transferred over the socket, inference requests/responses through HTTP would impose minimal overhead, and we would still benefit from dynamic batching and multi-backend support.
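
A rough sketch of that flow is below, assuming Triton's KServe v2 HTTP endpoints with the system shared memory extension. The model name (`sr_model`), input tensor name and shape, and region names are placeholders, error handling is omitted, and for brevity the output tensor comes back over HTTP rather than through a second shared region.

```c
/*
 * Sketch of the HTTP + system shared memory flow (not production code).
 * Build: gcc shm_infer.c -lcurl -lrt
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
#include <curl/curl.h>

#define SHM_KEY   "/triton_input_shm"
#define SHM_BYTES (3 * 224 * 224 * sizeof(float))   /* one RGB frame, placeholder size */

static void post_json(const char* url, const char* body)
{
    CURL* curl = curl_easy_init();
    struct curl_slist* hdrs = curl_slist_append(NULL, "Content-Type: application/json");
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);
    curl_easy_perform(curl);
    curl_slist_free_all(hdrs);
    curl_easy_cleanup(curl);
}

int main(void)
{
    curl_global_init(CURL_GLOBAL_DEFAULT);

    /* 1. Create a POSIX shared memory region and map it; the FFmpeg filter
     *    would write the decoded frame tensor here instead of sending it
     *    over the socket. */
    int fd = shm_open(SHM_KEY, O_CREAT | O_RDWR, 0666);
    ftruncate(fd, SHM_BYTES);
    float* tensor = mmap(NULL, SHM_BYTES, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    memset(tensor, 0, SHM_BYTES);   /* frame data would be copied in here */

    /* 2. Register the region with Triton over HTTP (one-time setup). */
    char body[512];
    snprintf(body, sizeof(body),
             "{\"key\":\"%s\",\"offset\":0,\"byte_size\":%zu}",
             SHM_KEY, SHM_BYTES);
    post_json("http://localhost:8000/v2/systemsharedmemory/region/input_region/register",
              body);

    /* 3. Per-frame inference request: only JSON metadata goes over HTTP,
     *    the tensor itself is read by Triton directly from the region. */
    snprintf(body, sizeof(body),
             "{\"inputs\":[{\"name\":\"input\",\"shape\":[1,3,224,224],"
             "\"datatype\":\"FP32\",\"parameters\":{"
             "\"shared_memory_region\":\"input_region\","
             "\"shared_memory_byte_size\":%zu}}]}",
             SHM_BYTES);
    post_json("http://localhost:8000/v2/models/sr_model/infer", body);

    munmap(tensor, SHM_BYTES);
    close(fd);
    shm_unlink(SHM_KEY);
    curl_global_cleanup();
    return 0;
}
```

The per-frame request carries only metadata; the frame stays in the shared region, which is what keeps the HTTP round trip cheap relative to the inference itself.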
