yolotriton

Go (Golang) gRPC client for YOLO-NAS and YOLOv8 inference using the Triton Inference Server.

Installation

Use go get to install this package:

go get github.com/dev6699/yolotriton
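A blank import is a quick compile check that the module resolved (a sanity check only, not part of the package's API):

package main

import (
	"fmt"

	// Blank import: compiling this file verifies that go get fetched the module.
	_ "github.com/dev6699/yolotriton"
)

func main() {
	fmt.Println("yolotriton module resolved")
}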

Get a YOLO-NAS / YOLOv8 TensorRT model

Replace yolov8m.pt below with your desired model:

pip install ultralytics
yolo export model=yolov8m.pt format=onnx
trtexec --onnx=yolov8m.onnx --saveEngine=model_repository/yolov8/1/model.plan

References:

  1. https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html
  2. https://docs.ultralytics.com/modes/export/
  3. https://github.com/NVIDIA/TensorRT/tree/master/samples/trtexec

Export of quantized YOLO-NAS INT8 model

  1. Export a quantized ONNX model
from super_gradients.conversion.conversion_enums import ExportQuantizationMode
from super_gradients.conversion import DetectionOutputFormatMode
from super_gradients.common.object_names import Models
from super_gradients.training import models

# From custom model
# model = models.get(Models.YOLO_NAS_S, num_classes=1, checkpoint_path='ckpt_best.pth')
model = models.get(Models.YOLO_NAS_S, pretrained_weights="coco")
export_result = model.export(
    "yolo_nas_s_int8.onnx",
    output_predictions_format=DetectionOutputFormatMode.BATCH_FORMAT,
    quantization_mode=ExportQuantizationMode.INT8 # or ExportQuantizationMode.FP16
)

print(export_result)

  2. Convert to TensorRT with the INT8 builder
trtexec --onnx=yolo_nas_s_int8.onnx --saveEngine=yolo_nas_s_int8.plan --int8

References:

  1. https://github.com/Deci-AI/super-gradients/blob/b5eb12ccd021ca77e947bf2dde7e84a75489e7ed/documentation/source/models_export.md

Start the Triton Inference Server

docker compose up tritonserver
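Once the container is up, Triton listens for gRPC on port 8001 by default, matching the -u flag below. A minimal reachability check in Go (a sketch only; "localhost:8001" assumes the default compose port mapping):

package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Dial Triton's default gRPC port. "localhost:8001" is an assumption;
	// adjust it to whatever your compose file publishes.
	conn, err := net.DialTimeout("tcp", "localhost:8001", 3*time.Second)
	if err != nil {
		fmt.Println("Triton gRPC port not reachable:", err)
		return
	}
	defer conn.Close()
	fmt.Println("Triton gRPC port reachable")
}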

References:

  1. https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_repository.html

Sample usage

Check cmd/main.go for more details.
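In outline, the client flow is: connect to the Triton gRPC endpoint, select a model type, run inference on an image, and read back the class, confidence, and bounding box of each detection. The sketch below shows only the standard-library image loading; the commented client calls are hypothetical placeholders, not the package's real identifiers — see cmd/main.go for the actual API:

package main

import (
	"fmt"
	"image"
	_ "image/jpeg" // register the JPEG decoder for image.Decode
	"os"
)

func main() {
	// Load the input image with the standard library.
	f, err := os.Open("images/1.jpg")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	img, _, err := image.Decode(f)
	if err != nil {
		panic(err)
	}

	// Hypothetical client calls -- placeholders, not the package's real API:
	//   client, err := yolotriton.New("tritonserver:8001", "yolonas", ...)
	//   detections, err := client.Infer(img)
	//   for _, d := range detections { fmt.Println(d.Class, d.Confidence, d.Box) }
	fmt.Println("decoded image bounds:", img.Bounds())
}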

  • For help
go run cmd/main.go --help
  -b    Run benchmark.
  -i string
        Inference Image. (default "images/1.jpg")
  -m string
        Name of model being served (Required) (default "yolonas")
  -n int
        Number of benchmark run. (default 1)
  -o float
        Intersection over Union (IoU) (default 0.7)
  -p float
        Minimum probability (default 0.5)
  -t string
        Type of model. Available options: [yolonas, yolonasint8, yolov8] (default "yolonas")
  -u string
        Inference Server URL. (default "tritonserver:8001")
  -x string
        Version of model. Default: Latest Version
  • Sample usage with yolonasint8 model
go run cmd/main.go -m yolonasint8 -t yolonasint8 -i images/1.jpg
1. processing time: 123.027909ms
prediction:  0
class:  dog
confidence: 0.96
bboxes: [ 669 130 1061 563 ]
---------------------
prediction:  1
class:  person
confidence: 0.96
bboxes: [ 440 30 760 541 ]
---------------------
prediction:  2
class:  dog
confidence: 0.93
bboxes: [ 168 83 495 592 ]
---------------------
  • Sample usage to get benchmark results
go run cmd/main.go -m yolonasint8 -t yolonasint8 -i images/1.jpg  -b -n 10
1. processing time: 64.253978ms
2. processing time: 51.812457ms
3. processing time: 80.037468ms
4. processing time: 96.73738ms
5. processing time: 87.22928ms
6. processing time: 95.28627ms
7. processing time: 61.609115ms
8. processing time: 87.625844ms
9. processing time: 70.356198ms
10. processing time: 74.130759ms
Avg processing time: 76.93539ms
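The -o flag above sets an IoU value (default 0.7), conventionally the non-maximum-suppression threshold: a detection is dropped when its overlap with a higher-confidence detection exceeds it. A sketch of the IoU computation, assuming the [ x1 y1 x2 y2 ] pixel layout seen in the sample output:

package main

import (
	"fmt"
	"math"
)

// Box is an axis-aligned box in pixel coordinates, matching the
// "bboxes: [ x1 y1 x2 y2 ]" lines in the sample output above.
type Box struct{ X1, Y1, X2, Y2 float64 }

// IoU returns the intersection-over-union of two boxes; values range
// from 0 (disjoint) to 1 (identical).
func IoU(a, b Box) float64 {
	iw := math.Max(0, math.Min(a.X2, b.X2)-math.Max(a.X1, b.X1))
	ih := math.Max(0, math.Min(a.Y2, b.Y2)-math.Max(a.Y1, b.Y1))
	inter := iw * ih
	union := (a.X2-a.X1)*(a.Y2-a.Y1) + (b.X2-b.X1)*(b.Y2-b.Y1) - inter
	if union <= 0 {
		return 0
	}
	return inter / union
}

func main() {
	// The two dog boxes from the sample output do not overlap horizontally,
	// so IoU is 0 and neither would suppress the other.
	a := Box{669, 130, 1061, 563}
	b := Box{168, 83, 495, 592}
	fmt.Printf("IoU = %.3f\n", IoU(a, b))
}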

Results

Input / Output example images (see the repository).
