[Add]: TensorRT Support #14
Conversation
Hi @gpastal24, thanks for the great contribution!!
The code looks good to me in general (I still have to try running it, though).
There are a few minor things to fix before merging:
- there is a lot of commented-out code that seems to be old/debug code; it should be removed
- some docstrings of the main functions are missing
- `__pycache__` files shouldn't be committed
- I would put all the utils from yolov5 in a subfolder of utils, e.g. `utils/yolov5`
- I would split the export.py file in two parts: `utils/yolov5/export.py` with all the functions that have remained untouched from the original yolov5 repo, and `scripts/export_model.py` with the functions that have actually changed to work with SimpleHigherHRNet, e.g. including `main`, `parse_opt`, `run`
- other comments are part of the review
I will create a PR to update your PR in the next few days. Then I'll ask for your review and, if everything looks good, we can merge it into your PR and then yours into the master branch.
Alternatively, please feel free to make changes to your PR directly to address my comments! I can review the changes when they're ready.
SimpleHigherHRNet.py (outdated)

```diff
 @@ -30,7 +106,8 @@ def __init__(self,
      filter_redundant_poses=True,
      max_nof_people=30,
      max_batch_size=32,
-     device=torch.device("cpu")):
+     device=torch.device("cpu"),
+     trt_=False):
```
I would use a more explicit name, e.g. `enable_tensorrt`.
The variable should also be added to the docstring.
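A minimal, dependency-free sketch of what the suggested rename plus docstring entry could look like (the class body is illustrative only, and `device` is a plain string here instead of `torch.device` to keep the example self-contained; this is not the real SimpleHigherHRNet code):

```python
class SimpleHigherHRNetSketch:
    """Sketch of the suggested constructor signature (illustration only).

    Args:
        device (str): 'cpu', or 'cuda:N' to use a specific GPU.
        enable_tensorrt (bool): if True, run inference with a TensorRT
            engine instead of the PyTorch model. Default: False.
    """

    def __init__(self, device="cpu", enable_tensorrt=False):
        # TensorRT inference requires a CUDA device.
        if enable_tensorrt and device == "cpu":
            raise ValueError("TensorRT does not support cpu device.")
        self.device = device
        self.enable_tensorrt = enable_tensorrt
```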
SimpleHigherHRNet.py (outdated)

```python
# import pycuda.driver as cuda
# self.model = TrtModel('pose_higher_hrnet_w32_512.engine')
if device.type == 'cpu':
    device = torch.device('cuda:0')
```
I think this should be:

```diff
-    device = torch.device('cuda:0')
+    raise ValueError('TensorRT does not support cpu device.')
```
scripts/live-demo.py (outdated)

```diff
 @@ -167,6 +173,7 @@ def main(camera_id, filename, hrnet_c, hrnet_j, hrnet_weights, hrnet_joints_set,
          "set to `cuda:IDS` to use one or more specific GPUs "
          "(e.g. `cuda:0` `cuda:1,2`); "
          "set to `cpu` to run on cpu.", type=str, default=None)
+ parser.add_argument("--trt_", action='store_true')
```
Same here: use a more explicit name and add a param description.
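For illustration, the renamed flag might look like the following (the name `--enable_tensorrt` follows the reviewer's earlier suggestion; the help text is an assumption, not taken from the repo):

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical rename of `--trt_`, with a help string as the review requests.
parser.add_argument("--enable_tensorrt", action="store_true",
                    help="run inference with a TensorRT engine "
                         "(requires a CUDA device)")

args = parser.parse_args(["--enable_tensorrt"])
```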
```diff
-    type=str, default="./weights/pose_higher_hrnet_w32_512.pth")
+    type=str, default="./pose_higher_hrnet_w32_512.pth")
```
The default weight path should not change
misc/utils.py (outdated)

```python
class TRTModule_hrnet(torch.nn.Module):
    def __init__(self, engine=None, input_names=None, output_names=None,
                 input_flattener=None, output_flattener=None,
                 path=None, device=None):
```
Missing docstring for the class/init method
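A sketch of what the requested docstring could look like. The argument descriptions here are inferred from the parameter names, not from the implementation, and the `torch.nn.Module` base class and real body are omitted to keep the snippet self-contained:

```python
class TRTModule_hrnet:
    """Runs a serialized TensorRT engine behind a module-like interface.

    NOTE: docstring sketch only; the real class subclasses torch.nn.Module,
    and these descriptions are guesses based on the parameter names.

    Args:
        engine: deserialized TensorRT engine to execute.
        input_names: names of the engine's input bindings.
        output_names: names of the engine's output bindings.
        input_flattener: optional helper that flattens nested inputs.
        output_flattener: optional helper that unflattens the outputs.
        path: path to a serialized engine to load when `engine` is None.
        device: device on which the engine executes.
    """

    def __init__(self, engine=None, input_names=None, output_names=None,
                 input_flattener=None, output_flattener=None,
                 path=None, device=None):
        self.engine = engine
        self.path = path
        self.device = device
```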
export.py (outdated)

```python
parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/pose_higher_hrnet_w32_512.pth', help='model.pt path(s)')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[512, 960], help='image (h, w)')
parser.add_argument('--batch-size', type=int, default=1, help='batch size')
parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--half', action='store_true', help='FP16 half-precision export')
parser.add_argument('--inplace', action='store_true', help='set YOLOv5 Detect() inplace=True')
parser.add_argument('--keras', action='store_true', help='TF: use Keras')
parser.add_argument('--optimize', action='store_true', help='TorchScript: optimize for mobile')
parser.add_argument('--int8', action='store_true', help='CoreML/TF INT8 quantization')
parser.add_argument('--dynamic', action='store_true', help='ONNX/TF/TensorRT: dynamic axes')
parser.add_argument('--simplify', action='store_true', help='ONNX: simplify model')
parser.add_argument('--opset', type=int, default=12, help='ONNX: opset version')
parser.add_argument('--verbose', action='store_true', help='TensorRT: verbose log')
parser.add_argument('--workspace', type=int, default=1, help='TensorRT: workspace size (GB)')
parser.add_argument('--nms', action='store_true', help='TF: add NMS to model')
parser.add_argument('--agnostic-nms', action='store_true', help='TF: add agnostic NMS to model')
parser.add_argument('--topk-per-class', type=int, default=100, help='TF.js NMS: topk per class to keep')
parser.add_argument('--topk-all', type=int, default=100, help='TF.js NMS: topk for all classes to keep')
parser.add_argument('--iou-thres', type=float, default=0.45, help='TF.js NMS: IoU threshold')
parser.add_argument('--conf-thres', type=float, default=0.25, help='TF.js NMS: confidence threshold')
parser.add_argument(
    '--include',
    nargs='+',
    default=['torchscript'],
    help='torchscript, onnx, openvino, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle')
```
Are all these params needed? Some of them seem to be relevant to yolov5 only
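As an illustration, a trimmed `parse_opt` might keep only the options the ONNX-to-TensorRT export path plausibly uses. Which flags are actually needed is an assumption here (the review leaves it open), and the `ROOT /` path prefix is dropped to keep the snippet self-contained:

```python
import argparse

def parse_opt(argv=None):
    # Sketch: yolov5/TF-only options (--nms, --keras, --int8, --iou-thres,
    # --topk-*, ...) dropped; the kept subset is a guess at what the
    # HigherHRNet engine export needs, not a confirmed list.
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str,
                        default='weights/pose_higher_hrnet_w32_512.pth',
                        help='model weights path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int,
                        default=[512, 960], help='image (h, w)')
    parser.add_argument('--batch-size', type=int, default=1, help='batch size')
    parser.add_argument('--device', default='cpu',
                        help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--half', action='store_true',
                        help='FP16 half-precision export')
    parser.add_argument('--dynamic', action='store_true',
                        help='ONNX/TensorRT: dynamic axes')
    parser.add_argument('--simplify', action='store_true',
                        help='ONNX: simplify model')
    parser.add_argument('--opset', type=int, default=12,
                        help='ONNX: opset version')
    parser.add_argument('--verbose', action='store_true',
                        help='TensorRT: verbose log')
    parser.add_argument('--workspace', type=int, default=1,
                        help='TensorRT: workspace size (GB)')
    parser.add_argument('--include', nargs='+', default=['engine'],
                        help='onnx, engine')
    return parser.parse_args(argv)
```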
Hi, thanks for your time. I think most of your comments have been addressed.
TensorRT 7 produces better results for me, but maybe this has to do with different cuDNN versions between torch and TRT. This should produce the fp16 engine. And for testing.
Hi @gpastal24, I tested your code on Google Colab with CUDA 11/TensorRT 8.5 and it worked smoothly! Thanks a lot for the great contribution! ❤️ The code still has some minor style issues, so I'm going to merge it into a new branch 'tensorrt', make minor changes to integrate/align it with the rest of the repo, and then port the commits into the master branch.
* [Add]: TensorRT Support (#14)
  * [Add]: TensorRT Support
  * [Fix]: Script improvement
  * [Fix]: Script improvement
  * [Fix]: Address the requested changes
  Co-authored-by: Gpastal24 <[email protected]>
* Remove pycache folders
* Add .gitignore
* Clean up code, Update docstrings and comments
* Rename tensorrt script
* Clean up export script, Add HigherHRNet params
* Move TensorRT utils in separate file, Minor clean up
* Rename tensorrt script
* Add Google Colab notebook
* Update README.md
* Update notebook

Co-authored-by: gpastal24 <[email protected]>
Co-authored-by: Gpastal24 <[email protected]>
Adding TensorRT support for HigherHRNet.