
Error while running any tensorrt-llm script, issue from cython bindings #2062

Closed
2 of 4 tasks
manickavela29 opened this issue Jul 31, 2024 · 2 comments
Labels
bug Something isn't working

Comments

manickavela29 commented Jul 31, 2024

System Info

  • CPU architecture: x86_64
  • CPU/Host memory size: 187 GB
  • GPU properties
      • GPU name: A10
      • GPU memory size: 24 GB
  • Libraries
      • TensorRT-LLM tag: v0.10.0 and v0.11.0

Model: Llama 3 8B

Containers used:

  • for v0.10.0: nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3
  • for v0.11.0: nvcr.io/nvidia/tritonserver:25.06-trtllm-python-py3

  • NVIDIA driver version: (running in Docker)

Error:

```shell
root@ip-10-40-6-105:/app# python TensorRT-LLM/examples/quantization/quantize.py --model_dir model/ \
    --output_dir tllm_checkpoint_1gpu_awq \
    --dtype float16 \
    --qformat int4_awq \
    --awq_block_size 128
```
```
Traceback (most recent call last):
  File "/app/TensorRT-LLM/examples/quantization/quantize.py", line 5, in <module>
    from tensorrt_llm.quantization import (quantize_and_export,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/__init__.py", line 32, in <module>
    import tensorrt_llm.functional as functional
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/functional.py", line 28, in <module>
    from . import graph_rewriting as gw
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/graph_rewriting.py", line 12, in <module>
    from .network import Network
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/network.py", line 26, in <module>
    from tensorrt_llm.module import Module
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/module.py", line 17, in <module>
    from ._common import default_net
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_common.py", line 31, in <module>
    from ._utils import str_dtype_to_trt
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_utils.py", line 29, in <module>
    from tensorrt_llm.bindings.BuildInfo import ENABLE_MULTI_DEVICE
ImportError: /usr/local/lib/python3.10/dist-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
```
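For context: the undefined symbol demangles to `c10::detail::torchCheckFail(...)`, a symbol exported by PyTorch's libc10, which suggests the prebuilt `tensorrt_llm` bindings were compiled against a different torch build than the one installed in the container. A minimal diagnostic sketch (an assumption-level check, not part of the original report):

```python
import importlib

def check_torch_abi():
    """Report the installed torch version and whether the trtllm bindings load.

    Importing torch first loads libc10/libtorch into the process, so a
    remaining ImportError points at a real ABI mismatch rather than a
    missing shared library.
    """
    report = {}
    try:
        torch = importlib.import_module("torch")
        report["torch_version"] = torch.__version__
    except ImportError:
        report["torch_version"] = None
    try:
        importlib.import_module("tensorrt_llm.bindings")
        report["bindings_ok"] = True
    except Exception as exc:  # the ImportError carries the undefined-symbol text
        report["bindings_ok"] = False
        report["error"] = str(exc)
    return report

print(check_torch_abi())
```

If `bindings_ok` is False while torch imports cleanly, the wheel and the installed torch disagree on the C++ ABI.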

Dockerfile:

```dockerfile
ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver
ARG BASE_TAG=24.07-trtllm-python-py3
FROM ${BASE_IMAGE}:${BASE_TAG}

RUN git clone --recurse-submodules https://github.com/NVIDIA/TensorRT-LLM.git && \
    cd TensorRT-LLM && git checkout tags/v0.11.0
RUN pip install torch==2.3.1 pydantic==1.10.11
RUN pip install "datasets>=2.14.4" mpmath==1.3.0 rouge_score~=0.1.2 transformers_stream_generator==0.0.4 tiktoken

# RUN pip install -r /app/TensorRT-LLM/examples/quantization/requirements.txt
WORKDIR TensorRT-LLM/examples/llama/

COPY requirements.txt requirements-local.txt

RUN pip install -r requirements-local.txt
```

Command inside the docker image for quantizing the Llama model:

```shell
python TensorRT-LLM/examples/quantization/quantize.py --model_dir model/ \
    --output_dir tllm_checkpoint_1gpu_awq \
    --dtype float16 \
    --qformat int4_awq \
    --awq_block_size 128
```

Who can help?

@Tracin @byshiue

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Build the Dockerfile from the given NGC containers
  2. Mount the Llama 3 8B model
  3. Run the quantization script after cloning TensorRT-LLM

Expected behavior

The quantization script should quantize the model.

actual behavior

Failing with the error:

```
    from tensorrt_llm.bindings.BuildInfo import ENABLE_MULTI_DEVICE
ImportError: /usr/local/lib/python3.10/dist-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv
```

additional notes

I was able to run it successfully in my local setup; the failure occurs only inside the docker image. It looks like a library version mismatch.
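One way to check for such a mismatch is to compare the torch requirement the installed wheel declares against what pip actually installed. A stdlib-only sketch; the distribution name `tensorrt-llm` is an assumption about how the wheel registers itself with pip:

```python
import re
from importlib import metadata

def version_mismatch(dist="tensorrt-llm", dep="torch"):
    """Return (declared requirement, installed version) for `dep` as declared
    by the `dist` wheel; None stands in for anything missing."""
    dep_re = re.compile(rf"^{re.escape(dep)}\b")  # matches torch==..., not torchvision
    try:
        declared = next(
            (r for r in (metadata.requires(dist) or [])
             if dep_re.match(r.split(";")[0].strip())),
            None,
        )
    except metadata.PackageNotFoundError:
        declared = None
    try:
        installed = metadata.version(dep)
    except metadata.PackageNotFoundError:
        installed = None
    return declared, installed

print(version_mismatch())
```

If the declared requirement and the installed version disagree (here, the Dockerfile pins torch==2.3.1 after installing the wheel), the prebuilt bindings can end up linked against a libtorch that is no longer present.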

@manickavela29 manickavela29 added the bug Something isn't working label Jul 31, 2024
@Kefeng-Duan
Collaborator

Hi, @manickavela29 could you rebuild and reinstall trtllm?
https://nvidia.github.io/TensorRT-LLM/installation/build-from-source-linux.html#build-tensorrt-llm
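After rebuilding, a quick post-install sanity check can confirm the bindings load before running any of the example scripts. A sketch; `tensorrt_llm.__version__` is assumed to be exposed by the wheel:

```python
def sanity_check():
    """Import tensorrt_llm and report its version; any ABI problem in the
    compiled bindings surfaces here as an ImportError instead of deep
    inside quantize.py."""
    try:
        import tensorrt_llm
        return getattr(tensorrt_llm, "__version__", "unknown")
    except ImportError as exc:
        return f"import failed: {exc}"

print(sanity_check())
```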

@manickavela29
Author

I have moved on to completely different tasks and am not sure I can pull this up again to verify, so I am closing this.
Thank you
