root@ip-10-40-6-105:/app# python TensorRT-LLM/examples/quantization/quantize.py \
    --model_dir model/ \
    --output_dir tllm_checkpoint_1gpu_awq \
    --dtype float16 \
    --qformat int4_awq \
    --awq_block_size 128
Traceback (most recent call last):
  File "/app/TensorRT-LLM/examples/quantization/quantize.py", line 5, in <module>
    from tensorrt_llm.quantization import (quantize_and_export,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/__init__.py", line 32, in <module>
    import tensorrt_llm.functional as functional
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/functional.py", line 28, in <module>
    from . import graph_rewriting as gw
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/graph_rewriting.py", line 12, in <module>
    from .network import Network
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/network.py", line 26, in <module>
    from tensorrt_llm.module import Module
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/module.py", line 17, in <module>
    from ._common import default_net
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_common.py", line 31, in <module>
    from ._utils import str_dtype_to_trt
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_utils.py", line 29, in <module>
    from tensorrt_llm.bindings.BuildInfo import ENABLE_MULTI_DEVICE
ImportError: /usr/local/lib/python3.10/dist-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
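An undefined c10::* symbol like this usually points to the tensorrt_llm C++ bindings having been compiled against a different PyTorch build than the torch wheel that pip installed. As a first check (a minimal sketch, not specific to this container; it only reads pip metadata), the resolved versions can be printed for comparison:

```python
# Diagnostic sketch: print the versions pip actually resolved, so they can
# be compared against the versions the TensorRT-LLM wheel was built for.
# An undefined c10::* symbol typically indicates a torch ABI/version mismatch.
import importlib.metadata


def installed_version(pkg: str):
    """Return the installed version string for pkg, or None if not installed."""
    try:
        return importlib.metadata.version(pkg)
    except importlib.metadata.PackageNotFoundError:
        return None


for pkg in ("torch", "tensorrt_llm", "tensorrt"):
    print(f"{pkg}: {installed_version(pkg)}")
```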
Dockerfile
ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver
ARG BASE_TAG=24.07-trtllm-python-py3
FROM ${BASE_IMAGE}:${BASE_TAG}
RUN git clone --recurse-submodules https://github.com/NVIDIA/TensorRT-LLM.git && cd TensorRT-LLM && git checkout tags/v0.11.0
RUN pip install torch==2.3.1 pydantic==1.10.11
RUN pip install "datasets>=2.14.4" mpmath==1.3.0 "rouge_score~=0.1.2" transformers_stream_generator==0.0.4 tiktoken
# RUN pip install -r /app/TensorRT-LLM/examples/quantization/requirements.txt
WORKDIR TensorRT-LLM/examples/llama/
COPY requirements.txt requirements-local.txt
RUN pip install -r requirements-local.txt
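One thing worth verifying after the later `pip install` layers is that the earlier pins survived, since pip can silently upgrade torch while resolving other requirements. A small sanity-check sketch (the pin values are taken from the Dockerfile above; nothing else is assumed about the environment):

```python
# Sketch: verify that later `pip install` layers did not replace the pinned
# packages (pip may upgrade torch==2.3.1 while resolving other requirements,
# which would break the ABI the tensorrt_llm bindings were built against).
import importlib.metadata

PINS = {"torch": "2.3.1", "pydantic": "1.10.11"}  # pins from the Dockerfile above


def check_pins(pins):
    """Return {package: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for pkg, expected in pins.items():
        try:
            found = importlib.metadata.version(pkg)
        except importlib.metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[pkg] = (expected, found)
    return mismatches


print(check_pins(PINS) or "all pins intact")
```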
Command run inside the Docker image to quantize the Llama model (shown above).
System Info
CPU architecture : x86_64
CPU/Host memory size : 187 GB
GPU properties
GPU name : A10
GPU memory size : 24 GB
Clock frequencies used (if applicable)
Libraries
TensorRT-LLM tag : v0.10.0 and v0.11.0
Versions of TensorRT, AMMO, CUDA, cuBLAS, etc. used
Model : Llama 3 8B
Container used :
for v0.10.0
nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3
for v0.11.0
nvcr.io/nvidia/tritonserver:24.07-trtllm-python-py3
NVIDIA driver version
Running in docker
Who can help?
@Tracin @byshiue
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Expected behavior
The quantization script should quantize the model successfully.
actual behavior
Fails with the following error:
from tensorrt_llm.bindings.BuildInfo import ENABLE_MULTI_DEVICE
ImportError: /usr/local/lib/python3.10/dist-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv
additional notes
I was able to run this successfully with my local setup; the failure occurs only inside the Docker image. It looks like a library mismatch issue, possibly the tensorrt_llm bindings having been built against a different PyTorch build than the pip-installed torch==2.3.1.
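As a further check (a sketch only; the .so path is copied from the traceback above), the bindings shared object can be loaded directly with ctypes after importing torch, which surfaces the unresolved symbol without importing the whole package:

```python
# Sketch: load the bindings shared object directly with ctypes. Importing
# torch first makes libtorch's symbols available; if the bindings were built
# against an incompatible torch build, CDLL fails with an OSError naming the
# undefined symbol.
import ctypes

try:
    import torch  # noqa: F401  (loads libtorch so its symbols can resolve)
except ImportError:
    pass  # a missing torch would itself explain the import failure


def try_load(so_path: str):
    """Return None if the library loads, else the loader's error message."""
    try:
        ctypes.CDLL(so_path)
        return None
    except OSError as exc:
        return str(exc)


err = try_load(
    "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/"
    "bindings.cpython-310-x86_64-linux-gnu.so"
)
print("load error:" if err else "loaded OK", err or "")
```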