
[Error] Static dimension mismatch while setting input shape while running Llama 3 8B #2071

Closed
2 of 4 tasks
manickavela29 opened this issue Jul 31, 2024 · 2 comments
Labels
bug Something isn't working


@manickavela29

System Info

CPU architecture: x86_64
CPU/Host memory size: 187 GB
GPU properties
GPU name : A10
GPU memory size : 24 GB
Clock frequencies used (if applicable)
Libraries
TensorRT-LLM tag : v0.10.0

Model: Llama 3 8B

Container used:

nvcr.io/nvidia/tritonserver:24.07-trtllm-python-py3

The engine is built with the commands below:


python ../quantization/quantize.py --model_dir model/ \
                                   --output_dir tllm_checkpoint_1gpu_awq \
                                   --dtype float16 \
                                   --qformat int4_awq \
                                   --awq_block_size 128 \
                                   --tp_size 1 \
                                   --pp_size 1

trtllm-build --checkpoint_dir tllm_checkpoint_1gpu_awq \
            --output_dir trt_engines/awq/1-gpu_1/  \
            --gemm_plugin float16
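
The build step writes a config file into the engine output directory, which records the shape limits baked into the engine. A small sketch like the following can surface those build-time limits for comparison against the runtime inputs (the `config.json` file name and its keys are assumed from typical TensorRT-LLM engine layouts, so treat them as assumptions):

```python
import json
import os

def load_build_config(engine_dir):
    """Read the engine's build-time config (assumed to live at
    engine_dir/config.json) and return it as a dict, so limits such as
    max_input_len or max_batch_size can be inspected."""
    with open(os.path.join(engine_dir, "config.json")) as f:
        return json.load(f)
```

For example, `load_build_config("trt_engines/awq/1-gpu_1/")` would show whether the engine was built with the shape limits the runtime is assuming.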

Who can help?

@byshiue @kaiyux

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Docker image :

ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver
ARG BASE_TAG=24.07-trtllm-python-py3
FROM ${BASE_IMAGE}:${BASE_TAG}

llama 3 8b and quantization scripts from examples of tensorrt_llm

Expected behavior

Model inference runs successfully.

Actual behavior

Failing with the following error:

[07/31/2024-13:01:36] [TRT] [E] 3: [executionContext.cpp::setInputShape::2278] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::setInputShape::2278, condition: engineDims.d[i] == dims.d[i] Static dimension mismatch while setting input shape.)
(the line above repeats 9 times)
[07/31/2024-13:01:36] [TRT] [E] 3: [executionContext.cpp::resolveSlots::2991] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::resolveSlots::2991, condition: allInputDimensionsSpecified(routine) )
(the line above repeats 2 times)
2024-07-31 13:01:36,784 ERROR: Error generating text

output_gen_ids = self.decoder.decode(input_ids, input_lengths, sampling_config = self.sampling_config)

File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 774, in wrapper
ret = func(self, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 2944, in decode
return self.decode_regular(
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 2569, in decode_regular
should_stop, next_step_tensors, tasks, context_lengths, host_context_lengths, attention_mask, logits, encoder_input_lengths = self.handle_per_step(
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 2244, in handle_per_step
raise RuntimeError(f"Executing TRT engine failed step={step}!")
RuntimeError: Executing TRT engine failed step=0!
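
For reference, the check that fires in the log (`engineDims.d[i] == dims.d[i]`) means any dimension the engine compiled as static must be matched exactly at runtime; only dimensions built as dynamic (conventionally -1) may vary. A standalone illustration of that rule (not TensorRT code, just a sketch of the condition):

```python
def check_input_shape(engine_dims, runtime_dims):
    """Mimic TensorRT's setInputShape parameter check: a static engine
    dimension (>= 0) must equal the runtime dimension exactly; a value
    of -1 marks the dimension as dynamic and accepts any runtime size."""
    for i, (e, r) in enumerate(zip(engine_dims, runtime_dims)):
        if e >= 0 and e != r:
            raise ValueError(
                f"Static dimension mismatch at d[{i}]: engine={e}, runtime={r}")
    return True
```

So an engine built with a fixed sequence or batch dimension rejects any input of a different size, which matches the symptom above: the runtime input shape disagrees with a dimension frozen at build time.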

Additional notes

I experimented with these build options, but they made no difference:

--max_input_len=4096
--max_batch_size=1
--max_num_tokens=4096
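
For context, those options are flags to trtllm-build and would be combined with the original build command, roughly like this (a sketch of the rebuild attempt, reusing the checkpoint and output directories from above, not a verified fix):

```shell
# Rebuild the engine with explicit shape limits added to the original flags.
trtllm-build --checkpoint_dir tllm_checkpoint_1gpu_awq \
             --output_dir trt_engines/awq/1-gpu_1/ \
             --gemm_plugin float16 \
             --max_input_len 4096 \
             --max_batch_size 1 \
             --max_num_tokens 4096
```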

@manickavela29 manickavela29 added the bug Something isn't working label Jul 31, 2024
@Kefeng-Duan
Collaborator

Hi @manickavela29,
I'm not sure what your run script is — could you provide it?

Also, could you update to the latest version and first try examples/run.py or examples/summarize.py?

@manickavela29
Author

I just used the trtllm-build engine command and tried running inference with the samples and examples provided in the TensorRT-LLM repo.

I've since moved on to entirely different tasks and am not sure I can pull this up again to verify, so I'm closing this.
Thank you.
