failed to use "stop_words_list" for tensorrt-llm==0.9.0 #1642

Open
AGI-player opened this issue May 21, 2024 · 6 comments
Labels
need more info stale triaged Issue has been triaged by maintainers

AGI-player commented May 21, 2024

I use GenerationExecutorWorker for a web service and pass stop_words_list = [["hello, yes"]] by modifying the as_inference_request function in executor.py as follows:
[image]
[image]
The ir parameter is as follows:
[image]
Then it failed:
[image]
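
For readers following along, here is a minimal sketch of how a nested list such as [["hello, yes"]] is commonly packed before it reaches the runtime: one comma-separated string per request, tokenized into a row of concatenated token ids plus a row of cumulative end offsets. This layout is an assumption based on my understanding of TensorRT-LLM's word-list convention, not code taken from the screenshots above. `ToyTokenizer` and `to_word_list` are hypothetical names for illustration only, and the sketch returns nested lists rather than a tensor to stay dependency-free.

```python
import csv


class ToyTokenizer:
    """Hypothetical stand-in tokenizer: one id per whitespace-separated word."""

    def __init__(self):
        self.vocab = {}

    def encode(self, text):
        # Assign ids 1, 2, 3, ... in order of first appearance.
        return [self.vocab.setdefault(w, len(self.vocab) + 1) for w in text.split()]


def to_word_list(word_dict, tokenizer):
    """Pack stop words into a [batch][2][max_len] nested list.

    Row 0: token ids of all stop words, concatenated (padded with 0).
    Row 1: cumulative end offset of each stop word (padded with -1).
    Assumed layout; check it against the TensorRT-LLM version in use.
    """
    rows = []
    for item in word_dict:
        # Each item is a one-element list holding a comma-separated string.
        words = next(csv.reader(item))
        ids, ends = [], []
        for word in words:
            word_ids = tokenizer.encode(word)
            if not word_ids:
                continue  # skip entries that tokenize to nothing
            ids.extend(word_ids)
            ends.append(len(ids))
        rows.append((ids, ends))

    pad_to = max(1, max(len(ids) for ids, _ in rows))
    return [
        [ids + [0] * (pad_to - len(ids)), ends + [-1] * (pad_to - len(ends))]
        for ids, ends in rows
    ]


packed = to_word_list([["hello world, yes"]], ToyTokenizer())
print(packed)
```

With the toy tokenizer, "hello world" and "yes" become three ids in row 0, and row 1 records where each stop word ends. If the value handed to the request is still the raw nested list of strings rather than token ids in a layout like this, that mismatch would be one thing worth checking.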

byshiue (Collaborator) commented May 24, 2024

Please follow the issue template and share the full end-to-end reproduction steps. Thank you for your cooperation.

byshiue added the "triaged" and "need more info" labels on May 24, 2024
byshiue self-assigned this on May 24, 2024

AGI-player (Author) commented May 24, 2024

> Please follow the issue template and share the full end-to-end reproduction steps. Thank you for your cooperation.

The TRT engine was built with:
trtllm-build --gemm_plugin float16 --max_batch_size=128 --max_input_len=8192 --max_output_len=0 --gpt_attention_plugin float16 --paged_kv_cache enable --remove_input_padding enable --context_fmha enable --max_num_tokens 104448

Then I used TensorRT-LLM/examples/apps/fastapi_server.py for the web service as follows:
python3 -m apps.fastapi_server path/to/engine/ path/to/tokenizer --port 8001

To set the parameters (temperature and stop words list), I modified the TensorRT-LLM/examples/apps/fastapi_server.py file:
[image]

and the tensorrt_llm/executor.py file:
[image]

If I don't pass stop_words_list, it works well:
[image]
[image]

It fails when I pass the stop words list:
[image]

Superjomn (Contributor) commented

The stop_words_list is not well supported in 0.9.0. You could try the latest main branch: we have refactored the GenerationExecutor, and stop_words are supported there.

AGI-player (Author) commented

> The stop_words_list is not well supported in 0.9.0. You could try the latest main branch: we have refactored the GenerationExecutor, and stop_words are supported there.

I updated to version 0.11.0.dev2024052100, but it still doesn't work.

fan-niu commented Jun 3, 2024

@Superjomn @byshiue I have the same question. Can you give an example of successfully using stop_words or stop_words_list? I am currently using a service started by tensorrtllm_backend at commit 75b0964; the corresponding tensorrtllm version is f430a4b. Thank you.
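
Until a working server-side example is available, one possible stopgap is to trim the generated text on the client. The sketch below is plain string post-processing, not a use of the stop_words or stop_words_list feature itself, and the function name `truncate_at_stop_words` is hypothetical. It does not stop generation early, so the model still produces (and you still pay for) the tokens after the stop word.

```python
def truncate_at_stop_words(text, stop_words):
    """Return `text` cut just before the earliest occurrence of any stop word."""
    cut = len(text)
    for word in stop_words:
        if not word:
            continue  # an empty stop word would match at index 0
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


generated = "The answer is 42. hello there, yes"
print(repr(truncate_at_stop_words(generated, ["hello", "yes"])))
```

Matching here is on raw substrings, so a stop word like "yes" would also cut inside "yesterday"; a tokenizer-aware or word-boundary check may be preferable depending on the use case.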

github-actions bot commented Jul 3, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.
