
ModelRunnerCpp does not transfer SamplingConfig Tensor fields correctly #1183

Closed
Marks101 opened this issue Feb 28, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@Marks101
Contributor

Hello team,
the tensorrt_llm.runtime.SamplingConfig defines multiple fields that can be either a scalar or a torch.Tensor, for example random_seed, top_k, or top_p. Inside ModelRunnerCpp, these fields are all assumed to be scalar and wrapped into single-item lists, see here. Thus, when using ModelRunnerCpp, it is not possible to set independent values for each batch entry. If you try to do so, it fails with the following error:

E           TypeError: (): incompatible function arguments. The following argument types are supported:
E               1. (self: tensorrt_llm.bindings.SamplingConfig, arg0: Optional[List[int]]) -> None
E           
E           Invoked with: <tensorrt_llm.bindings.SamplingConfig object at 0x7f5699304eb0>, [tensor([7692698082559361259,...

Looking at the SamplingConfig C++ code, providing a list with batch-size entries should be supported. Accordingly, it would be necessary to cast the torch.Tensors to lists.

Additionally, the parameters top_p_decay, top_p_min and top_p_reset_ids are scalar in the tensorrt_llm.runtime.SamplingConfig, but are supposed to be a vector of length batch size in the C++ implementation of the config.

It would be great if you could have a look at this and possibly fix this. Thank you!

@MartinMarciniszyn
Collaborator

@Funatiq, could you please take a look at this? I believe that everything should be supported for this on the C++ side. Only ModelRunnerCpp needs to be revised.

@BaiMoHan

same issue

@BaiMoHan

The random_seed should be allowed to be set to values such as [1, 2, 3, 4] when batch_size is greater than 1.

@BaiMoHan

same issue 

I need to fall back to a Python session to make this work, but I would prefer to use a C++ session.

@Funatiq
Collaborator

Funatiq commented Mar 14, 2024

The fix will be included in the next push to main (ETA Mar 19).

@Funatiq Funatiq closed this as completed Mar 14, 2024
@Funatiq Funatiq added the bug Something isn't working label Mar 14, 2024
@Marks101
Contributor Author

@Funatiq @kaiyux Thank you for processing this so quickly. I took a look at it today, but I still cannot pass a torch.Tensor as SamplingConfig.random_seed. The issue seems to be that

gpt_sampling_config.random_seed = sampling_config.random_seed

lacks a tolist(). For the other fields of SamplingConfig (temperature, frequency_penalty, ...), the tolist() is there. Could you please take a look? Thank you!

@Funatiq
Collaborator

Funatiq commented Mar 21, 2024

You're right, we somehow missed that.
