
fix return_token_log_probs on vLLM > 0.3.3 endpoints #498

Merged
saiatmakuri merged 3 commits into main from saiatmakuri/fix-return_token_log_probs on Apr 23, 2024

Conversation

saiatmakuri (Contributor)

Pull Request Summary

Since vLLM > 0.3.3, logprobs are returned as type Logprob instead of as plain floats. This is an API-breaking change (see vllm-project/vllm#3065 (comment)).

This change updates vllm_server to return output in the same format as before.
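
For context, a minimal sketch of the kind of normalization this implies for vllm_server (the helper name is illustrative, not the exact diff); it assumes the new Logprob objects expose a .logprob attribute, as described in vllm-project/vllm#3065:

from typing import Any, Dict, List

def unwrap_logprobs(logprobs: List[Dict[int, Any]]) -> List[Dict[int, float]]:
    # vLLM > 0.3.3 returns Logprob objects carrying a .logprob attribute;
    # older versions return plain floats. Normalize both to the old format.
    return [
        {
            token_id: value.logprob if hasattr(value, "logprob") else value
            for token_id, value in token_logprobs.items()
        }
        for token_logprobs in logprobs
    ]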

Test Plan and Usage Guide

Create a new llama-2-7b endpoint with vLLM==0.4.0.post1 and run:

from llmengine import Completion

response = Completion.create(
    model="llama-2-7b",
    prompt="Hello ",
    max_new_tokens=10,
    temperature=0.0,
    return_token_log_probs=True,
)
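
To confirm the old response shape, check that the per-token log probs come back as plain floats; this sketch assumes the client exposes them as response.output.tokens with token and log_prob fields (an assumption about the client schema, not taken from this PR):

# Assumption: token-level log probs are surfaced as `response.output.tokens`,
# each entry carrying `token` and `log_prob`; exact field names may differ.
for tok in response.output.tokens:
    assert isinstance(tok.log_prob, float)  # plain floats, as before the vLLM upgrade
    print(tok.token, tok.log_prob)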

@yunfeng-scale (Collaborator) left a comment

Thanks for the fix! I think we need to do some backfills also?

@saiatmakuri (Contributor, Author)

> Thanks for the fix! I think we need to do some backfills also?

Will update endpoints to the latest image post-merge.

saiatmakuri merged commit 10d84ca into main on Apr 23, 2024.
4 of 5 checks passed.
saiatmakuri deleted the saiatmakuri/fix-return_token_log_probs branch on April 23, 2024 at 05:05.