
fix return_token_log_probs on vLLM > 0.3.3 endpoints #498

Merged
saiatmakuri merged 3 commits into main from saiatmakuri/fix-return_token_log_probs on Apr 23, 2024

Conversation

saiatmakuri (Contributor)

Pull Request Summary

Since vLLM > 0.3.3, logprobs are returned as type Logprob instead of as plain floats. This is an API-breaking change (see vllm-project/vllm#3065 (comment)).

This change updates vllm_server to return output in the same format as before.
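
For context, a minimal sketch of the kind of normalization this implies for vllm_server (the helper name is illustrative, not the exact diff); it assumes the new Logprob objects expose a .logprob attribute, as described in vllm-project/vllm#3065:

from typing import Any, Dict, List

def unwrap_logprobs(logprobs: List[Dict[int, Any]]) -> List[Dict[int, float]]:
    # vLLM > 0.3.3 returns Logprob objects carrying a .logprob attribute;
    # older versions return plain floats. Normalize both to the old format.
    return [
        {
            token_id: value.logprob if hasattr(value, "logprob") else value
            for token_id, value in token_logprobs.items()
        }
        for token_logprobs in logprobs
    ]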

Test Plan and Usage Guide

Create a new llama-2-7b endpoint with vLLM==0.4.0.post1 and run:

from llmengine import Completion

response = Completion.create(
    model="llama-2-7b",
    prompt="Hello ",
    max_new_tokens=10,
    temperature=0.0,
    return_token_log_probs=True,
)
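
To confirm the old response shape, check that the per-token log probs come back as plain floats; this sketch assumes the client exposes them as response.output.tokens with token and log_prob fields (an assumption about the client schema, not taken from this PR):

# Assumption: token-level log probs are surfaced as `response.output.tokens`,
# each entry carrying `token` and `log_prob`; exact field names may differ.
for tok in response.output.tokens:
    assert isinstance(tok.log_prob, float)  # plain floats, as before the vLLM upgrade
    print(tok.token, tok.log_prob)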

@yunfeng-scale (Collaborator) left a comment

Thanks for the fix! I think we need to do some backfills also?

@saiatmakuri (Contributor, Author)

> Thanks for the fix! I think we need to do some backfills also?

Will update endpoints to the latest image post-merge.

saiatmakuri merged commit 10d84ca into main on Apr 23, 2024.
4 of 5 checks passed.
saiatmakuri deleted the saiatmakuri/fix-return_token_log_probs branch on April 23, 2024 at 05:05.