Control frequency - completion #277

Stealthwriter · 2023-09-14T11:47:55Z

Hi,

Is there a way to change the frequency_penalty or logit bias when sending a completion request?

yixu34 · 2023-09-18T18:30:11Z

Hi @Stealthwriter, thanks for reaching out. Yes, you can route additional request fields through to the underlying inference framework(s) we use, as long as the framework supports the fields you need. We currently support https://github.com/scaleapi/open-tgi (forked from text-generation-inference v0.9.4) and vLLM. Would you like to try making the change yourself?
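As a rough illustration of what "routing fields through" could look like, here is a minimal Python sketch. It is not LLM Engine's actual code: `build_backend_params` and `EXAMPLE_BACKEND_FIELDS` are hypothetical names, and the field set is an assumption for illustration only, so check each backend's documentation for the fields it really accepts.

```python
from typing import Any, Dict, Set


def build_backend_params(
    request: Dict[str, Any],
    supported_fields: Set[str],
) -> Dict[str, Any]:
    """Forward the fields the caller set, rejecting any the backend lacks.

    Hypothetical helper: fields left as None are treated as "not set".
    """
    unsupported = {
        name
        for name, value in request.items()
        if value is not None and name not in supported_fields
    }
    if unsupported:
        raise ValueError(f"Backend does not support: {sorted(unsupported)}")
    # Drop unset fields so the backend falls back to its own defaults.
    return {name: value for name, value in request.items() if value is not None}


# Illustrative field set only (an assumption, not a real backend's list).
EXAMPLE_BACKEND_FIELDS = {"temperature", "max_new_tokens", "frequency_penalty"}

params = build_backend_params(
    {
        "temperature": 0.7,
        "max_new_tokens": 64,
        "frequency_penalty": 0.5,
        "logit_bias": None,  # unset, so it is not forwarded
    },
    EXAMPLE_BACKEND_FIELDS,
)
print(params)
```

Failing loudly on an unsupported field, rather than silently dropping it, is one possible design choice here: it keeps a request from appearing to honor a setting that the backend never received.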

Stealthwriter · 2023-09-18T18:48:44Z

How does LLM Engine differ from TGI and vLLM?

yixu34 · 2023-09-18T21:17:56Z

You can think of LLM Engine as adding 1) a set of higher-level abstractions (e.g. APIs are expressed in terms of Completions and Fine-tunes) and 2) autoscaling via k8s. TGI and vLLM are great but you have to bring your own scaling.
