
Question about past_key_value modification #82

Open
baihuajun24 opened this issue Jun 21, 2024 · 2 comments

Comments

@baihuajun24

Hello EAGLE Team!

I noticed that you modified `past_key_value` in the `forward` function by setting it to `None`:

```
past_key_value = None
```

This differs from the upstream source at
https://github.com/huggingface/transformers/blob/e51d7ac70ab8f3e69d3659226aa838308a668238/src/transformers/models/llama/modeling_llama.py#L324

Could you share some insight into why you made this change? I am trying to generate responses with code-llama-7b using EAGLE's `KVLlamaForCausalLM` class, but the results are of much lower quality than those I get with the default `AutoModelForCausalLM` class. I suspect the KV cache is affecting generation.

@Liyuhui-12
Collaborator

Liyuhui-12 commented Jun 28, 2024

This modification is due to the use of a pre-allocated KV cache to optimize the efficiency of the base model (this part of the code follows Medusa). In the `cat` operation at

```
key_states = past_key_value[0].cat(key_states, dim=2)
value_states = past_key_value[1].cat(value_states, dim=2)
```

the key and value states of the current token have already been written into `past_key_value`, so there is no need to return them for operations outside the model. The modification itself does not affect model quality. However, if you do not reset the length attribute of the KV cache after a generation, the stale cache will result in abnormal generation.
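To illustrate the idea, here is a minimal, hypothetical sketch of such a pre-allocated cache. It is not EAGLE's actual `KVCache` class; the names `PreallocatedKVCache`, `cat`, `current_length`, and `reset` are assumptions chosen to mirror the pattern described above (Medusa-style): storage is allocated once, `cat` copies new states into the buffer in place, and a length attribute tracks how much is filled and must be reset between generations.

```python
import torch

class PreallocatedKVCache:
    """Hypothetical sketch of a pre-allocated KV cache.

    The buffer is allocated once at maximum length; `cat` copies new
    key/value states into it in place rather than concatenating tensors
    on every step, so nothing needs to be returned to the caller.
    """

    def __init__(self, batch, num_heads, max_len, head_dim, dtype=torch.float32):
        # Allocate the full-size buffer up front.
        self.data = torch.zeros(batch, num_heads, max_len, head_dim, dtype=dtype)
        # The "length attribute": how many positions are currently filled.
        self.current_length = 0

    def cat(self, states, dim=2):
        # Copy the new states into the pre-allocated buffer in place.
        new_len = states.shape[dim]
        end = self.current_length + new_len
        self.data[:, :, self.current_length:end] = states
        self.current_length = end
        # Return a view over everything cached so far.
        return self.data[:, :, :self.current_length]

    def reset(self):
        # Forgetting to call this between generations leaves the stale
        # length (and old entries) in place, causing abnormal output.
        self.current_length = 0
```

Under this sketch, resetting the cache between prompts amounts to calling `reset()` on each layer's key and value caches before starting a new generation.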

@jzzzf

jzzzf commented Jul 3, 2024

@Liyuhui-12 Could you explain how we pass the value of `past_key_value` if we set it to `None`? I am confused. Thank you for your help.
