
Why does EAGLE remove the input_layernorm of llama? #76

Open
bisunny opened this issue May 26, 2024 · 2 comments

Comments

bisunny commented May 26, 2024

[image: screenshot attached to the question]

Liyuhui-12 (Collaborator) commented

The base model has a layer normalization (layernorm) layer before the LM head. Since the feature sequence has already been normalized, we do not use layer normalization.
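To make the design concrete, here is a minimal sketch of the idea in PyTorch. This is an illustration, not EAGLE's actual implementation: the attention and MLP sublayers are stubbed as `Linear` layers, and the `index` gating is an assumed way to express "skip the input norm on the layer that consumes the base model's features".

```python
import torch
import torch.nn as nn


class RMSNorm(nn.Module):
    """RMS layer normalization as used in llama-style models."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        variance = x.pow(2).mean(-1, keepdim=True)
        return self.weight * x * torch.rsqrt(variance + self.eps)


class DecoderLayer(nn.Module):
    """Sketch of a decoder layer whose input norm can be skipped.

    The base model applies a final RMSNorm before its LM head, so the
    feature sequence handed to the draft layer is already normalized;
    applying input_layernorm again would be redundant.
    """

    def __init__(self, dim: int, index: int):
        super().__init__()
        self.index = index
        # Placeholder sublayers; a real layer would use self-attention and an MLP.
        self.self_attn = nn.Linear(dim, dim)
        self.mlp = nn.Linear(dim, dim)
        self.input_layernorm = RMSNorm(dim)
        self.post_attention_layernorm = RMSNorm(dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        residual = hidden_states
        # Skip input_layernorm on the first layer: its input is the base
        # model's feature sequence, which has already been normalized.
        if self.index != 0:
            hidden_states = self.input_layernorm(hidden_states)
        hidden_states = residual + self.self_attn(hidden_states)
        residual = hidden_states
        hidden_states = self.post_attention_layernorm(hidden_states)
        return residual + self.mlp(hidden_states)
```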

haiduo commented Jul 16, 2024

> The base model has a layer normalization (layernorm) layer before the LM head. Since the feature sequence has already been normalized, we do not use layer normalization.

It is true that the base model has a layer normalization (layernorm) layer before the LM head, but that alone does not explain why EAGLE removes the input_layernorm of llama. I guess this is a trick to help improve EAGLE's accuracy?
