Specify the head_size from the config when importing Gemma from Hugging Face. #1148

mfuntowicz · 2024-02-23T16:30:45Z

This fixes importing Gemma models from the Hugging Face Hub when the input and output sizes of the attention differ (as in the 7b flavour).

The current implementation inferred the head size from hidden_size, which leads to the wrong shapes being defined for such models.

This PR fixes that by explicitly setting the head_size value from GemmaConfig.
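
To illustrate the mismatch, here is a minimal sketch rather than the actual TensorRT-LLM code. `AttnConfig`, `inferred_head_size` and `resolved_head_size` are hypothetical names made up for this example; the field names mirror `hidden_size`, `num_attention_heads` and `head_dim` in the Hugging Face `GemmaConfig`, and the numbers are the published values for the 2b and 7b checkpoints.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttnConfig:
    """Hypothetical stand-in for the GemmaConfig fields that matter here."""
    hidden_size: int
    num_attention_heads: int
    head_dim: Optional[int] = None


def inferred_head_size(cfg: AttnConfig) -> int:
    # Old behaviour: derive the per-head size from hidden_size.
    return cfg.hidden_size // cfg.num_attention_heads


def resolved_head_size(cfg: AttnConfig) -> int:
    # Behaviour described in this PR: prefer the explicit value from the
    # config and only fall back to the inferred one when it is absent.
    if cfg.head_dim is not None:
        return cfg.head_dim
    return inferred_head_size(cfg)


gemma_2b = AttnConfig(hidden_size=2048, num_attention_heads=8, head_dim=256)
gemma_7b = AttnConfig(hidden_size=3072, num_attention_heads=16, head_dim=256)

for name, cfg in (("2b", gemma_2b), ("7b", gemma_7b)):
    inferred = inferred_head_size(cfg)
    resolved = resolved_head_size(cfg)
    # Width of the query projection output: num_heads * head_size.
    q_width = cfg.num_attention_heads * resolved
    print(f"{name}: inferred={inferred} resolved={resolved} "
          f"q_width={q_width} hidden_size={cfg.hidden_size}")
```

For 2b both approaches agree (256), which is why the problem only shows up on 7b: there the inferred value is 192 while the config specifies 256, so the attention projects 3072 inputs to 4096 outputs.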

Needs PR #1147

byshiue · 2024-03-04T05:53:00Z

@mfuntowicz Thanks very much for your great contribution. The changes will be included in the next main branch update on GitHub, and we will credit you as co-author. Thanks!

Specify the head_size from the config from the Hugging Face.

44405c2

mfuntowicz changed the title ~~Specify the head_size from the config from the Hugging Face.~~ Specify the head_size from the config when importing Gemma from Hugging Face. Feb 23, 2024

kaiyux mentioned this pull request Mar 5, 2024

Update TensorRT-LLM #1233

Merged

kaiyux mentioned this pull request Apr 12, 2024

Update TensorRT-LLM Release branch #1445

Merged

nv-guomingz closed this Jun 4, 2024

nv-guomingz added the triaged (Issue has been triaged by maintainers) and Merged labels Jun 4, 2024
