Add Huggingface model zoo from community #1674

Closed


matichon-vultureprime
Contributor

@matichon-vultureprime matichon-vultureprime commented May 25, 2024

Following up on [Feature Request: "Model Zoo" for quantization #1591],
this is our initial effort to create the Model Zoo.
The first model uploaded is Llama3-70b, AWQ-quantized.

While preparing it, I found that the Model Zoo has many configuration options to cover: PP_size, TP_size, KV_cache_type (fp16, fp8, int8), Group_size (64, 128), and quantization algorithm (AWQ, SQ, FP8).
I will try to work out a "proper" base configuration.

For now, I have decided to use the lowest possible Group_size (to limit the quality degradation caused by quantization) and to set PP_size to 1.

Let's discuss if we can determine the "proper" configurations.
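
As a starting point for that discussion, the configuration space listed above can be sketched in a few lines of Python. This is only an illustration: `ZooConfig`, `enumerate_configs`, and `base_config` are hypothetical names, not part of TensorRT-LLM, and the TP_size candidates are an assumption because the comment does not list them. Not every combination is necessarily meaningful (for example, Group_size may only apply to some algorithms), so the grid is an upper bound.

```python
from dataclasses import dataclass
from itertools import product

# Option values taken from the comment above.
KV_CACHE_TYPES = ("fp16", "fp8", "int8")
GROUP_SIZES = (64, 128)
QUANT_ALGOS = ("AWQ", "SQ", "FP8")


@dataclass(frozen=True)
class ZooConfig:
    """One candidate Model Zoo entry (hypothetical structure)."""
    quant_algo: str
    group_size: int
    kv_cache_type: str
    tp_size: int
    pp_size: int = 1


def enumerate_configs(tp_sizes, pp_sizes=(1,)):
    """Return the full grid of candidate configurations.

    tp_sizes is supplied by the caller: the comment does not say
    which TP_size values are of interest, so none are assumed here.
    """
    return [
        ZooConfig(algo, group, kv, tp, pp)
        for algo, group, kv, tp, pp in product(
            QUANT_ALGOS, GROUP_SIZES, KV_CACHE_TYPES, tp_sizes, pp_sizes
        )
    ]


def base_config(configs, quant_algo="AWQ"):
    """Apply the rule from the comment: smallest Group_size, PP_size == 1."""
    candidates = [
        c for c in configs if c.quant_algo == quant_algo and c.pp_size == 1
    ]
    smallest = min(c.group_size for c in candidates)
    return [c for c in candidates if c.group_size == smallest]
```

Calling `base_config(enumerate_configs(tp_sizes=(1, 2)))` leaves only the AWQ entries with Group_size 64 and PP_size 1, one per remaining KV_cache_type and TP_size combination, which makes it easier to see which choices are still open.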

@byshiue
Collaborator

byshiue commented May 28, 2024

Thank you for the PR. We will merge it soon.

@byshiue byshiue self-requested a review May 28, 2024 01:07
@byshiue byshiue self-assigned this May 28, 2024
@byshiue byshiue added the "triaged" (Issue has been triaged by maintainers) and "Community want to contribute" labels May 28, 2024
@nv-guomingz
Collaborator

Hi @matichon-vultureprime, thanks for your contribution. We've merged it into the code base and will add you to the contributor list.

Labels
Community want to contribute, Merged, triaged (Issue has been triaged by maintainers)
3 participants