binary-mlc-llm-libs

Model libraries are stored in the format:

{model_name}/{model_name}-{quantization}-{metadata}-{platform}.{suffix}

Metadata:

  • ctx: context window size
  • sw: sliding window size
  • cs: prefill chunk size

When a metadata value matches the model's default configuration, it is omitted from the file name. The prefill chunk size is also omitted when it equals the context window size or the sliding window size (the default choice).
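The naming rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used to produce these libraries: the spelling of each metadata token (`ctx2048`), the `_` separator between tokens, and dropping the whole metadata segment together with its hyphen are guesses, and `q4f16_1`, `webgpu`, and `wasm` are placeholder values for quantization, platform, and suffix. `metadata_field` and `library_path` are hypothetical helper names.

```python
from typing import Dict, Optional

# Defaults copied from the Default Metadata table (None stands for N/A).
DEFAULTS: Dict[str, Dict[str, Optional[int]]] = {
    "Llama-2-7b-chat-hf": {"ctx": 4096, "sw": None, "cs": 4096},
    "Llama-3-8b-Instruct": {"ctx": 8192, "sw": None, "cs": 1024},
    "Mistral-7B-Instruct-v0.2": {"ctx": None, "sw": 4096, "cs": 4096},
}


def metadata_field(model: str, **overrides: Optional[int]) -> str:
    """Build the {metadata} part of a name; empty when nothing needs stating."""
    defaults = DEFAULTS[model]
    values = {**defaults, **{k: v for k, v in overrides.items() if v is not None}}
    # The window in effect is the context window, else the sliding window.
    window = values["ctx"] if values["ctx"] is not None else values["sw"]
    parts = []
    for key in ("ctx", "sw"):
        if values[key] is not None and values[key] != defaults[key]:
            parts.append(f"{key}{values[key]}")  # token spelling is an assumption
    cs = values["cs"]
    # Omit cs when it is the model default or equals the window in effect.
    if cs != defaults["cs"] and cs != window:
        parts.append(f"cs{cs}")
    return "_".join(parts)  # "_" separator is an assumption


def library_path(model: str, quantization: str, platform: str, suffix: str,
                 **overrides: Optional[int]) -> str:
    """Assemble {model_name}/{model_name}-{quantization}-{metadata}-{platform}.{suffix}."""
    meta = metadata_field(model, **overrides)
    # Assumption: an empty metadata segment is dropped along with its hyphen.
    stem = "-".join(p for p in (model, quantization, meta, platform) if p)
    return f"{model}/{stem}.{suffix}"


if __name__ == "__main__":
    print(library_path("Llama-2-7b-chat-hf", "q4f16_1", "webgpu", "wasm"))
    print(library_path("Llama-2-7b-chat-hf", "q4f16_1", "webgpu", "wasm",
                       ctx=2048, cs=1024))
```

Under these assumptions, a default configuration yields a name with no metadata segment, while a non-default context window with a smaller prefill chunk size yields both a `ctx` and a `cs` token.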

Default Metadata

| Model | Context Window Size | Sliding Window Size | Prefill Chunk Size |
| --- | --- | --- | --- |
| Llama-3-8b-Instruct | 8192 | N/A | 1024 |
| Llama-3-70b-Instruct | 8192 | N/A | 1024 |
| Llama-2-7b-chat-hf | 4096 | N/A | 4096 |
| Llama-2-13b-chat-hf | 4096 | N/A | 4096 |
| Llama-2-70b-chat-hf | 4096 | N/A | 4096 |
| Mistral-7B-Instruct-v0.2 | N/A | 4096 | 4096 |
| RedPajama-INCITE-Chat-3B-v1 | 2048 | N/A | 2048 |
| phi-2 | 2048 | N/A | 2048 |
| phi-1_5 | 2048 | N/A | 2048 |
| gpt2 | 1024 | N/A | 1024 |
| gpt2-medium | 1024 | N/A | 1024 |