llama : move vocab, grammar and sampling into separate files #8508

Merged: @ggerganov merged 11 commits into master from gg/llama-reorganize on Jul 23, 2024

Conversation

@ggerganov (Owner) commented on Jul 16, 2024:

Some refactoring attempts, mainly trying to reorganize the llama code to prepare for #5214 and #5215

API Changes:

  • llama_sample_grammar -> llama_grammar_sample
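
For downstream code this is a rename of the grammar-constrained sampling entry point. A minimal call-site sketch (the parameter order of the new function is an assumption here; check llama.h for the actual signature):

    #include "llama.h"

    // apply the grammar constraints to the candidate tokens of the current step
    static void apply_grammar(struct llama_context * ctx,
                              struct llama_grammar * grammar,
                              llama_token_data_array * candidates) {
        // before this PR:
        //     llama_sample_grammar(ctx, candidates, grammar);
        // after this PR (argument order is illustrative):
        llama_grammar_sample(grammar, ctx, candidates);
    }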

Summary:

  • move llama_vocab to llama-vocab.h/.cpp
  • move tokenizer implementations from llama.cpp to llama-vocab.cpp
  • move llama_sample_ implementation to llama-sampling.h/.cpp
  • move llama_grammar_ implementation to llama-grammar.h/.cpp
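
Based on the file moves listed above, the library sources end up split roughly as follows (the layout and per-file roles are approximate):

    src/
    ├── llama.cpp                  # model loading, KV cache, state, public API glue
    ├── llama-impl.h               # shared internal helpers
    ├── llama-vocab.h/.cpp         # llama_vocab and the tokenizer implementations
    ├── llama-sampling.h/.cpp      # llama_sample_* implementations
    ├── llama-grammar.h/.cpp       # llama_grammar_* implementations
    └── unicode.h, unicode-data.h  # unicode tables/helpers used by the tokenizers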

TODO:

  • Fix Makefile header deps
  • Suffix all internal APIs with _impl for consistency
    The reason for this change is to make it easier to distinguish public from private calls without relying on function overloads. For example:
    // in llama.cpp:
    
    // without "_impl" suffix
    void llama_set_rng_seed(struct llama_context * ctx, uint32_t seed) {
        llama_set_rng_seed(&ctx->sampling, seed);
    }
    
    // with "_impl" suffix
    void llama_set_rng_seed(struct llama_context * ctx, uint32_t seed) {
        llama_set_rng_seed_impl(&ctx->sampling, seed);
    }
    And the reason to have this indirection (llama.cpp:llama_set_rng_seed -> llama-sampling.cpp:llama_set_rng_seed_impl) is to decouple llama-sampling.cpp from llama_context.
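
    A rough sketch of the resulting split (the struct and member names follow the example above and are illustrative):

    // llama-sampling.h : internal API, only sees the sampling state
    struct llama_sampling;

    void llama_set_rng_seed_impl(struct llama_sampling * smpl, uint32_t seed);

    // llama.cpp : thin public wrapper, the only side that knows about llama_context
    void llama_set_rng_seed(struct llama_context * ctx, uint32_t seed) {
        llama_set_rng_seed_impl(&ctx->sampling, seed);
    }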

Conflicting PRs:

Follow-up PRs:

@ggerganov force-pushed the gg/llama-reorganize branch 2 times, most recently from d4f8f52 to 516746a, on July 16, 2024 11:51
@github-actions bot added the "testing" (everything test related) and "examples" labels on Jul 16, 2024
@ggerganov force-pushed the gg/llama-reorganize branch 2 times, most recently from db39019 to 0049b1a, on July 16, 2024 13:40
@ggerganov force-pushed the gg/llama-reorganize branch 3 times, most recently from da7f831 to dc96d90, on July 19, 2024 13:33
Review thread on src/llama-impl.h (outdated, resolved)
@mofosyne added the "Review Complexity : Medium" label (generally requires more time to grok, but manageable by beginner to medium expertise level) on Jul 19, 2024
@ggerganov force-pushed the gg/llama-reorganize branch 2 times, most recently from ec7c6d9 to 8c5f2c2, on July 19, 2024 15:16
@oldgithubman commented:

Some refactoring attempts, mainly trying to reorganize the llama code to prepare for #5214 and #5215

API Changes:

  • llama_sample_grammar -> llama_grammar_sample

Summary:

  • move llama_vocab to llama-vocab.h/.cpp
  • move tokenizer implementations from llama.cpp to llama-vocab.cpp
  • move llama_sample_ implementation to llama-sampling.h/.cpp
  • move llama_grammar_ implementation to llama-grammar.h/.cpp

TODO:

  • Fix Makefile header deps
  • Suffix all internal APIs with _impl for consistency

Are we mitigating breakages this time?

@ggerganov ggerganov marked this pull request as ready for review July 22, 2024 17:05
@ggerganov ggerganov requested a review from slaren July 22, 2024 17:05
@ggerganov (Owner, Author) commented:

This is a first step in partitioning llama.cpp into multiple files for easier maintenance. If this seems reasonable, we can then do the same for the KV cache, LoRA, state, etc.

@slaren (Collaborator) left a review:

Looks good for a first step; it will probably need more work to completely decouple the different components. Some notes:

  • llama_get_vocab and llama_get_sampling are unused and should probably be removed
  • llm_load_vocab should eventually be moved to llama-vocab.cpp
  • The symbols in unicode.h and unicode-data.h should get a llama_ prefix or be moved to a namespace (see the sketch below)
  • LLAMA_API_INTERNAL in llama.h should be removed, and tests should include the private headers instead
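
For the unicode symbols, either option would look roughly like this (unicode_cpts_from_utf8 is used as an example; the rest of the header would follow the same pattern):

    #include <cstdint>
    #include <string>
    #include <vector>

    // option 1: C-style prefix on the existing free functions
    std::vector<uint32_t> llama_unicode_cpts_from_utf8(const std::string & utf8);

    // option 2: keep the names, wrap them in a namespace
    namespace llama {
        std::vector<uint32_t> unicode_cpts_from_utf8(const std::string & utf8);
    }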

Review thread on src/llama-impl.h (outdated, resolved)
@oldgithubman commented:

This is a first step in partitioning llama.cpp into multiple files for easier maintenance. If this seems reasonable, we can then do the same for the KV cache, LoRA, state, etc.

I think it's reasonable if you can avoid breakages

@ggerganov (Owner, Author) commented:

LLAMA_API_INTERNAL in llama.h should be removed, and tests should include the private headers instead

I started implementing that in #8643. I'll look to merge this PR in the meantime to avoid having to resolve bigger conflicts.

@ggerganov merged commit 938943c into master on Jul 23, 2024
54 checks passed
@ggerganov changed the title from "llama : refactoring" to "llama : move vocab, grammar and sampling into separate files" on Jul 23, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request on Jul 27, 2024:
llama : move vocab, grammar and sampling into separate files (ggerganov#8508)

* llama : move sampling code into llama-sampling

ggml-ci

* llama : move grammar code into llama-grammar

ggml-ci

* cont

ggml-ci

* cont : pre-fetch rules

* cont

ggml-ci

* llama : deprecate llama_sample_grammar

* llama : move tokenizers into llama-vocab

ggml-ci

* make : update llama.cpp deps [no ci]

* llama : redirect external API to internal APIs

ggml-ci

* llama : suffix the internal APIs with "_impl"

ggml-ci

* llama : clean-up