Integrating contributions from Nomic AI for integration into llama.cpp and GPT4All #352

axsaucedo · 2024-02-04T10:23:05Z

@cebtenzzre I'm creating an initial draft PR to integrate the contributions from Nomic AI for llama.cpp and GPT4All, as part of nomic-ai/gpt4all#1852.

This PR can serve as an opportunity to dive further into some of the requirements, and to align with the updates that have been added since, including the extension of the CMake build system.

Changes included in this PR:

TODO

This allows Vulkan instances to be reused, since a program may wish to switch devices, but the NVIDIA driver eventually fails to create Vulkan instances if you call vkCreateInstance/vkDestroyInstance too many times. Signed-off-by: Jared Van Bortel <[email protected]>
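The instance-reuse idea described above can be sketched roughly as follows. This is a minimal model, not Kompute's actual code: `FakeInstance`, `acquireInstance`, `DeviceManager`, and `g_stats` are hypothetical names, and `FakeInstance` stands in for a real Vulkan instance so that the sketch compiles without the Vulkan SDK. The assumption is that one instance is created lazily and then shared by every manager, so switching devices never triggers another create/destroy cycle.

```cpp
#include <cassert>
#include <memory>

// Counters for simulated instance creation and destruction (hypothetical).
struct InstanceStats {
    int created = 0;
    int destroyed = 0;
};
static InstanceStats g_stats;

// Stand-in for a Vulkan instance; real code would wrap an actual instance handle.
struct FakeInstance {
    FakeInstance() { ++g_stats.created; }
    ~FakeInstance() { ++g_stats.destroyed; }
};

// Returns the process-wide instance, creating it on first use only.
// The cached pointer lives until program exit, so the instance is never
// destroyed and recreated while the program is running.
inline std::shared_ptr<FakeInstance> acquireInstance() {
    static std::shared_ptr<FakeInstance> cached = std::make_shared<FakeInstance>();
    return cached;
}

// Hypothetical manager: each one targets a device but shares the instance.
class DeviceManager {
public:
    explicit DeviceManager(int deviceIndex)
        : instance_(acquireInstance()), deviceIndex_(deviceIndex) {
        assert(deviceIndex >= 0);
    }

    int deviceIndex() const { return deviceIndex_; }
    const FakeInstance* instance() const { return instance_.get(); }

private:
    std::shared_ptr<FakeInstance> instance_;
    int deviceIndex_;
};
```

Under this model, constructing and destroying several `DeviceManager` objects for different device indices leaves `g_stats.created` at 1 and `g_stats.destroyed` at 0 for the lifetime of the program.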

Sequences tend to reference tensors, so one cleanup pass is not enough. clear() is most useful if it frees all resources on the device, especially because we can only call vkCreateDevice/vkDestroyDevice so many times before the NVIDIA driver can no longer create devices. Signed-off-by: Jared Van Bortel <[email protected]>
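To illustrate why a single cleanup pass can leave resources behind, here is a small model, assuming a pool that owns its objects through `std::shared_ptr` and releases only those that nothing else references. `ResourcePool`, `Tensor`, `Sequence`, `sweepOnce`, and `liveCount` are hypothetical names for this sketch and do not mirror Kompute's real classes. A tensor recorded in a sequence survives the first pass, because the sequence still holds it, and only becomes releasable once the sequence itself is gone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Tensor {
    int id;
};

// A sequence keeps the tensors it recorded alive.
struct Sequence {
    std::vector<std::shared_ptr<Tensor>> recorded;
};

class ResourcePool {
public:
    std::shared_ptr<Tensor> makeTensor(int id) {
        tensors_.push_back(std::make_shared<Tensor>(Tensor{id}));
        return tensors_.back();
    }

    std::shared_ptr<Sequence> makeSequence() {
        sequences_.push_back(std::make_shared<Sequence>());
        return sequences_.back();
    }

    // One pass: drop every object that only the pool still references.
    // Returns how many objects were released.
    std::size_t sweepOnce() {
        const std::size_t before = liveCount();
        dropUnreferenced(tensors_);    // tensors held by a sequence survive this
        dropUnreferenced(sequences_);  // releasing a sequence frees its tensor refs
        const std::size_t after = liveCount();
        assert(after <= before);
        return before - after;
    }

    // Repeat passes until one of them releases nothing.
    void clear() {
        while (sweepOnce() > 0) {
        }
    }

    std::size_t liveCount() const { return tensors_.size() + sequences_.size(); }

private:
    template <typename T>
    static void dropUnreferenced(std::vector<std::shared_ptr<T>>& items) {
        items.erase(std::remove_if(items.begin(), items.end(),
                                   [](const std::shared_ptr<T>& p) {
                                       return p.use_count() == 1;
                                   }),
                    items.end());
    }

    std::vector<std::shared_ptr<Tensor>> tensors_;
    std::vector<std::shared_ptr<Sequence>> sequences_;
};
```

With one tensor recorded in one sequence and no outside handles left, the first `sweepOnce()` releases only the sequence; a further pass is needed before the tensor goes too, which is what `clear()` does by looping. Objects that user code still holds are left alone in this model.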

*Technically*, our mat*mat shaders require some level of shader float16 support. Leaving this requirement disabled allows this code to work on more GPUs without issue, but the validation layers report an error in this case. With this change, GPUs that actually claim to support shaderFloat16 should no longer report a validation error. Signed-off-by: Jared Van Bortel <[email protected]>
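One way to read the change above is that the feature is requested only when the device advertises it. The sketch below models that rule with plain structs so it builds without Vulkan headers; `Float16Features` and `selectFeatures` are hypothetical names, and the exact logic in the commit may differ.

```cpp
#include <cassert>

// Stand-in for the relevant subset of device features (hypothetical type).
struct Float16Features {
    bool shaderFloat16 = false;
};

// Request shaderFloat16 only when the caller wants it and the device
// reports support; otherwise leave it disabled, as before.
inline Float16Features selectFeatures(const Float16Features& supported,
                                      bool wantFloat16) {
    Float16Features requested;
    requested.shaderFloat16 = wantFloat16 && supported.shaderFloat16;
    // Never request a feature the device did not advertise.
    assert(!requested.shaderFloat16 || supported.shaderFloat16);
    return requested;
}
```

On a device that reports `shaderFloat16`, the feature ends up enabled and the validation layers have nothing to flag; on a device that does not, the request stays off and behaviour matches the earlier "do not request shaderFloat16" commit.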

manyoso and others added 9 commits July 21, 2023 12:18

Allow pre-allocated memory from vulkan for staging host.

d3ad3aa

Fix build.

746ff8a

Fix build again.

ae5f122

Use a global cache for the pipeline.

747aab9

Allow to set tensors.

5db4a58

Major refactor to kompute allowing controlling buffers outside of the tensor object.

7ac0862

sync changes from nomic-ai/llama.cpp

2d0a8ab

fix -Wunused-private-field warnings from clang

ed2ce32

Fixes nomic-ai/gpt4all#1722

manager: do not request shaderFloat16

d1e3b09

We don't actually need it.

axsaucedo self-assigned this Feb 4, 2024

axsaucedo mentioned this pull request Feb 4, 2024

Kompute Project contributions and support nomic-ai/gpt4all#1852

Open

axsaucedo marked this pull request as draft February 4, 2024 10:29

cebtenzzre and others added 7 commits May 8, 2024 11:22

cmake: don't install anything

c339310

libkompute.a, ShaderOpMult.hpp, and ShaderLogisticRegression.hpp are not needed at runtime by GPT4All, so don't install them. Signed-off-by: Jared Van Bortel <[email protected]>

Fix Sequence::clear()

7adef49

Signed-off-by: crydsch <[email protected]> (cherry picked from commit 2a9f82a)

plug a few memory leaks

a850b84

Signed-off-by: Jared Van Bortel <[email protected]>

cmake: don't search for fmt if the target already exists

aa57dff

Signed-off-by: Jared Van Bortel <[email protected]>
