llama.cpp/ggml-vulkan.h at 6fd413791a754598a54a366145960f2e27eec015

mirror of https://github.com/ggml-org/llama.cpp.git synced 2025-11-04 09:32:00 +00:00

0cc4m ee1628bdfe Basic Vulkan Multi-GPU implementation (#5321)

* Initial Vulkan multi-gpu implementation

Move most global variables into backend context

* Add names to backend device functions

* Add further missing cleanup code

* Reduce code duplication in tensor split layer assignment

* generalize LLAMA_SPLIT_LAYER for all backends, do not expose device count and memory in llama.h

* Only do device info print in the beginning and initialize one backend for cpu assist

Add missing cleanup code

* Rework backend memory management to make sure devices and buffers get properly allocated and freed

* Rename cpu assist free function

---------

Co-authored-by: slaren <slarengh@gmail.com>

2024-02-07 07:54:50 +01:00

1.5 KiB
