Nico Bosshard
e3f6fd56b1
ggml : dynamic ggml_sched_max_splits based on graph_size (#9047)
...
* ggml : Dynamic ggml_sched_max_splits based on graph_size
* Fixed and re-added debug code for causes
2024-08-16 04:22:55 +02:00
slaren
be55695eff
ggml-backend : fix async copy from CPU (#8897)
...
* ggml-backend : fix async copy from CPU
* cuda : more reliable async copy, fix stream used when the devices are the same
2024-08-07 13:29:02 +02:00
slaren
2b1f616b20
ggml : reduce hash table reset cost (#8698)
...
* ggml : reduce hash table reset cost
* fix unreachable code warnings after GGML_ASSERT(false)
* GGML_ASSERT(false) -> GGML_ABORT("fatal error")
* GGML_ABORT use format string
2024-07-27 04:41:55 +02:00
Johannes Gäßler
a15ef8f8a0
CUDA: fix partial offloading for ne0 % 256 != 0 (#8572)
2024-07-18 23:48:47 +02:00
hipudding
1bdd8ae19f
[CANN] Add Ascend NPU backend (#6035)
...
* [CANN] Add Ascend NPU backend
Ascend is a full-stack AI computing infrastructure for industry
applications and services based on Huawei Ascend processors and
software.
CANN (Compute Architecture of Neural Networks), developed by
Huawei, is a heterogeneous computing architecture for AI.
Co-authored-by: wangshuai09 <391746016@qq.com>
* delete trailing whitespaces
* Modify the code based on review comment
* Rename LLAMA_CANN to GGML_CANN
* Make ggml-common.h private
* add ggml_cann prefix for acl funcs
* Add logging for CANN backend
* Delete Trailing whitespace
---------
Co-authored-by: wangshuai09 <391746016@qq.com>
2024-07-17 14:23:50 +03:00
Chen Xi
b549a1bbef
[SYCL] fix the mul_mat_id ut issues (#8427)
...
* fix part of mul_mat_id
* skip the bfloat 16 sycl ut
Signed-off-by: Chen Xi <xi2chen@intel.com>
---------
Signed-off-by: Chen Xi <xi2chen@intel.com>
Co-authored-by: Meng, Hengyu <hengyu.meng@intel.com>
Co-authored-by: Chen Xi <xi2chen@intel.com>
2024-07-12 08:52:04 +08:00
Georgi Gerganov
f3f65429c4
llama : reorganize source code + improve CMake (#8006)
...
* scripts : update sync [no ci]
* files : relocate [no ci]
* ci : disable kompute build [no ci]
* cmake : fixes [no ci]
* server : fix mingw build
ggml-ci
* cmake : minor [no ci]
* cmake : link math library [no ci]
* cmake : build normal ggml library (not object library) [no ci]
* cmake : fix kompute build
ggml-ci
* make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE
ggml-ci
* move public backend headers to the public include directory (#8122)
* move public backend headers to the public include directory
* nix test
* spm : fix metal header
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* scripts : fix sync paths [no ci]
* scripts : sync ggml-blas.h [no ci]
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-06-26 18:33:02 +03:00