mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-10-27 08:21:30 +00:00
ggml : generalize quantize_fns for simpler FP16 handling (#1237)
* Generalize quantize_fns for simpler FP16 handling * Remove call to ggml_cuda_mul_mat_get_wsize * ci : disable FMA for mac os actions --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit is contained in:
3
.github/workflows/build.yml
vendored
3
.github/workflows/build.yml
vendored
@@ -137,9 +137,10 @@ jobs:
|
||||
- name: Build
|
||||
id: cmake_build
|
||||
run: |
|
||||
sysctl -a
|
||||
mkdir build
|
||||
cd build
|
||||
cmake -DLLAMA_AVX2=OFF ..
|
||||
cmake -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF ..
|
||||
cmake --build . --config Release
|
||||
|
||||
- name: Test
|
||||
|
||||
Reference in New Issue
Block a user