Mirror of https://github.com/ggml-org/llama.cpp.git (synced 2025-10-27 08:21:30 +00:00)
This fixes some failures on Turing where "round to zero" rounds to the max f16 value but the CPU reference value is infinite.
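
The mismatch described above can be reproduced on the CPU with a small sketch. It is an illustration only, not code from this repository: `to_f16` is a hypothetical helper that quantizes a value to the f16 grid under either round-to-nearest-even (`"rte"`) or round-toward-zero (`"rtz"`), assuming IEEE 754 binary16 with 11 significant bits and a largest finite value of 65504.

```python
import math

F16_MAX = 65504.0  # largest finite IEEE 754 binary16 value


def to_f16(x: float, mode: str) -> float:
    """Quantize x to the f16 grid; hypothetical helper for illustration.

    mode is "rte" (round to nearest, ties to even) or "rtz" (round toward zero).
    The result is returned as a Python float holding the f16 value.
    """
    if math.isnan(x) or math.isinf(x) or x == 0.0:
        return x
    sign = math.copysign(1.0, x)
    a = abs(x)
    _, e = math.frexp(a)           # a = m * 2**e with 0.5 <= m < 1
    ulp_exp = max(e - 11, -24)     # 11 significant bits; 2**-24 is the subnormal step
    scaled = a / 2.0 ** ulp_exp    # exact: scaling by a power of two
    if mode == "rtz":
        q = math.floor(scaled)     # truncate toward zero (a is positive here)
    else:
        q = round(scaled)          # Python's round() uses ties-to-even
    r = q * 2.0 ** ulp_exp
    if r > F16_MAX:
        # Overflow: RTZ clamps to the largest finite value, RTE produces infinity.
        return sign * (F16_MAX if mode == "rtz" else math.inf)
    return sign * r


if __name__ == "__main__":
    for v in (1.0, 65519.0, 65520.0, 70000.0):
        print(v, to_f16(v, "rte"), to_f16(v, "rtz"))
```

For an input such as 70000.0, the two modes disagree: one yields infinity and the other yields 65504, which is the kind of difference a comparison against a CPU reference would flag.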