mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2025-11-12 10:47:01 +00:00
ggml WebGPU: add support for quantization types (#15440)
* Begin work on set_rows
* Work on set rows
* Add error buffers for reporting unsupported SET_ROWS indices
* Remove extra comments
* Work on templating for different types in shaders
* Work on shader type generation
* Working q4_0 mul_mat and some templating for different types
* Add q4_0_f16 matmul and fix device init
* Add matmul support for basic quantization types
* Add q2_k and q3_k quantization
* Add rest of k-quants
* Get first i-quant working
* Closer to supporting all i-quants
* Support rest of i-quants
* Cleanup code
* Fix python formatting
* debug
* Bugfix for memset
* Add padding to end of buffers on creation
* Simplify bit-shifting
* Update usage of StringView
@@ -20,8 +20,8 @@ add_custom_command(
     COMMAND ${CMAKE_COMMAND} -E make_directory ${SHADER_OUTPUT_DIR}
     COMMAND ${CMAKE_COMMAND} -E env PYTHONIOENCODING=utf-8
         ${Python3_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/wgsl-shaders/embed_wgsl.py
-            --input "${SHADER_DIR}"
-            --output "${SHADER_HEADER}"
+            --input_dir "${SHADER_DIR}"
+            --output_file "${SHADER_HEADER}"
     DEPENDS ${WGSL_SHADER_FILES} ${CMAKE_CURRENT_SOURCE_DIR}/wgsl-shaders/embed_wgsl.py
     VERBATIM
 )
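
The hunk above renames the two flags that the build passes to embed_wgsl.py, from `--input`/`--output` to `--input_dir`/`--output_file`. The script itself is not shown on this page, so the following is only a minimal sketch of what a script accepting those two flags might look like, not the actual implementation. The helper names (`build_parser`, `load_shaders`, `render_header`), the `wgsl_` variable prefix, and the raw-string header layout are all hypothetical.

```python
import argparse
import os


def build_parser():
    # Flag names match the CMake invocation in the diff; everything else is assumed.
    parser = argparse.ArgumentParser(
        description="Embed WGSL shader sources into a C++ header (sketch)")
    parser.add_argument("--input_dir", required=True,
                        help="directory containing .wgsl files")
    parser.add_argument("--output_file", required=True,
                        help="path of the header to generate")
    return parser


def load_shaders(input_dir):
    """Return a dict mapping a C-friendly shader name to its WGSL source."""
    shaders = {}
    for fname in os.listdir(input_dir):
        if fname.endswith(".wgsl"):
            name = os.path.splitext(fname)[0].replace("-", "_")
            with open(os.path.join(input_dir, fname), encoding="utf-8") as f:
                shaders[name] = f.read()
    return shaders


def render_header(shaders):
    """Render one raw string literal per shader (hypothetical layout)."""
    lines = ["// Auto-generated (sketch); do not edit.", "#pragma once", ""]
    for name in sorted(shaders):
        # The 'wgsl' raw-string delimiter is an arbitrary choice for this sketch.
        lines.append(f'const char* wgsl_{name} = R"wgsl(')
        lines.append(shaders[name].rstrip("\n"))
        lines.append(')wgsl";')
        lines.append("")
    return "\n".join(lines)


def main(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    header = render_header(load_shaders(args.input_dir))
    with open(args.output_file, "w", encoding="utf-8") as f:
        f.write(header)
```

Calling `main(["--input_dir", shader_dir, "--output_file", header_path])` mirrors the command line that CMake would issue. The real script likely does more, since the commit message mentions templating shaders for different types, which this sketch does not attempt.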