* Begin work on set_rows
* Work on set rows
* Add error buffers for reporting unsupported SET_ROWS indices
* Remove extra comments
* Work on templating for different types in shaders
* Work on shader type generation
* Working q4_0 mul_mat and some templating for different types
* Add q4_0_f16 matmul and fix device init
* Add matmul support for basic quantization types
* Add q2_k and q3_k quantization
* Add rest of k-quants
* Get firt i-quant working
* Closer to supporting all i-quants
* Support rest of i-quants
* Cleanup code
* Fix python formatting
* debug
* Bugfix for memset
* Add padding to end of buffers on creation
* Simplify bit-shifting
* Update usage of StringView
* Minimal setup of webgpu backend with dawn. Just prints out the adapter and segfaults
* Initialize webgpu device
* Making progress on setting up the backend
* Finish more boilerplate/utility functions
* Organize file and work on alloc buffer
* Add webgpu_context to prepare for actually running some shaders
* Work on memset and add shader loading
* Work on memset polyfill
* Implement set_tensor as webgpu WriteBuffer, remove host_buffer stubs since webgpu doesn't support it
* Implement get_tensor and buffer_clear
* Finish rest of setup
* Start work on compute graph
* Basic mat mul working
* Work on emscripten build
* Basic WebGPU backend instructions
* Use EMSCRIPTEN flag
* Work on passing ci, implement 4d tensor multiplication
* Pass thread safety test
* Implement permuting for mul_mat and cpy
* minor cleanups
* Address feedback
* Remove division by type size in cpy op
* Fix formatting and add github action workflows for vulkan and metal (m-series) webgpu backends
* Fix name
* Fix macos dawn prefix path