Commit Graph

8 Commits

Author SHA1 Message Date
Sigbjørn Skjæret
3ecb2f671a ggml : implement set_rows with i32 index (#16159)
* implement set_rows with i32 index

* template fix

* test quantized path

warnings--

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* forgotten name change

* deduplicate cuda/sycl and test-fix

* indent++

* vulkan: support set_rows with i32 index type (#16162)

* disable i32 index for webgpu for now

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
2025-09-22 19:13:00 +02:00
uvos
5ba36f6103 HIP: Cleanup hipification header (#15285)
add expicit conversion operator to support older versions of rocm
Switch over to hip_bf16 from legacy hip_bfloat16
Simplify RDNA3 define
Reduce swap over of new hipblas api to rocm 6.5 as this version is used for rocm 7.0 previews

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-08-14 16:23:56 +02:00
R0CKSTAR
9b8f3c6c77 musa: fix build warnings (unused variable) (#14869)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-07-26 10:36:02 +08:00
Sigbjørn Skjæret
e28c0b80c2 cuda : implement bf16 cpy ops and enable bf16 cont (#14763)
* implement bf16 cpy ops and enable bf16 cont

* deduplicate copy functions

* deduplicate checks
2025-07-22 12:33:10 +02:00
Aman Gupta
f9a31eea06 CUDA: set_rows + cpy.cu refactor (#14712) 2025-07-18 14:54:18 +08:00
R0CKSTAR
cbc68be51d cuda: fix build warnings in set-rows.cu (unused variable) (#14687)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-07-15 15:28:53 +08:00
Sigbjørn Skjæret
923e3ea2e3 cuda : add set rows for bf16 (#14664) 2025-07-13 15:01:24 +02:00
Aman Gupta
7de5c7cab6 CUDA: add set rows for f32 and f16 (#14551)
* CUDA: add set rows for f32 and f16

* Review: change kernel params, use strides from host

* Use 1-d kernel

* Review: use int64_t for blockDim.x, rename nb->s for clarity
2025-07-12 16:31:38 +03:00