Jeff Bolz
c9a24fb932
vulkan: Support FA with any multiple of 8 head sizes (#15537)
The scalar FA shader already handled multiples of 8. The coopmat1 FA
shader assumed 16x16x16 and the shared memory allocations need the HSK
dimensions padded to a multiple of 16. NVIDIA's coopmat2 implementation
requires multiples of 16 for N and K, and needs the matrix dimensions
padded and loads clamped.
Store the FA pipelines in a map, indexed by the pipeline state.
2025-08-24 11:24:25 +02:00
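
The two ideas in the commit message, padding a head size up to a multiple of 16 and looking pipelines up in a map keyed by pipeline state, can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: `round_up`, `FaPipelineState`, `FaPipeline`, and `FaPipelineCache` are hypothetical names, the fields of the state key are assumed, and the stand-in "pipeline" only records the padded HSK instead of compiling a shader.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Round x up to the next multiple of align (align must be > 0).
constexpr uint32_t round_up(uint32_t x, uint32_t align) {
    return (x + align - 1) / align * align;
}

// Hypothetical pipeline state used as the map key; the real key may
// contain different or additional fields.
struct FaPipelineState {
    uint32_t hsk;      // K head size (a multiple of 8)
    uint32_t hsv;      // V head size (a multiple of 8)
    bool     aligned;  // assumed flag, for illustration only

    // std::map needs a strict weak ordering on its key type.
    bool operator<(const FaPipelineState &o) const {
        return std::tie(hsk, hsv, aligned) < std::tie(o.hsk, o.hsv, o.aligned);
    }
};

// Stand-in for a compiled pipeline: it only remembers the padded HSK.
struct FaPipeline {
    uint32_t padded_hsk;
};

class FaPipelineCache {
public:
    // Return the pipeline for this state, creating it on first use.
    const FaPipeline &get(const FaPipelineState &s) {
        auto it = cache_.find(s);
        if (it == cache_.end()) {
            // Pad HSK to a multiple of 16, as the coopmat paths require.
            it = cache_.emplace(s, FaPipeline{round_up(s.hsk, 16)}).first;
        }
        return it->second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::map<FaPipelineState, FaPipeline> cache_;
};
```

With this sketch, a head size of 72 (a multiple of 8 but not of 16) is padded to 80, and requesting the same state twice returns the same cached entry rather than creating a second one.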