Commit Graph

10 Commits

Author SHA1 Message Date
Francis Couture-Harpin
985cda6c7b test-model-random : add Mamba2 2025-07-07 21:07:46 -04:00
Francis Couture-Harpin
7c3f9c226f Merge branch 'master' into compilade/test-model-random 2025-06-26 17:23:16 -04:00
Francis Couture-Harpin
ccb2bb9988 test-model-random : show max error 2025-06-18 15:11:23 -04:00
Francis Couture-Harpin
9d873d7543 test-model-random : shuffle across sequences but not within
There isn't really a use-case for fully-shuffled batches

* test-model-random : use F32 as the KV cache type

Temporary until F16 is fixed on ARM when using FP16_VECTOR_ARITHMETIC
2025-06-18 15:07:24 -04:00
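
The "shuffle across sequences but not within" idea in commit 9d873d7543 can be sketched as follows. This is only an illustration of one possible reading of the commit title, not the code from the test itself: the function name `shuffle_across_sequences` and the `(seq_id, pos, token)` tuple layout are hypothetical. The order in which sequences contribute tokens is randomized, while each sequence's own tokens stay in their original order.

```python
import random

def shuffle_across_sequences(seqs, rng):
    """Randomly interleave tokens from several sequences.

    Hypothetical helper: the relative order of tokens inside each
    sequence is preserved; only the interleaving between sequences
    is random.
    """
    # One entry per token, holding the id of the sequence it belongs to.
    order = [seq_id for seq_id, seq in enumerate(seqs) for _ in seq]
    rng.shuffle(order)

    cursors = [0] * len(seqs)  # next unread position in each sequence
    interleaved = []
    for seq_id in order:
        pos = cursors[seq_id]
        interleaved.append((seq_id, pos, seqs[seq_id][pos]))
        cursors[seq_id] += 1
    return interleaved
```

Shuffling the list of sequence ids instead of the tokens themselves is what keeps positions within a sequence monotonically increasing, so no token is ever submitted before its predecessors.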
Francis Couture-Harpin
04b8f5143d Merge branch 'master' into compilade/test-model-random 2025-06-16 21:45:48 -04:00
Francis Couture-Harpin
352703b08b test-model-random : better default tensor initialization distribution 2025-06-16 21:37:45 -04:00
Francis Couture-Harpin
dfa3c18266 tests : add LLAMA, LLAMA4, and GEMMA2 to test-model-random 2025-06-13 20:02:47 -04:00
Francis Couture-Harpin
8fe213af76 tests : avoid sprintf in test-model-random 2025-06-12 02:48:11 -04:00
Francis Couture-Harpin
7657835b33 tests : fix overflow and memory leaks in test-model-random
* tests : fix integer types in test-model-random
2025-06-12 02:41:36 -04:00
Francis Couture-Harpin
9cd402cbe1 tests : add test-model-random
This generates random models and then tests different concurrencies
of batches to check if the output is consistent.

This can detect when e.g. the recurrent cache has been broken,
or anything else which would affect the consistency of the output
when inferencing multiple distinct sequences.

More architectures will be added, but for now this starts with Mamba.

Eventually, consistency of pooled embeddings will also be tested.

The goal is to reduce accidental regressions
by making it easy to quickly test a lot of edge cases
on the supported architectures,
without having to download any model.
2025-06-12 01:00:57 -04:00
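
The consistency check described in commit 9cd402cbe1 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `step` function stands in for a real recurrent model, and names such as `run_batches` and `check_consistency` are hypothetical and not taken from the test. The idea is to compute each sequence's outputs one token at a time as a reference, then replay the same tokens in differently sized, interleaved batches and confirm that the per-sequence outputs do not change.

```python
import random

def step(state, token):
    # Toy recurrent update (hypothetical stand-in for a real model).
    return (state * 31 + token + 7) % 1_000_003

def run_batches(batches, n_seqs):
    """Process batches of (seq_id, token) pairs with one state per sequence."""
    states = [0] * n_seqs
    outputs = [[] for _ in range(n_seqs)]
    for batch in batches:
        for seq_id, token in batch:
            states[seq_id] = step(states[seq_id], token)
            outputs[seq_id].append(states[seq_id])
    return outputs

def make_sequences(rng, n_seqs, max_len):
    # Random token sequences of random length (at least one token each).
    return [[rng.randrange(100) for _ in range(rng.randint(1, max_len))]
            for _ in range(n_seqs)]

def sequential_batches(seqs):
    # Reference schedule: one sequence at a time, one token per batch.
    return [[(i, tok)] for i, seq in enumerate(seqs) for tok in seq]

def interleaved_batches(seqs, batch_size):
    # Round-robin across sequences, keeping the order within each one,
    # then cut the stream into batches of at most batch_size entries.
    entries = []
    pos = 0
    while any(pos < len(seq) for seq in seqs):
        for i, seq in enumerate(seqs):
            if pos < len(seq):
                entries.append((i, seq[pos]))
        pos += 1
    return [entries[k:k + batch_size]
            for k in range(0, len(entries), batch_size)]

def check_consistency(seed=0, n_seqs=4, max_len=8):
    rng = random.Random(seed)
    seqs = make_sequences(rng, n_seqs, max_len)
    reference = run_batches(sequential_batches(seqs), n_seqs)
    for batch_size in (1, 2, 3, 5, 8):
        got = run_batches(interleaved_batches(seqs, batch_size), n_seqs)
        if got != reference:
            return False  # batching changed a sequence's output
    return True
```

Calling `check_consistency()` with several seeds exercises different sequence lengths and batch splits. The toy model uses integer arithmetic so exact equality is a valid comparison; a test on real floating-point outputs would presumably compare against a tolerance instead, which is in line with the later "show max error" commit.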