Kawrakow f5e5753c32
Fix Qwen35 mtp warmup (#1987)
* Use hidden state from prev token from qwen mtp

* Fix Qwen35 MTP warmup

* Cleanup + remove unnecessary performance crippling by not using accept to sample draft token

* Provide API to get the model arch string

---------

Co-authored-by: SamuelOliveirads <samueloliveira32df@gmail.com>
2026-06-18 09:03:40 +02:00