Kawrakow f5e5753c32
Fix Qwen35 mtp warmup (#1987)
* Use hidden state from prev token from qwen mtp

* Fix Qwen35 MTP warmup

* Clean up and stop needlessly crippling performance: do not use accept to sample the draft token

* Provide API to get the model arch string

---------

Co-authored-by: SamuelOliveirads <samueloliveira32df@gmail.com>
2026-06-18 09:03:40 +02:00