ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-06-28 04:30:15 -05:00

History

Fix Qwen35 mtp warmup (#1987)

* Use hidden state from prev token from qwen mtp

* Fix Qwen35 MTP warmup

* Cleanup + remove the unnecessary crippling of performance by not using accept to sample the draft token

* Provide API to get the model arch string

---------

Co-authored-by: SamuelOliveirads <samueloliveira32df@gmail.com>

2026-06-18 09:03:40 +02:00

llama.h

Fix Qwen35 mtp warmup (#1987)

2026-06-18 09:03:40 +02:00