mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-06-28 04:30:15 -05:00
* Use hidden state from prev token from qwen mtp
* Fix Qwen35 MTP warmup
* Cleanup + remove unnecessary crippling performance by not using accept to sample draft token
* Provide API to get the model arch string

---------

Co-authored-by: SamuelOliveirads <samueloliveira32df@gmail.com>