Mirror of https://github.com/ikawrakow/ik_llama.cpp.git (synced 2026-06-28 04:30:15 -05:00)
* Avoid copying the per-step SSM state (CUDA)
* Avoid copying the per-step SSM state (CPU)
* Allocate only what is necessary for per-step SSM state
* Cleanup