mirror of
https://github.com/ikawrakow/ik_llama.cpp.git
synced 2026-06-28 04:30:15 -05:00
* Also take into account KV cache
* Take into account attn_wkv_b and mla = 3 compute buffers