Samuel Oliveira Alves
557b674f63
Add llama_context to MTP (#1601)
* wip: separate llama_context for MTP with graph reuse
* wip: fix KV cache desync with separate MTP context
* refactor: remove dead mtp logic code, encapsulate KV mirroring
* mtp-context: derive args directly from the main model's context
* mtp: fix kv cache positions
* clean small comments
* minor refactor for context shift
2026-04-09 15:33:56 +02:00