Samuel Oliveira Alves 11a1fea9e2
Move embedding management to speculative (#1825)
* refactor speculative decoding with companion context and draft result structures

* feat: add common speculative feature handling in server context

* refactor: move embeddings outside server

* feat: harden draft input hidden state in llama context

* remove unused functions

* refactor: streamline speculative feature handling and remove unused code

* remove redundant code

* remove more unused variables

* refactor: implement speculative feature handling
2026-05-20 17:42:48 +03:00