Samuel Oliveira Alves be8435793e
Pre-allocate buffers for hybrid model checkpoints (#1774)
* hybrid-spec: improve recurrent checkpoint handling in speculative decoding

* change per-step save to support scheduling and asynchronous tensor operations

* remove redundant backend tensor fallback

* improve recurrent tensor handling for split graph
2026-05-12 07:21:25 +03:00