Georgi Gerganov
5dcb711666
speculative : fix n_outputs_max and remove draft-simple auto-enable (#23988)
* speculative : add common_speculative_n_max helper function
Extract the speculative max-draft-size logic from server_n_outputs_max
into a reusable common_speculative_n_max() function in common/speculative.
Assisted-by: llama.cpp:local pi
* cont : draft context always has n_parallel outputs
* llama : log n_outputs_max
* speculative : remove draft-simple auto-enable
* ci : enable server tests on PRs
2026-06-01 22:26:58 +03:00