Default Branch

ebd048fc5e · opencl: flash attention improvement (#25069) · Updated 2026-06-27 17:36:06 -05:00

Branches

5a7462237e · remove duplicated init calls · Updated 2026-06-19 04:07:38 -05:00

106 behind · 18 ahead

37db4fa4be · improve test · Updated 2026-06-17 10:42:56 -05:00

145 behind · 2 ahead

42874dfd8f · clean up logging and timing · Updated 2026-06-17 06:47:53 -05:00

216 behind · 6 ahead

fcff47bcb1 · Merge branch 'master' into add-long-debug-prompt · Updated 2026-06-15 10:05:22 -05:00

172 behind · 2 ahead

911b67a603 · update erroneous case in PEG parser test · Updated 2026-06-15 08:14:52 -05:00

175 behind · 2 ahead

7c7be0fbc3 · cont : fix cur_buf_size init after flushing a buffer · Updated 2026-06-15 08:02:50 -05:00

174 behind · 2 ahead

9ede367e6b · Revert "Purge trailing spaces from grammar generation" · Updated 2026-06-14 17:17:53 -05:00

191 behind · 3 ahead

4c4a3e2596 · Some renames to make @CISC happy :> · Updated 2026-06-14 12:52:23 -05:00

196 behind · 2 ahead

3518061868 · fit : wrap llama_device_memory_data · Updated 2026-06-12 10:12:24 -05:00

217 behind · 1 ahead

41f049a840 · Revert "speculative : fix "ngram-map-k4v" name in logging (#24253)" · Updated 2026-06-10 02:31:42 -05:00

241 behind · 1 ahead

6c2cbc4e33 · vulkan: disable FA mask_opt on GCN to improve performance · Updated 2026-06-09 08:40:07 -05:00

247 behind · 1 ahead

b6cf9cd8fe · mtmd, llama: shared backend sched · Updated 2026-06-09 08:34:17 -05:00

247 behind · 1 ahead

9eb4e9dbb7 · nits · Updated 2026-06-08 15:13:07 -05:00

261 behind · 3 ahead

22634e0eee · Add tensor name to JSON output · Updated 2026-06-06 15:33:01 -05:00

285 behind · 14 ahead

37c56c245e · wip · Updated 2026-06-06 08:30:41 -05:00

299 behind · 14 ahead

9cc707ad7b · vulkan: fix check results async upload issue · Updated 2026-05-31 02:52:49 -05:00

389 behind · 1 ahead

926b94a1bc · server: allow API calls to set a lower thinking budget if a global budget is set · Updated 2026-05-30 01:51:06 -05:00

402 behind · 1 ahead

24c307d261 · server: create checkpoint on task cancel · Updated 2026-05-28 16:19:01 -05:00

427 behind · 1 ahead

f3ba33ec35 · address feedback · Updated 2026-05-25 01:52:58 -05:00    jdelony

522 behind · 2 ahead

baac31998a · cont : another try · Updated 2026-05-25 01:14:48 -05:00    jdelony

519 behind · 3 ahead