Default Branch

ebd048fc5e · opencl: flash attention improvement (#25069) · Updated 2026-06-27 17:36:06 -05:00

Branches

3e26ea607d · cont : CMN_ -> COM_ · Updated 2026-06-27 10:27:42 -05:00

2
4

8cb25f997d · server : hint preserve_thinking when supported by chat template · Updated 2026-06-27 09:45:41 -05:00

1
3

0aac18be29 · use memcpy for small copies across host-visible memory · Updated 2026-06-27 05:59:58 -05:00

2
9

f7c1df6502 · metal : per-op source split + parallel compile (#24021) · Updated 2026-06-27 04:15:51 -05:00

2
1

c35f33b0a1 · fix missing mparams.hf_file · Updated 2026-06-26 06:17:04 -05:00

17
2

5004859421 · server-stream : pimpl · Updated 2026-06-26 03:36:15 -05:00

19
1

81313a35ae · type check for get_arr_int · Updated 2026-06-25 11:54:57 -05:00

31
4

2e4cbade70 · Merge branch 'master' into xsn/mtmd_ds_ocr_tiles · Updated 2026-06-25 09:28:50 -05:00

31
10

68ed5149fb · bring back examples, add mtmd · Updated 2026-06-25 08:23:03 -05:00

65
2

bf05250df9 · use unsigned ints · Updated 2026-06-25 08:02:50 -05:00

34
3

3199d5357c · chat: harden caps check · Updated 2026-06-24 08:16:42 -05:00

48
1

a14f8d2ed5 · fix test case · Updated 2026-06-24 06:38:25 -05:00

48
3

ef687feb42 · common: remove unused json-partial · Updated 2026-06-24 05:49:42 -05:00

49
1

a432e6f863 · use destructor instead · Updated 2026-06-23 15:57:20 -05:00

58
10

095058ca19 · add arg --threads-sampling · Updated 2026-06-22 13:03:49 -05:00

67
4

1b82e9ae51 · fix windows · Updated 2026-06-22 09:20:56 -05:00

69
7

037397792a · vulkan: split ggml-vulkan.cpp file · Updated 2026-06-22 08:50:01 -05:00

70
1

7ac864bf97 · disable DEBUG_TIMINGS · Updated 2026-06-21 06:38:09 -05:00

88
15

f1ef61fb1b · server: add "verbose" field to schema · Updated 2026-06-21 04:16:06 -05:00

82
1

447b0c3646 · poc: threadpool sampling · Updated 2026-06-20 15:08:42 -05:00

88
14