Default Branch

ebd048fc5e · opencl: flash attention improvement (#25069) · Updated 2026-06-27 17:36:06 -05:00

Branches

d273bfd2c9 · allocator: cleanup, more comments · Updated 2023-07-22 08:05:24 -05:00 · jdelony · 8985 behind, 21 ahead

d45c1631bc · metal : rewrite to fit new backend interface correctly (WIP) · Updated 2023-07-20 14:51:19 -05:00 · jdelony · 8985 behind, 18 ahead

0492363137 · mpi : fix after master merge · Updated 2023-07-09 14:23:04 -05:00 · jdelony · 9016 behind, 21 ahead

26cc1bd7a2 · llama : uniform variable names + struct init · Updated 2023-07-05 15:22:17 -05:00 · jdelony · 9033 behind, 4 ahead

ff6e39f138 · use javascript generators as much cleaner API · Updated 2023-07-05 14:03:01 -05:00 · jdelony · 9046 behind, 20 ahead

f46db27ea0 · ci : disable FMA on Mac OS · Updated 2023-07-05 10:29:08 -05:00 · jdelony · 9043 behind, 5 ahead

5cc672a9a5 · metal : try to utilize more of the shared memory using smaller views · Updated 2023-06-26 14:23:04 -05:00 · jdelony · 9080 behind, 1 ahead

78fafcaf10 · ggml : do not use _GNU_SOURCE gratuitously · Updated 2023-06-25 09:21:02 -05:00 · jdelony · 9088 behind, 1 ahead

20054a38c1 · Fix directory name · Updated 2023-05-26 18:00:08 -05:00 · jdelony · 9238 behind, 1 ahead

a1cdd29cd2 · ggml : rms_norm in chunks · Updated 2023-05-20 02:15:54 -05:00 · jdelony · 9259 behind, 2 ahead

95dc4d7270 · Merge 'origin/master' into steering · Updated 2023-05-19 15:19:57 -05:00 · jdelony · 9261 behind, 9 ahead

40ec4882c4 · ggml : use F16C conversion when available · Updated 2023-05-17 12:05:51 -05:00 · jdelony · 9270 behind, 1 ahead

a3e6d62283 · cuda : alternative q4_q8 kernel · Updated 2023-05-12 09:02:39 -05:00 · jdelony · 9304 behind, 8 ahead

e116eb638c · ggml : speed-up Q5_0 + Q5_1 at 4 threads · Updated 2023-05-11 10:51:56 -05:00 · jdelony · 9306 behind, 20 ahead

4baa85633a · Fix build · Updated 2023-05-06 20:44:07 -05:00 · jdelony · 9314 behind, 5 ahead

31ff9e2e83 · ci : add cublas to windows release · Updated 2023-05-03 16:21:20 -05:00 · jdelony · 9329 behind, 1 ahead

102cd98074 · ggml : Q4_3c using 2x "Full range" approach · Updated 2023-04-23 06:56:44 -05:00 · jdelony · 9410 behind, 8 ahead

71e6ae3779 · ggml : continue from #729 (wip) · Updated 2023-04-22 10:49:07 -05:00 · jdelony · 9410 behind, 7 ahead

a0242a833c · Minor, plus rebase on master · Updated 2023-04-22 09:07:10 -05:00 · jdelony · 9410 behind, 2 ahead

4b8d5e3890 · llama : quantize attention results · Updated 2023-04-22 03:35:13 -05:00 · jdelony · 9415 behind, 1 ahead