Default Branch

f96eaddba8 · Revert DFlash SWA optimization (#2039) · Updated 2026-06-26 04:00:09 -05:00

Branches

349455d1f8 · Improve Q4_0 and Q8_0 performance on AVX2/Zen4 · Updated 2024-09-14 05:19:53 -05:00    jdelony

4666
3426

cb369c22dd · Some tweaks for iq2_k and iq3_k · Updated 2024-09-13 11:50:46 -05:00    jdelony

4666
3426

ebc88e5d9a · Fix bug and D < 128 case for Q8_0 k-cache · Updated 2024-09-12 23:17:24 -05:00    jdelony

4666
3423

27fa27daf9 · Disallow mixing bf16 with other types for kv caches · Updated 2024-09-12 10:55:13 -05:00    jdelony

4666
3430

95fe6923ad · Fix Zen4 · Updated 2024-09-11 11:44:34 -05:00    jdelony

4666
3424

d063007d24 · Delete commented out stuff · Updated 2024-09-11 01:50:45 -05:00    jdelony

4666
3423

e3919f5f80 · Fix ARM_NEON · Updated 2024-09-10 11:14:59 -05:00    jdelony

4666
3422

65555e504c · iq2_tn: slightly better performance on AVX2 · Updated 2024-09-10 05:54:20 -05:00    jdelony

4666
3418

8cb5e74e26 · iq2_tn: reuse iq2_bn implementation (Zen4) · Updated 2024-09-10 02:39:09 -05:00    jdelony

4666
3418

7d8e49ef1b · Some cleanup · Updated 2024-09-10 01:08:19 -05:00    jdelony

4666
3418

a9b15ed82e · Delete forgotten TODO · Updated 2024-09-09 12:10:11 -05:00    jdelony

4666
3419

237a2380ee · Remove unnecessary barrier in ggml_compute_forward_mul_mat · Updated 2024-09-09 04:53:23 -05:00    jdelony

4666
3423

b7f7eede8a · iq2_tn: slightly faster PP · Updated 2024-09-08 04:26:43 -05:00    jdelony

4666
3414

d2225010b9 · Fused rms_norm WIP · Updated 2024-09-08 00:06:38 -05:00    jdelony

4666
3418

8d47523e7e · Improve TG speed (when not memory bound) · Updated 2024-09-04 23:47:19 -05:00    jdelony

4666
3414

c624232525 · Zen4 Flash Attnetion: improving bf16 · Updated 2024-09-04 09:44:29 -05:00    jdelony

4666
3414

9d3460446d · WIP: trying to improve legacy quants · Updated 2024-09-03 23:21:38 -05:00    jdelony

4666
3411

fffd040281 · Delete unused stuff · Updated 2024-09-03 05:12:33 -05:00    jdelony

4666
3413

de91911d7a · Fix Zen4 Flash Attention · Updated 2024-09-02 07:51:10 -05:00    jdelony

4666
3408

6bc273c1d6 · Do not process prompts containing binary data for escapes · Updated 2024-09-02 01:12:08 -05:00    jdelony

4666
3407