Default Branch

f96eaddba8 · Revert DFlash SWA optimization (#2039) · Updated 2026-06-26 04:00:09 -05:00

Branches

9b783bf22e · Fix Qwen3.5/3.6 MTP and -muge · Updated 2026-05-17 08:55:20 -05:00    jdelony

161
1

fcbb17a8ea · Consistently use --mtp-requantize-output-tensor · Updated 2026-05-16 10:41:12 -05:00    jdelony

162
2

b5da743b3c · Handle interleaved types · Updated 2026-05-16 01:38:18 -05:00    jdelony

164
3

25ab82abd3 · imatrix: use data for ffn_up when data for ffn_gate is missing · Updated 2026-05-14 23:24:07 -05:00    jdelony

168
1

da4ebcc719 · Slightly better · Updated 2026-05-14 09:55:24 -05:00    jdelony

170
2

62755c24e8 · Fix ggml_nbytes · Updated 2026-05-13 09:28:06 -05:00    jdelony

176
1

67735a4587 · More MTP tweaks · Updated 2026-05-13 04:37:26 -05:00    jdelony

176
1

26591f2b57 · Cleanup · Updated 2026-05-13 00:10:41 -05:00    jdelony

181
2

68f36e7878 · Gemma4 MTP: avoid casting KV cache to f32 · Updated 2026-05-12 06:41:48 -05:00    jdelony

181
1

1d9d2b7f1b · Fix GLM-4.5 MTP loading · Updated 2026-05-12 01:56:15 -05:00    jdelony

181
1

7f29d4a670 · rebase branch with main · Updated 2026-05-11 20:53:33 -05:00    jdelony

185
1

16369dbf0f · MTP: Reuse graphs (again) · Updated 2026-05-11 10:31:15 -05:00    jdelony

185
1

b28ddd49e3 · Cleanup · Updated 2026-05-11 07:22:50 -05:00    jdelony

186
4

54262626b7 · Avoid recurrent state copy · Updated 2026-05-11 04:43:22 -05:00    jdelony

187
1

d81090541b · MTP: enable per step recurrent state for split mode graph · Updated 2026-05-10 07:47:30 -05:00    jdelony

190
1

e7f8d7cdbd · Fix Mistral3 split mode graph · Updated 2026-05-10 00:46:40 -05:00    jdelony

190
1

f6deca0f97 · Faster per step recurrent state restore when using MTP · Updated 2026-05-09 08:31:03 -05:00    jdelony

192
1

43df4192d6 · Avoid some code duplication · Updated 2026-05-08 08:46:10 -05:00    jdelony

197
2

010da571be · Use async copies to save/restore recurrent state · Updated 2026-05-08 08:04:00 -05:00    jdelony

197
1

d0c4dd6c55 · Fix discarding tokens from the KV cache during MTP drafting · Updated 2026-05-07 23:51:59 -05:00    jdelony

198
1