ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-06-28 04:30:15 -05:00

main

f96eaddba8 · Revert DFlash SWA optimization (#2039) · Updated 2026-06-26 04:00:09 -05:00

ik/revert_dflash_swa_opt 0440345ba9 · Revert DFlash SWA optimization · Updated 2026-06-26 03:58:50 -05:00	1 1
ik/dflash_tweaks a4e408611d · Minor DFlash tweaks · Updated 2026-06-25 10:10:16 -05:00	5 1
ik/qwen35_mtp_smgraph e1670f6c6c · Merge remote-tracking branch 'origin/main' into ik/qwen35_mtp_smgraph · Updated 2026-06-24 11:32:10 -05:00	9 7
ik/g4_assistant_smgraph 1f5828eaa4 · It is better to use llama_context pointers as keys · Updated 2026-06-24 08:53:59 -05:00	11 5
ik/tensor_names 9283af5ed8 · Avoid Gemma4 assistant strange tensor name warnings · Updated 2026-06-24 04:20:41 -05:00	11 1
fcp/checkpoint_min_var 3476dd6a40 · server: variance based checkpoint eviction · Updated 2026-06-23 21:41:08 -05:00	17 1
ik/purge_blas 3cf0f5468f · Also these · Updated 2026-06-19 10:24:24 -05:00	29 2
ik/gemma4_mtp_last_device e734b76632 · Force Gemma4 assistant to be loaded on last GPU · Updated 2026-06-19 08:51:11 -05:00	29 2
ik/gemma4_mtp_graph_reuse d1692e1951 · Allow graph reuse for Gemma4 MTP · Updated 2026-06-19 04:34:45 -05:00	29 1
ik/compat_g4_assistant 25d91dea44 · Add compatibility for llama.cpp Gemma4 assistant GGUFs · Updated 2026-06-19 02:50:26 -05:00	30 1
ik/fix_gemma4_mtp 67b0b22760 · Fix Gemma4 MTP compute graph · Updated 2026-06-18 10:51:22 -05:00	34 2
ik/glm_mtp_warmup 2c1dc8781b · Fix MTP warmup for GLM models · Updated 2026-06-18 08:15:10 -05:00	34 1
ik/fix_qwen_mtp_warmup dc81d79cb6 · Provide API to get the model arch string · Updated 2026-06-17 11:18:32 -05:00	39 4
ik/dflash_fix_smgraph 5b9c3bbc3b · Fix DFlash performance with split mode graph · Updated 2026-06-17 00:46:05 -05:00	39 1
ik/dflash_fix_cpu 6f45163a95 · Fix DFlash on the CPU · Updated 2026-06-16 08:22:36 -05:00	42 0	Included
ik/fattn_mma_gqa_16 6be3a488d3 · CUDA FA: faster TG when GQA is 16 and head size is 128 · Updated 2026-06-15 06:46:02 -05:00	63 0	Included
ik/minimaxm3_smgraph c24d50dd88 · Split mode graph for MiniMax-M3 · Updated 2026-06-15 03:41:34 -05:00	71 0	Included
ik/fix_1961 c73bfbe9ce · Fix #1961 · Updated 2026-06-14 02:42:39 -05:00	77 0	Included
ik/handle_think_no_space 175819b4fb · Style · Updated 2026-06-12 01:19:06 -05:00	85 0	Included
ik/mmq_show_error_details c622ea37d3 · More info · Updated 2026-06-11 09:06:07 -05:00	88 2
