ik_llama.cpp/ggml at f43a9f1cf6d9322c0a91718a05cf6dc750af489b - ik_llama.cpp - Jared's Git Server

jdelony/ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-06-28 04:30:15 -05:00

Joel Farthing f43a9f1cf6

Add per-byte CUDA MoE offload threshold (#1813)

Co-authored-by: Joel Farthing <262452229+joelfarthing@users.noreply.github.com>

2026-05-19 08:35:05 +03:00

..

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

MTP: faster recurrent state restore (#1791)

2026-05-13 11:00:24 +03:00

Add per-byte CUDA MoE offload threshold (#1813)

2026-05-19 08:35:05 +03:00

.gitignore

Merge mainline llama.cpp (#3)

2024-07-27 07:55:01 +02:00

CMakeLists.txt

ggml : default GGML_WIN_VER to 0x0A00 (Windows 10) (#1755)

2026-05-08 13:23:04 +03:00