llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-27 23:50:20 -05:00

History

Load all MoE experts during warmup (#11571 )

* llama : introduce llama_set_warmup() API call that controls warmup mode; use all MoE experts during warmup

* common : use new API to enable warmup mode during model warmup

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>

2025-03-14 13:47:05 +01:00

llama-cpp.h

llama : add llama_vocab, functions -> methods, naming (#11110 )

2025-01-12 11:32:42 +02:00

llama.h

Load all MoE experts during warmup (#11571 )

2025-03-14 13:47:05 +01:00