llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-27 23:50:20 -05:00

server : fix crash when seq_rm fails for hybrid/recurrent models (#18391)

* server : fix crash when seq_rm fails for hybrid/recurrent models

* server : add allow_processing param to clear_slot

2025-12-26 16:35:29 +01:00

batched-bench

tool/ex/tests: consistently free ctx, then model (#18168)

2025-12-22 11:00:37 +01:00

cli

gen-docs: automatically update markdown file (#18294)

2025-12-22 19:30:19 +01:00

completion

gen-docs: automatically update markdown file (#18294)

2025-12-22 19:30:19 +01:00

cvector-generator

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

export-lora

cmake : Do not install tools on iOS targets (#15903)

2025-09-16 09:54:44 +07:00

fit-params

fit-params : fix race condition in fit-params output (#18276)

2025-12-24 15:57:38 +01:00

gguf-split

cli: new CLI experience (#17824)

2025-12-10 15:28:59 +01:00

imatrix

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

llama-bench

tool/ex/tests: consistently free ctx, then model (#18168)

2025-12-22 11:00:37 +01:00

mtmd

model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106)

2025-12-19 00:18:01 +01:00

perplexity

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

quantize

cli: new CLI experience (#17824)

2025-12-10 15:28:59 +01:00

rpc

Install rpc-server when GGML_RPC is ON. (#17149)

2025-11-11 10:53:59 +00:00

run

Manually link -lbsd to resolve flock symbol on AIX (#16610)

2025-10-23 19:37:31 +08:00

server

server : fix crash when seq_rm fails for hybrid/recurrent models (#18391)

2025-12-26 16:35:29 +01:00

tokenize

cmake : Do not install tools on iOS targets (#15903)

2025-09-16 09:54:44 +07:00

tts

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

CMakeLists.txt

llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653)

2025-12-15 09:24:59 +01:00