10 Commits

Author SHA1 Message Date
Marian M.
5fb707d19b
Update docs (#1956)
* Update README.md

Models, MTP, fit

* Update parameters.md

Disclaimer, terms, new flags, graph split list.
2026-06-12 08:24:22 +02:00
Samuel Oliveira Alves
007d640098
Standardize speculative decoding arguments on the server (#1908)
* refactor spec args

* add shell-safe quoting of string-valued stage keys in speculative decoding
2026-06-04 15:44:57 +02:00
Samuel Oliveira Alves
f4f4b3ff26
Allow dual speculative decoding (#1789)
* wip: test logic to use multiple specs

* feat: introduce composite speculative decoding stages

* handle MTP context and draft invalidation

* fix: allow gemma mtp for speculative stages

* fix: normalize spec stage keys

* refactor: remove enable_mtp flag and improve speculative stage handling

* fix: update cached text tokens handling for stage chains

* feat: implement sync for external MTP after non-MTP accept
2026-05-15 10:10:40 +03:00
Marian M.
b2e7f7f6cd
Update docs (#1800)
* Update README.md

- New model
- New features

* Update parameters.md

- Recent new parameters
2026-05-14 08:44:58 +03:00
mcm007
5720a4131a
Update docs (#1606)
* Update parameters.md

- list sm graph architectures
- gpu tips
- build options and parameters

* Update README.md

- Gemma4
2026-04-10 18:20:28 +02:00
mcm007
d557d6c098
Update docs (#1574)
* Update README.md

- Model support
- KV cache improvements

* Update parameters.md

- KV Q4_0 improvements
- wgt, with notice
- mtmd-kq-type
2026-04-03 08:30:29 +02:00
mcm007
028fc79710
Update README.md and parameters docs (#1550)
* Update parameters.md with recent changes

* Update README.md with recent changes

- Hadamard for V cache
- AVX-VNNI optimizations
- Auto-fit
2026-03-29 18:52:08 +02:00
Kawrakow
4b1a6560a8
Update parameters documentation with new options (#1511)
Added descriptions for `--fit` and `--fit-margin`.
2026-03-25 18:23:44 +01:00
mcm007
bfef07d10b
Update README.md and parameters.md with recent improvements (#1423)
* Improve text formatting

* Update README.md with recent models and features

* Update parameters.md with recent additions

* Remove deprecated from parameters.md
2026-03-14 18:14:20 +01:00
mcm007
b2cb4512c5
Create parameters overview (#1269)
* raw parameters.md

* fix small typos in common.cpp

* Update build args in parameters.md

* Update parameters.md

- format as table
- sections

* Update README.md

- quickstart
- build and run

* Update parameters.md

other tools examples

* add PR links

* multiple updates to parameters.md

- description
- add jargon section
- add suggestions from feedback

* don't imply that only linux is supported in README.md

* add alias to parameters.md

* Update README.md with recent models and features

* Update parameters.md with latest features

* address suggestions

- no-ooae
- placeholder for common commands
- no-kv-offload
- llama-sweep-bench
- placeholder for unique parameters

* specify Linux distro in README.md
2026-02-20 07:20:56 +01:00