server: refactor batch construction (#24843)

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-27 23:50:20 -05:00

* server: refactor batch construction

* wip

* wip 2

* wip 3

* wip 4

* add abort_all_slots

* handle batch full more carefully

* fix assert

* rm debug log

* small nits

* (debug) add timings

* debug: force llama_synchronize for accurate timings

* address comments

* disable DEBUG_TIMINGS


Xuan-Son Nguyen, 2026-06-21 14:16:11 +02:00 (committed by GitHub)

parent 0d135df48c

commit bddfd2b113

No known key found for this signature in database

GPG Key ID: B5690EEEBB952194

1 changed file with 583 additions and 351 deletions

tools/server/server-context.cpp (934 lines changed; diff suppressed because it is too large)
