mirror of https://github.com/ggml-org/llama.cpp.git
synced 2026-06-27 23:50:20 -05:00
server: refactor batch construction (#24843)
* server: refactor batch construction
* wip
* wip 2
* wip 3
* wip 4
* add abort_all_slots
* handle batch full more carefully
* fix assert
* rm debug log
* small nits
* (debug) add timings
* debug: force llama_synchronize for accurate timings
* address comments
* disable DEBUG_TIMINGS
parent 0d135df48c
commit bddfd2b113
File diff suppressed because it is too large