* remove redundant apple job
openvino gpu and cpu tests can share the same build and machine
Update build-rpc.yml
Update build-openvino.yml
cpu any doesn't make sense as we have an arm job already, so do high perf on both x86 and arm
remove duplicate x86 vulkan
combine backend sampling
Update server.yml
run server on arm as windows is x86
* emdawn on one machine only
* fix openvino, remove cpu tag as we don't have many x64 machines with that tag
* ci : separate CUDA windows workflow + fix names
* ci : rename workflow
* ci : prefix cache names with workflow name
* ci : rename build.yml -> build-cpu.yml
* ci : cache keys
* ci : fix windows cuda/hip concurrency of release workflow
* ci : fix apple cache names
* ci : add TODOs
* cont : keep just the last cache
* ci : update release concurrency to queue
* ci : move the release trigger to ubuntu-slim
* ci : hip add TODO
* cont : improve words
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* pi : update
* ci : fix ios build
* ci : fix android
* ci : fix apple builds
* cmake : add install() for impl libraries
Add install(TARGETS <target> LIBRARY) for all -impl libraries that were
changed from STATIC to shared (controlled by BUILD_SHARED_LIBS) in
commit bb28c1fe2. Without this, cmake --install fails to copy the shared
libraries, causing runtime errors like:
    llama-server: error while loading shared libraries: libllama-server-impl.so
Ref: https://github.com/ggml-org/llama.cpp/issues/23494#issuecomment-4512912515
Assisted-by: llama.cpp:local pi
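For illustration, a minimal sketch of the install rule described above. The target name is inferred from the library name in the error message, and the source file name is hypothetical; the actual targets and file layout in the tree may differ.

```cmake
# Sketch only: target name inferred from libllama-server-impl.so,
# source file name is a placeholder.
# With no explicit STATIC/SHARED keyword, add_library() follows BUILD_SHARED_LIBS.
add_library(llama-server-impl server-impl.cpp)

# Without this rule, `cmake --install` does not copy the shared library.
# Omitting DESTINATION relies on the GNUInstallDirs defaults (CMake >= 3.14).
install(TARGETS llama-server-impl LIBRARY)
```

Note that LIBRARY covers .so/.dylib artifacts; on DLL platforms the .dll is a RUNTIME artifact, so a RUNTIME rule would likely be needed there as well.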
* ci : fix xcframework build