ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-06-28 04:30:15 -05:00

Author	SHA1	Message	Date
Yadir Hernandez Batista	db31e7d803	Added workflow to build container images (#1279) * ci: implement build matrix for CUDA/CPU containers with dynamic tagging * fix: Updated Docker images/build-container.yml * fix: Updated the documentation about Docker * fix: Set Arch for 3090s * fix: Updated build step name. * fix: Set target ARCH as a variable * feat: Added cleanup step * feat: Added docker-bake and updated workflow * fix: Issue with REPO_OWNER variable * fix: Updated workflow to solve errors * fix: Updated branch format * fix: Wrong naming * Update docker-bake.hcl * Update build-container.yml * Update ik_llama-cuda.Containerfile * Update ik_llama-cpu.Containerfile * Update docker-bake.hcl * Update build-container.yml * Removed action/cache * added -sSL for reliability and fixed the URL path * added -sSL for reliability and fixed the URL path CUDA containerfile * fix: correct Dockerfile RUN command syntax errors - Combine split apt-get install commands in both Containerfiles - Fix broken cmake command continuation in ik_llama-cuda.Containerfile * fix: correct llama-swap download URL in Containerfiles - Fix broken line continuation in curl download URL for llama-swap * perf: improve ccache configuration in Containerfiles - Add CCACHE_UMASK=000 for cache accessibility across stages - Add CCACHE_MAXSIZE=1G to prevent unbounded growth - Initialize ccache with ccache -i during build stage * fix: remove problematic ccache initialization from Containerfiles - ccache -i fails because CCACHE_DIR mount doesn't exist yet at build time * fix: add git to CPU Containerfile build dependencies - Resolves CMake warning about missing Git for build info * chore: optimize Containerfile with smaller images and better healthchecks - Add --no-install-recommends to all apt-get commands for smaller image size - Add ca-certificates to base stage for HTTPS support - Merge redundant build copy commands from 3 layers to 1 - Fix llama-swap version from 198 to v199 (latest release) - Add 
HEALTHCHECK configuration with interval/timeout/retries to server and swap stages - Copy /app/lib in server stage to fix container startup * chore: fix CUDA Containerfile healthchecks and swap version - Add /app/lib copy in server stage to fix container startup - Fix llama-swap version from 198 to v199 (latest release) - Add HEALTHCHECK configuration with interval/timeout/retries * chore: fix indentation in Containerfiles and add LD_LIBRARY_PATH for server target * fix: add --break-system-packages flag for pip in CPU Containerfile * feat: add git bind mount for build info and NCCL support for CUDA * fix: remove libnccl-dev from CUDA build (already included in base image) * fix: added Markdown files to ignore files * feat: use BUILD_NUMBER-COMMIT pattern for docker image tags - Add BUILD_NUMBER and LLAMA_COMMIT to build workflow - Update docker-bake.hcl to use version tag format matching llama-server --version output - Format: VARIANT-BUILD_NUMBER-COMMIT (e.g., cu12-full-4406-3bc90dfd) * fix: fetch full git history for accurate BUILD_NUMBER - Add fetch-depth: 0 to actions/checkout to get all commits - This ensures git rev-list --count HEAD returns correct total commit count * fix: fetch full git history in Dockerfile for accurate BUILD_NUMBER - Add git fetch --unshallow to get complete commit history during build - This ensures build-info.cpp is generated with correct LLAMA_BUILD_NUMBER * chore: update GitHub Actions to latest versions for Node.js 24 compatibility - docker/setup-buildx-action@v3 -> v4 - docker/login-action@v3 -> v4 * chore: update all GitHub Actions to Node.js 24 compatible versions - actions/checkout@v4 -> v6 - docker/setup-buildx-action@v3 -> v5 - docker/login-action@v3 -> v6 - docker/bake-action@v5 -> v7 * fix: use CI-passed BUILD_NUMBER and LLAMA_COMMIT in Dockerfile - Add BUILD_NUMBER and LLAMA_COMMIT as build args - Fall back to git commands if not provided - Pass values explicitly to cmake for accurate build info * fix: pass BUILD_NUMBER and 
LLAMA_COMMIT as Docker build args - Add BUILD_NUMBER and LLAMA_COMMIT to docker bake args - These will be used by the Containerfile for accurate build info * fix: revert docker actions to v4 (latest available versions) * fix: calculate BUILD_NUMBER and LLAMA_COMMIT directly in Containerfile - Removed ARG defaults since we calculate from git during build - Use git rev-list --count HEAD and git rev-parse for accurate version info - Falls back to 0/unknown if git commands fail * feat: calculate BUILD_NUMBER and LLAMA_COMMIT in Containerfiles - Add git-based version calculation in both CPU and CUDA Containerfiles - Remove .git bind mount (git is copied with COPY .) - Pass build info to CMake for accurate llama-server --version output * feat: calculate BUILD_NUMBER and LLAMA_COMMIT in Containerfiles - Add git-based version calculation using git rev-list and git rev-parse - Copy .git directory separately to ensure git commands work during build - Pass build info to CMake for accurate llama-server --version output * fix: cache improvements for CUDA and CPU builds * fix: "/.git": not found * fix: Unnecessary mv llama-swap * fix: Remove BUILD_NUMBER and LLAMA_COMMIT from docker file, calculated by cmake proc * fix: remove .git from dockerignore for local and CI builds - Enables cmake to access .git directory during Docker build - Required for version calculation in llama-server binary - GitHub Actions uses explicit mount via bake action set parameter * fix: Remove mounts key from Build and Push step in gh workflow * ci: add .git verification step before build * refactor: standardize Containerfile structure and remove .git mount dependency - Remove --mount=type=bind,source=.git,target=.git from both Containerfiles - Replace COPY . . 
with git clone for cleaner build context - Add CUSTOM_COMMIT ARG for optional custom commit switching - Standardize ARG/ENV ordering and comment formatting across CPU/CUDA variants - Install ca-certificates before git clone to fix SSL verification issues - Rename 'Structured artifact collection' to 'Collect build artifacts' * ci: remove broken cache pruning step * ci: remove broken prune-cache job - Remove prune-cache job that was failing due to missing .git directory - The job required a checkout step and the cache pruning logic was non-critical * chore: Removed step for Verifying .git existance in GH workflow * fix: ensure build always proceeds even if git switch fails - Add '|| true' to git switch command so build continues on failure - This prevents the entire RUN step from failing when CUSTOM_COMMIT is invalid * fix: resolve Docker build pipeline issues - Remove external git clone from Containerfiles, use build context directly - Add BUILD_NUMBER and BUILD_COMMIT as CMake cache variables in build-info.cmake - Fix .devops/tools.sh inclusion by using explicit COPY for hidden directories - Set USE_CCACHE=true for CI builds - Clean up unused SHA_SHORT variable from docker-bake.hcl Fixes: Build steps were cached incorrectly due to external git clone ignoring the actual build context source. 
* fix: include .git in Docker build context and add verification * ci: add .git directory verification step after checkout * build: fix .git mount path for Docker build context compatibility * build: fix .git mount path for Docker build context compatibility * docker: include .git in build context for version calculation * ci: add .git directory verification step after checkout * chore: Removed unecessary Verify .git step (It was a test) * docs: update README with docker-bake and build-local.sh instructions * docs: remove build-local.sh reference (not in repo) * ci: optimize disk usage by limiting fetch depth and cleaning workspace --------- Co-authored-by: HP Prodesk <sourceupdev@gmail.com>	2026-04-10 08:06:47 +02:00
Olivier Chafik	b267b997c5	`build`: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809) * `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew * server: update refs -> llama-server gitignore llama-server * server: simplify nix package * main: update refs -> llama fix examples/main ref * main/server: fix targets * update more names * Update build.yml * rm accidentally checked in bins * update straggling refs * Update .gitignore * Update server-llm.sh * main: target name -> llama-cli * Prefix all example bins w/ llama- * fix main refs * rename {main->llama}-cmake-pkg binary * prefix more cmake targets w/ llama- * add/fix gbnf-validator subfolder to cmake * sort cmake example subdirs * rm bin files * fix llama-lookup-* Makefile rules * gitignore /llama-* * rename Dockerfiles * rename llama|main -> llama-cli; consistent RPM bin prefixes * fix some missing -cli suffixes * rename dockerfile w/ llama-cli * rename(make): llama-baby-llama * update dockerfile refs * more llama-cli(.exe) * fix test-eval-callback * rename: llama-cli-cmake-pkg(.exe) * address gbnf-validator unused fread warning (switched to C++ / ifstream) * add two missing llama- prefixes * Updating docs for eval-callback binary to use new `llama-` prefix. * Updating a few lingering doc references for rename of main to llama-cli * Updating `run-with-preset.py` to use new binary names. Updating docs around `perplexity` binary rename. * Updating documentation references for lookup-merge and export-lora * Updating two small `main` references missed earlier in the finetune docs. * Update apps.nix * update grammar/README.md w/ new llama-* names * update llama-rpc-server bin name + doc * Revert "update llama-rpc-server bin name + doc" This reverts commit e474ef1df481fd8936cd7d098e3065d7de378930. 
* add hot topic notice to README.md * Update README.md * Update README.md * rename gguf-split & quantize bins refs in **/tests.sh --------- Co-authored-by: HanClinto <hanclinto@gmail.com>	2024-06-13 00:41:52 +01:00
Kevin Ji	f71145524c	docker : ignore Git files (#3314)	2023-10-02 11:53:53 +03:00
Henri Vasserman	984b7495ed	ROCm Port (#1087) * use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com> Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com> Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com> Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com> Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> Co-authored-by: jammm <2500920+jammm@users.noreply.github.com> Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>	2023-08-25 12:09:42 +03:00
Pavol Rusnak	e4d3b4b251	Fix whitespace, add .editorconfig, add GitHub workflow (#883)	2023-04-11 19:45:44 +00:00
Bernat Vadell	afcd16588e	🚀 Dockerize llamacpp (#132) * feat: dockerize llamacpp * feat: split build & runtime stages * split dockerfile into main & tools * add quantize into tool docker image * Update .devops/tools.sh Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * add docker action pipeline * change CI to publish at github docker registry * fix name runs-on macOS-latest is macos-latest (lowercase) * include docker versioned images * fix github action docker * fix docker.yml * feat: include all-in-one command tool & update readme.md --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-17 10:47:06 +01:00

6 Commits