jdelony

jdelony synced commits to ik/dflash_tweaks at jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00

jdelony synced new reference ik/dflash_tweaks to jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00

jdelony synced commits to ik/revert_dflash_swa_opt at jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00

jdelony synced new reference ik/revert_dflash_swa_opt to jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00

jdelony synced commits to main at jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00

f96eaddba8 Revert DFlash SWA optimization (#2039)

1255b1e479 Minor DFlash tweaks (#2034)

af62a37acd Prune examples/llava. Dead code. (#2025)

c713bd599b llama : fix CPU-only load crash on a CUDA build (device_mem out-of-bounds) (#2037)

0ffdf509ab ggml : fix set_rows CPU crash when the destination is F32 (#2038)

6 commits in total

jdelony synced and deleted reference refs/tags/cisc/quantize-moe-mtp-fix at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced commits to 0cc4m/vulkan-d2d-copy at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced new reference 0cc4m/vulkan-d2d-copy to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced commits to 0cc4m/vulkan-submission-threshold-flops at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced new reference 0cc4m/vulkan-submission-threshold-flops to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced commits to dev-metal at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

f7c1df6502 metal : per-op source split + parallel compile (#24021)

9bebfcb4bc sycl : fix failed ut cases of norm (#25044)

0b6529d818 vulkan: fix step operator for 0 input (#25036)

c299a92c38 binaries : Improve rpc-server and export-graph-ops names. (#25045)

0275c0f800 ci : add windows-openvino to check-release (#25022)

69 commits in total

jdelony synced commits to gg/preserve-thinking-hint at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced new reference gg/preserve-thinking-hint to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced commits to gg/server-logs-reduce at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced new reference gg/server-logs-reduce to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced commits to gg/server-stream-clean-up at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced new reference gg/server-stream-clean-up to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced commits to master at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

ebd048fc5e opencl: flash attention improvement (#25069)

0ed235ea2c [CUDA] Added a cudaMemcpy2DAsync fast path to ggml_cuda_cpy (#25057)

9bebfcb4bc sycl : fix failed ut cases of norm (#25044)

0b6529d818 vulkan: fix step operator for 0 input (#25036)

c299a92c38 binaries : Improve rpc-server and export-graph-ops names. (#25045)

42 commits in total

jdelony synced commits to xsn/fix_handling_spec_hf at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00

jdelony synced new reference xsn/fix_handling_spec_hf to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00