Joined on 2025-04-01
jdelony synced commits to ik/dflash_tweaks at jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00
jdelony synced new reference ik/dflash_tweaks to jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00
jdelony synced commits to ik/revert_dflash_swa_opt at jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00
jdelony synced new reference ik/revert_dflash_swa_opt to jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00
jdelony synced commits to main at jdelony/ik_llama.cpp from mirror 2026-06-28 04:30:15 -05:00
f96eaddba8 Revert DFlash SWA optimization (#2039)
1255b1e479 Minor DFlash tweaks (#2034)
af62a37acd Prune examples/llava. Dead code. (#2025)
c713bd599b llama : fix CPU-only load crash on a CUDA build (device_mem out-of-bounds) (#2037)
0ffdf509ab ggml : fix set_rows CPU crash when the destination is F32 (#2038)
Compare 6 commits
jdelony synced and deleted reference refs/tags/cisc/quantize-moe-mtp-fix at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced commits to 0cc4m/vulkan-d2d-copy at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced new reference 0cc4m/vulkan-d2d-copy to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced commits to 0cc4m/vulkan-submission-threshold-flops at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced new reference 0cc4m/vulkan-submission-threshold-flops to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced commits to dev-metal at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
f7c1df6502 metal : per-op source split + parallel compile (#24021)
9bebfcb4bc sycl : fix failed ut cases of norm (#25044)
0b6529d818 vulkan: fix step operator for 0 input (#25036)
c299a92c38 binaries : Improve rpc-server and export-graph-ops names. (#25045)
0275c0f800 ci : add windows-openvino to check-release (#25022)
Compare 69 commits
jdelony synced commits to gg/preserve-thinking-hint at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced new reference gg/preserve-thinking-hint to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced commits to gg/server-logs-reduce at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced new reference gg/server-logs-reduce to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced commits to gg/server-stream-clean-up at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced new reference gg/server-stream-clean-up to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced commits to master at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
ebd048fc5e opencl: flash attention improvement (#25069)
0ed235ea2c [CUDA] Added a cudaMemcpy2DAsync fast path to ggml_cuda_cpy (#25057)
9bebfcb4bc sycl : fix failed ut cases of norm (#25044)
0b6529d818 vulkan: fix step operator for 0 input (#25036)
c299a92c38 binaries : Improve rpc-server and export-graph-ops names. (#25045)
Compare 42 commits
jdelony synced commits to xsn/fix_handling_spec_hf at jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00
jdelony synced new reference xsn/fix_handling_spec_hf to jdelony/llama.cpp from mirror 2026-06-27 23:50:20 -05:00