Nexes the Elder 9eaf86a7c7
Fix minor CUDA discrepencies (#2005)
* CUDA : typo

* CUDA: Add missing GGML_CALL to function definition

* CUDA: only log GGML_CUDA_FORCE_MMQ/CUBLAS when enabled

* CUDA: Fix softcap bug in flash_attn_tile_ext_f16

The else branch (softcap != 0) incorrectly called launch_fattn_tile_f16_64_128
with use_softcap=false instead of true, causing logit softcap to be silently
ignored for the col_per_block=32, parallel_blocks=1 path.
2026-06-23 09:37:48 +02:00
..
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00
2024-07-27 07:55:01 +02:00
2025-07-15 08:03:13 +02:00
2026-06-22 16:36:34 +02:00