Xuan-Son Nguyen 506bb6e010
model: try to improve Qwen3 Next (#18683)
* qwen3next: simplify qkvz projection

* use ggml_swiglu_split

* revert swiglu_split, but remove redundant repeat()

* fix missing reshape

* rm 2 redundant transposes

* move mul_mat(k,q) to outside of chunking

* rm redundant cont

* improve g_cs_chunk

* add comments about no cont

* use std::pair instead of ggml_concat

* vectorize key_gdiff calculation

* rm unused tensor

* avoid ggml_concat inside loop

* bring back ggml_concat as it may not work on other backends

* nits
2026-01-11 12:53:33 +01:00