Yiwei Shao
ee051c1e4e
hexagon: support for IQ4_NL and MXFP4 (#21018)
* ggml-hexagon: add IQ4_NL and MXFP4 HMX matmul support
- Add IQ4_NL quantization type support to Hexagon backend (buffer
  set/get tensor repack, mul_mat, mul_mat_id dispatch)
- Implement HVX IQ4_NL vec_dot kernels (1x1, 2x1, 2x2) with
  LUT-based 4-bit index to int8 kvalue dequantization
- Add MXFP4 HMX dequantization path with E8M0 scale conversion,
  including batch-4 fast path and single-tile fallback
- Unify quantized row size / scale offset logic to handle Q4_0,
  Q8_0, IQ4_NL, and MXFP4 in the DMA fetch path
* ggml-hexagon: fix SKIP_QUANTIZE src1 address mismatch in mixed-quant models
* Fix the pragma indent
2026-03-27 09:22:41 -07:00
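
The two dequantization steps named in the commit message (the IQ4_NL 4-bit index to int8 lookup, and the E8M0 scale conversion used by MXFP4) can be illustrated with a scalar reference sketch. This is not the HVX/HMX code from the commit. The helper names `iq4nl_unpack_block`, `iq4nl_vec_dot_block`, and `e8m0_to_f32` are hypothetical, the scales are passed as plain `float` for simplicity, and the nibble layout (low nibbles first, then high nibbles) is assumed to follow ggml's CPU reference for 32-value blocks. The real kernels may also fold constant factors into the scale differently.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

#define QK 32 /* values per quantized block (assumed, as in ggml's IQ4_NL) */

/* Non-linear 4-bit codebook ggml uses for IQ4_NL. */
static const int8_t kvalues_iq4nl[16] = {
    -127, -104, -83, -65, -49, -35, -22, -10, 1, 13, 25, 38, 53, 69, 89, 113
};

/* Hypothetical helper: expand 16 packed bytes into 32 int8 values via the LUT.
 * Low nibbles fill out[0..15], high nibbles fill out[16..31]. */
static void iq4nl_unpack_block(const uint8_t qs[QK / 2], int8_t out[QK]) {
    assert(qs != 0 && out != 0);
    for (int j = 0; j < QK / 2; ++j) {
        out[j]          = kvalues_iq4nl[qs[j] & 0x0F];
        out[j + QK / 2] = kvalues_iq4nl[qs[j] >> 4];
    }
}

/* Hypothetical scalar stand-in for a 1x1 vec_dot over one block:
 * integer dot product of LUT-decoded weights with int8 activations,
 * then a single multiply by both block scales. */
static float iq4nl_vec_dot_block(float d_w, const uint8_t qs[QK / 2],
                                 float d_a, const int8_t a[QK]) {
    int8_t w[QK];
    iq4nl_unpack_block(qs, w);
    int32_t acc = 0;
    for (int i = 0; i < QK; ++i) {
        acc += (int32_t)w[i] * (int32_t)a[i];
    }
    return d_w * d_a * (float)acc;
}

/* E8M0 is an exponent-only 8-bit scale: value = 2^(e - 127),
 * with 0xFF reserved for NaN. */
static float e8m0_to_f32(uint8_t e) {
    if (e == 0xFF) {
        return NAN;
    }
    return ldexpf(1.0f, (int)e - 127);
}
```

Keeping the accumulation in integers and applying the scales once per block is what makes a table lookup attractive here: the LUT maps each 4-bit index straight to an int8 operand, so no floating-point work happens inside the inner loop.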