ggml-cpu: add 128-bit RVV implementation for Quantization Vector Dot (#20633)

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-27 23:50:20 -05:00

* ggml-cpu: add 128-bit impls for i-quants, ternary quants

* ggml-cpu: add 128-bit impls for iq2_xs, iq3_s, iq3_xxs, tq2_0

Co-authored-by: Rehan Qasim <rehan.qasim@10xengineers.ai>

* ggml-cpu: refactor; add rvv checks

---------

Co-authored-by: taimur-10x <taimur.ahmad@10xengineers.ai>
Co-authored-by: Rehan Qasim <rehan.qasim@10xengineers.ai>

Author: rehan-10xengineer

Date: 2026-04-16 13:15:15 +05:00

Committed by: GitHub

parent 5637536517

commit 1e796eb41f

No known key found for this signature in database

GPG Key ID: B5690EEEBB952194

1 changed file with 902 additions and 70 deletions

ggml/src/ggml-cpu/arch/riscv/quants.c (972 lines changed; diff suppressed because it is too large)
