☆22 · Apr 17, 2025 · Updated last year
Alternatives and similar repositories for dkernel
Users that are interested in dkernel are comparing it to the libraries listed below.
- Using FlexAttention to compute attention with different masking patterns · ☆47 · Sep 22, 2024 · Updated last year
- KV cache compression for high-throughput LLM inference · ☆158 · Feb 5, 2025 · Updated last year
- Beyond KV Caching: Shared Attention for Efficient LLMs · ☆20 · Jul 19, 2024 · Updated last year
- ☆15 · Nov 23, 2023 · Updated 2 years ago