A simple PyTorch implementation of high-performance Multi-Query Attention.
☆16 · Aug 23, 2023 · Updated 2 years ago
Alternatives and similar repositories for MultiQueryAttention
Users who are interested in MultiQueryAttention compare it to the repositories listed below.
- ☆19 · Feb 2, 2026 · Updated last month
- Official Implementation for NorMuon paper ☆61 · Mar 11, 2026 · Updated last week
- ☆16 · Sep 17, 2024 · Updated last year
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆46 · Jul 17, 2024 · Updated last year
- The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model… ☆15 · Dec 11, 2023 · Updated 2 years ago
- ☆23 · Mar 7, 2025 · Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers" ☆41 · Mar 31, 2025 · Updated 11 months ago
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models". ☆18 · Apr 25, 2025 · Updated 10 months ago
- This repository is the official implementation of Bidirectional Learning for Offline Infinite-width Model-based Optimization (NeurIPS 202… ☆14 · Jan 19, 2023 · Updated 3 years ago
- Spatial Spectral Machine Learning ☆14 · Oct 15, 2025 · Updated 5 months ago
- Calculating FLOPs of Pre-trained Models in NLP ☆18 · Mar 29, 2021 · Updated 4 years ago
- ☆39 · May 20, 2025 · Updated 10 months ago
- ☆13 · Mar 13, 2023 · Updated 3 years ago
- ☆19 · Jun 13, 2024 · Updated last year
- PyTorch Implementation: Code for the paper "Generalizing to Unseen Domains via Adversarial Data Augmentation", NeurIPS 2018. Origin Tenso… ☆14 · Sep 17, 2020 · Updated 5 years ago
- Model implementation for the contextual embeddings project ☆42 · Jun 2, 2025 · Updated 9 months ago
- Official Code for Guided Trajectory Generation with Diffusion Models for Offline Model-based Optimization (NIPS 2024) ☆22 · Aug 15, 2024 · Updated last year
- Experiments on Multi-Head Latent Attention ☆100 · Aug 19, 2024 · Updated last year
- Image Artisan XL is the ultimate desktop application for creating amazing images with the power of artificial intelligence. ☆18 · Apr 25, 2024 · Updated last year
- verl: Volcano Engine Reinforcement Learning for LLMs ☆38 · Jun 23, 2025 · Updated 8 months ago
- 📰 Must-read papers on Diffusion Models for Text Generation 🔥 ☆19 · Jun 21, 2024 · Updated last year
- Implementation of the paper 'Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance' (EMNLP 2025) ☆28 · Dec 16, 2025 · Updated 3 months ago
- Website for CSE 234, Winter 2025 ☆13 · Mar 24, 2025 · Updated 11 months ago
- ☆24 · Feb 16, 2022 · Updated 4 years ago
- ☆28 · May 24, 2025 · Updated 9 months ago
- [ICLR 2025] Official Implementation of ParetoFlow: Guided Flows in Multi-Objective Optimization🧬🧬🧬 ☆29 · Mar 3, 2025 · Updated last year
- Repository for the DPP'23 course ☆11 · May 2, 2024 · Updated last year
- This repository provides a framework to serve LLM (Large Language Model) based applications such as Chatbot. ☆18 · Apr 20, 2023 · Updated 2 years ago
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and … ☆11 · Jun 18, 2024 · Updated last year
- Normalize CJK characters in text ☆14 · Sep 30, 2025 · Updated 5 months ago
- PyTorch implementation of the unsupervised SimCSE semantic similarity model ☆23 · May 13, 2021 · Updated 4 years ago
- The codes of our paper "ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion" ☆14 · Jun 29, 2025 · Updated 8 months ago
- Triangle in Practice! Triton ☆16 · Feb 15, 2024 · Updated 2 years ago
- ☆11 · Sep 20, 2024 · Updated last year
- Vietnamese GPT-J API service deployed with Docker & Helm chart ☆10 · Dec 11, 2022 · Updated 3 years ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta ☆13 · Nov 11, 2024 · Updated last year
- Hal Daume's hbc ☆20 · Jan 23, 2010 · Updated 16 years ago
- Code for EMNLP2020 paper: "Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space" ☆26 · May 10, 2021 · Updated 4 years ago
- The official evaluation suite and dynamic data release for MixEval. ☆11 · Sep 23, 2024 · Updated last year