This is a simple torch implementation of high-performance Multi-Query Attention.
☆16 · Aug 23, 2023 · Updated 2 years ago
Alternatives and similar repositories for MultiQueryAttention
Users that are interested in MultiQueryAttention are comparing it to the libraries listed below.
- Official Implementation for NorMuon paper ☆66 · Mar 11, 2026 · Updated last month
- ☆16 · Sep 17, 2024 · Updated last year
- [TMLR 2025 & ICLR 2025 DeLTa] Official Implementation of Design Editing for Offline Model-based Optimization 🧬 🤖 ☆10 · Apr 17, 2025 · Updated last year
- The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model… ☆16 · Dec 11, 2023 · Updated 2 years ago
- ☆23 · Mar 7, 2025 · Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers" ☆43 · Mar 31, 2025 · Updated last year
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models". ☆18 · Apr 25, 2025 · Updated last year
- Calculating FLOPs of Pre-trained Models in NLP ☆18 · Mar 29, 2021 · Updated 5 years ago
- Benchmark for Biophysical Sequence Optimization Algorithms ☆21 · Apr 15, 2026 · Updated 2 weeks ago
- [ACMMM 2022] ReCoRo: Region-Controllable Robust Light Enhancement by User-Specified Imprecise Masks ☆15 · Feb 6, 2023 · Updated 3 years ago
- ☆39 · May 20, 2025 · Updated 11 months ago
- Code for "Multi-Objective GFlowNets" ☆19 · Jul 12, 2023 · Updated 2 years ago
- Parallel Self-Adjusting Computation ☆16 · Jul 5, 2021 · Updated 4 years ago
- ☆13 · Mar 13, 2023 · Updated 3 years ago
- Experiments on Multi-Head Latent Attention ☆101 · Aug 19, 2024 · Updated last year
- Image Artisan XL is the ultimate desktop application for creating amazing images with the power of artificial intelligence. ☆18 · Apr 25, 2024 · Updated 2 years ago
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆18 · Updated this week
- Implementation of the paper 'Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance' (EMNLP 2025) ☆28 · Dec 16, 2025 · Updated 4 months ago
- Website for CSE 234, Winter 2025 ☆14 · Mar 24, 2025 · Updated last year
- ☆28 · May 24, 2025 · Updated 11 months ago
- Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity" ☆32 · May 28, 2025 · Updated 11 months ago
- ☆45 · Jun 7, 2024 · Updated last year
- POSTECH: Compiler Construction (Spring 2022) ☆11 · Mar 10, 2023 · Updated 3 years ago
- ☆13 · May 11, 2023 · Updated 2 years ago
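- A tutorial on the A* (A-star) algorithm written while learning it, sharing the code with more people. Let's learn together. ☆16 · Apr 5, 2018 · Updated 8 years ago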
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and … ☆11 · Jun 18, 2024 · Updated last year
- Normalize CJK characters in text ☆14 · Sep 30, 2025 · Updated 7 months ago
- Triangle in practice! Triton ☆16 · Feb 15, 2024 · Updated 2 years ago
- ☆11 · Sep 20, 2024 · Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆11 · Sep 23, 2024 · Updated last year
- Explore how Flux Dev responds when you change the strengths of layers in the model. ☆21 · Sep 20, 2024 · Updated last year
- Official PyTorch implementation of CD-MOE ☆12 · Mar 18, 2026 · Updated last month
- An innovative method designed to augment the capabilities of existing video diffusion models ☆22 · May 10, 2024 · Updated last year
- ☆18 · Jun 9, 2024 · Updated last year
- Zeta implementation of a reusable and plug in and play feedforward from the paper "Exponentially Faster Language Modeling" ☆16 · Nov 11, 2024 · Updated last year
- T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation (ICCV'25) ☆51 · Oct 6, 2025 · Updated 6 months ago
- PathPiece tokenizer ☆14 · Nov 10, 2024 · Updated last year
- Given an input RGB image, we generate novel viewpoints that simulate a 3D interactive experience. ☆23 · Apr 26, 2023 · Updated 3 years ago
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult… ☆11 · Aug 29, 2023 · Updated 2 years ago