kyegomez / MGQALinks
The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints"
☆15Updated last year
Alternatives and similar repositories for MGQA
Users that are interested in MGQA are comparing it to the libraries listed below
Sorting:
- A simple reproducible template to implement AI research papers☆24Updated 9 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆98Updated 8 months ago
- Pytorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States"☆25Updated this week
- A repository for DenseSSMs☆87Updated last year
- Official Pytorch Implementation of Self-emerging Token Labeling☆34Updated last year
- Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis"☆56Updated 10 months ago
- Distributed Optimization Infra for learning CLIP models☆26Updated 8 months ago
- ☆42Updated 7 months ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models☆30Updated 8 months ago
- Implementation of Infini-Transformer in Pytorch☆111Updated 5 months ago
- This is the official repo for ByteVideoLLM/Dynamic-VLM☆20Updated 6 months ago
- ☆50Updated last year
- ☆50Updated 5 months ago
- [ECCV 2024] FlexAttention for Efficient High-Resolution Vision-Language Models☆40Updated 5 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆95Updated this week
- EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023)☆29Updated last year
- [WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling☆53Updated last month
- ☆115Updated 10 months ago
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆35Updated last year
- Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024)☆29Updated 10 months ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"☆101Updated 9 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆20Updated 2 months ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆37Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆93Updated 6 months ago
- ☆181Updated 9 months ago
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆26Updated last year
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024☆60Updated 4 months ago
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation"☆24Updated 2 months ago
- Official repository for the General Robust Image Task (GRIT) Benchmark☆54Updated 2 years ago
- Benchmarking Attention Mechanism in Vision Transformers.☆18Updated 2 years ago