CatworldLee / Gaussian-Mixture-Mask-Attention
☆9Updated 9 months ago
Alternatives and similar repositories for Gaussian-Mixture-Mask-Attention
Users who are interested in Gaussian-Mixture-Mask-Attention are comparing it to the libraries listed below
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.☆18Updated 7 months ago
- PyTorch implementation of StableMask (ICML'24)☆13Updated last year
- ☆13Updated 2 years ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…☆27Updated last year
- HGRN2: Gated Linear RNNs with State Expansion☆55Updated 11 months ago
- Fork of Flame repo for training of some new stuff in development☆14Updated 3 weeks ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis"☆55Updated 11 months ago
- PyTorch implementation for "Preserving Linear Separability in Continual Learning by Backward Feature Projection" (CVPR 2023)☆18Updated 2 years ago
- ☆26Updated last year
- Official code for the paper "Attention as a Hypernetwork"☆40Updated last year
- Unofficial implementation of the paper "Exploring the Space of Key-Value-Query Models with Intention"☆12Updated 2 years ago
- Source code for a LoRA-based continual relation extraction method.☆12Updated last year
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆15Updated 5 months ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆33Updated last year
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆45Updated last year
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models☆32Updated 10 months ago
- Generating Summaries with Controllable Readability Levels (EMNLP 2023)☆14Updated this week
- WeChat official account: 机器感知 (Machine Perception) | Tracking the Latest arXiv Papers☆38Updated 2 months ago
- Experimental scripts for researching data adaptive learning rate scheduling.☆23Updated last year
- [ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning☆11Updated 11 months ago
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens☆25Updated last year
- ☆21Updated 3 months ago
- [TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"☆20Updated last year
- ☆44Updated 2 months ago
- Structured Pruning Adapters in PyTorch☆19Updated last year
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)☆22Updated last year
- The official GitHub page for the survey paper "A Survey of RWKV".☆27Updated 7 months ago
- Collect papers about Mamba (a selective state space model).☆14Updated last year
- A repository for DenseSSMs☆88Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆31Updated 2 years ago