CatworldLee / Gaussian-Mixture-Mask-Attention

☆9

Alternatives and similar repositories for Gaussian-Mixture-Mask-Attention

Users who are interested in Gaussian-Mixture-Mask-Attention are comparing it to the repositories listed below.

top-yun / SPARK
A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.
☆18 Updated 7 months ago
MikaStars39 / StableMask
PyTorch implementation of StableMask (ICML'24)
☆13 Updated last year
zhuyunqi96 / LoraLPrun
☆13 Updated 2 years ago
RAIVNLab / MatFormer-OLMo
Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…
☆27 Updated last year
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆55 Updated 11 months ago
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆14 Updated 3 weeks ago
UCDvision / NOLA
Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis"
☆55 Updated 11 months ago
rvl-lab-utoronto / BFP
PyTorch implementation of "Preserving Linear Separability in Continual Learning by Backward Feature Projection" (CVPR 2023)
☆18 Updated 2 years ago
XavierGrool / FGFusion
☆26 Updated last year
smonsays / hypernetwork-attention
Official code for the paper "Attention as a Hypernetwork"
☆40 Updated last year
Eliyas0007 / Pytorch-Intention
Unofficial implementation of the paper "Exploring the Space of Key-Value-Query Models with Intention"
☆12 Updated 2 years ago
Raincleared-Song / ConPET
Source code for a LoRA-based continual relation extraction method.
☆12 Updated last year
tianyi-lab / R2-T2
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
☆15 Updated 5 months ago
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆33 Updated last year
PKU-ML / non_neg
Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning
☆45 Updated last year
csarron / PuMer
[ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
☆32 Updated 10 months ago
amazon-science / controllable-readability-summarization
Generating Summaries with Controllable Readability Levels (EMNLP 2023)
☆14 Updated this week
JiauZhang / tracking-arxiv
WeChat official account: 机器感知 (Machine Perception) | Tracking the Latest Arxiv Papers
☆38 Updated 2 months ago
facebookresearch / adaptive_scheduling
Experimental scripts for researching data adaptive learning rate scheduling.
☆23 Updated last year
gccnlp / Light-PEFT
[ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
☆11 Updated 11 months ago
DRSY / KV_Compression
[EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens
☆25 Updated last year
zehanwang01 / FreeBind
☆21 Updated 3 months ago
UCSC-VLAA / Sight-Beyond-Text
[TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
☆20 Updated last year
NJUNLP / PATS
☆44 Updated 2 months ago
LukasHedegaard / structured-pruning-adapters
Structured Pruning Adapters in PyTorch
☆19 Updated last year
qiuzh20 / RMoE
Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts)
☆22 Updated last year
MLGroupJLU / RWKV-Survey
The official GitHub page for the survey paper "A Survey of RWKV".
☆27 Updated 7 months ago
caojiaolong / Awesome-Mamba
Collect papers about Mamba (a selective state space model).
☆14 Updated last year
WailordHe / DenseSSM
A repository for DenseSSMs
☆88 Updated last year
BaohaoLiao / mefts
[NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning
☆31 Updated 2 years ago