mts-ai / ReplaceMe
☆35 Updated 5 months ago
Alternatives and similar repositories for ReplaceMe
Users that are interested in ReplaceMe are comparing it to the libraries listed below
- Official implementation of ECCV24 paper: POA ☆24 Updated last year
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models". ☆15 Updated 6 months ago
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆22 Updated last month
- Control LLM ☆20 Updated 7 months ago
- Official PyTorch Implementation of Self-emerging Token Labeling ☆35 Updated last year
- [ICLR 2025] Official PyTorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆27 Updated 3 months ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆18 Updated 7 months ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Updated last year
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need" ☆54 Updated 9 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆55 Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆101 Updated last year
- Code for the paper "Cottention: Linear Transformers With Cosine Attention" ☆20 Updated last week
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 6 months ago
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation) ☆44 Updated last year
- Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024) ☆30 Updated last year
- ☆19 Updated 10 months ago
- Model Merging with Functional Dual Anchors ☆33 Updated 3 weeks ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆16 Updated 8 months ago
- Code for the paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork" ☆33 Updated last year
- Autoregressive Image Generation ☆31 Updated 5 months ago
- A simple torch implementation of high-performance Multi-Query Attention ☆15 Updated 2 years ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆37 Updated last year
- ☆32 Updated last month
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆34 Updated last year
- MobileLLM-R1 ☆62 Updated last month
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models. ☆19 Updated 10 months ago
- Implementation of the paper Training Free Pretrained Model Merging (CVPR 2024). ☆32 Updated last year
- Official code for the paper "Attention as a Hypernetwork" ☆46 Updated last year
- PyTorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆25 Updated last week
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning" ☆20 Updated last year