kpup1710 / CAMEx
[ICLR 2025] CAMEx: Curvature-Aware Merging of Experts
☆20 Updated 3 months ago
Alternatives and similar repositories for CAMEx
Users who are interested in CAMEx are comparing it to the repositories listed below.
- LibMoE: A LIBRARY FOR COMPREHENSIVE BENCHMARKING MIXTURE OF EXPERTS IN LARGE LANGUAGE MODELS ☆40 Updated 2 weeks ago
- This is the public github for our paper "Transformer with a Mixture of Gaussian Keys" ☆28 Updated 2 years ago
- ☆63 Updated 4 months ago
- RecGPT: Generative Pre-training for Text-based Recommendation (ACL 2024) ☆33 Updated 9 months ago
- From Implicit to Explicit Feedback: A deep neural network for modeling sequential behaviours and long-short term preferences of online us… ☆1 Updated last year
- ☆21 Updated 9 months ago
- ☆16 Updated 8 months ago
- [ICLR 2024] Official implementation of Bellman Optimal Stepsize Straightening of Flow-Matching Models ☆35 Updated last year
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆40 Updated 8 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248 ☆55 Updated last year
- Awesome Learn From Model Beyond Fine-Tuning: A Survey ☆67 Updated 6 months ago
- [CVPR 2025] h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform ☆54 Updated 2 weeks ago
- [IJCAI'23] The official Github page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey". ☆31 Updated last year
- [WSDM 2024] Official PyTorch Implementation of Linear Recurrent Units for Sequential Recommendation (LRURec) ☆59 Updated 4 months ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning ☆45 Updated last year
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆31 Updated last year
- Implementation for MomentumSMoE ☆18 Updated 2 months ago
- Recycling diverse models ☆44 Updated 2 years ago
- ScrollNet for Continual Learning ☆11 Updated last year
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024) ☆31 Updated 7 months ago
- Pioneering in Vietnamese Multimodal Large Language Model ☆47 Updated 5 months ago
- ☆19 Updated 3 months ago
- ☆147 Updated 9 months ago
- ☆22 Updated 5 months ago
- A new mini-batch framework for optimal transport in deep generative models, deep domain adaptation, approximate Bayesian computation, col… ☆37 Updated 2 years ago
- Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts" ☆24 Updated last year
- ☆24 Updated 2 months ago
- One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch. ☆24 Updated last month
- ☆43 Updated 5 months ago
- Official code for the paper "Image generation with shortest path diffusion" accepted at ICML 2023. ☆23 Updated last year