kpup1710 / CAMEx
[ICLR 2025] CAMEx: Curvature-Aware Merging of Experts
☆18 · Updated last month
Alternatives and similar repositories for CAMEx:
Users who are interested in CAMEx are comparing it to the libraries listed below:
- [CVPR 2025] h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform ☆39 · Updated last week
- LibMoE: A LIBRARY FOR COMPREHENSIVE BENCHMARKING MIXTURE OF EXPERTS IN LARGE LANGUAGE MODELS ☆37 · Updated 2 months ago
- From Implicit to Explicit Feedback: A deep neural network for modeling sequential behaviours and long-short term preferences of online us… · Updated last year
- This is the public github for our paper "Transformer with a Mixture of Gaussian Keys" ☆26 · Updated 2 years ago
- Official code for the paper "Image generation with shortest path diffusion" accepted at ICML 2023. ☆23 · Updated last year
- [ICLR 2024] Official implementation of Bellman Optimal Stepsize Straightening of Flow-Matching Models ☆35 · Updated last year
- Switch EMA: A Free Lunch for Better Flatness and Sharpness ☆26 · Updated last year
- The repository of paper Personalized Multimodal Response Generation with Large Language Models ☆13 · Updated 9 months ago
- Train vector quantized CLIP models using pytorch lightning ☆18 · Updated 8 months ago
- [NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization ☆29 · Updated 6 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆37 · Updated 5 months ago
- Official PyTorch implementation of "Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets" (ICLR 2023 notable top 25%) ☆24 · Updated last year
- Code for the paper "Efficient Dataset Distillation using Random Feature Approximation" ☆37 · Updated 2 years ago
- Variance Covariance Regularization ☆14 · Updated last year
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248 ☆51 · Updated 9 months ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆53 · Updated last year
- Official code for paper "Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models, ICML2024" ☆24 · Updated 2 months ago
- [NeurIPS 2022] Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach -- Official Implementation