jaisidhsingh / pytorch-mixtures
One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch.
☆22 Updated 3 weeks ago
Alternatives and similar repositories for pytorch-mixtures
Users who are interested in pytorch-mixtures compare it to the libraries listed below.
- ☆22 Updated 4 months ago
- Implementation of the "Learn No to Say Yes Better" paper. ☆31 Updated last week
- Transmute AI Lab Model Efficiency Toolkit ☆19 Updated last year
- The official repo of continuous speculative decoding ☆26 Updated last month
- ☆10 Updated last year
- Sparse Autoencoders for Stable Diffusion XL models. ☆59 Updated last month
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision" ☆36 Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆91 Updated 5 months ago
- ☆28 Updated 3 months ago
- Minimal Implementation of Visual Autoregressive Modelling (VAR) ☆33 Updated last month
- Official PyTorch Implementation of "Rosetta Neurons: Mining the Common Units in a Model Zoo" ☆30 Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆26 Updated 6 months ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆72 Updated last year
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆42 Updated 6 months ago
- Official code for infimm-hd ☆16 Updated 8 months ago
- Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback ☆26 Updated last year
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need" ☆51 Updated 3 months ago
- Repository for ACM India Summer School on Generative AI for Text ☆12 Updated 10 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025] ☆54 Updated last month
- ☆41 Updated 6 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆54 Updated 8 months ago
- Synthetic Alphabet Dataset ☆18 Updated last month
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models". ☆102 Updated 11 months ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆30 Updated last year
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 Updated 8 months ago
- Code release for paper Extremely Simple Activation Shaping for Out-of-Distribution Detection ☆52 Updated 8 months ago
- An official PyTorch implementation for CLIPPR ☆29 Updated last year
- [ICLR 2025] CAMEx: Curvature-Aware Merging of Experts ☆19 Updated 2 months ago
- Switch EMA: A Free Lunch for Better Flatness and Sharpness ☆26 Updated last year
- ☆85 Updated last year