jaisidhsingh / pytorch-mixtures
One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch.
☆24Updated last month
Alternatives and similar repositories for pytorch-mixtures
Users who are interested in pytorch-mixtures compare it to the libraries listed below.
- ☆22Updated 5 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.☆78Updated last month
- Autoregressive Image Generation☆32Updated 2 weeks ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆93Updated 6 months ago
- [ICLR 2025] CAMEx: Curvature-Aware Merging of Experts☆20Updated 3 months ago
- ☆42Updated 7 months ago
- Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.☆31Updated 4 months ago
- Video descriptions of research papers relating to foundation models and scaling☆31Updated 2 years ago
- ☆50Updated 5 months ago
- Implementation of the "Learn No to Say Yes Better" paper.☆31Updated last month
- This is a public repository for Image Clustering Conditioned on Text Criteria (IC|TC)☆88Updated last year
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n…☆43Updated 7 months ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆73Updated last year
- The repository of paper Personalized Multimodal Response Generation with Large Language Models☆14Updated 11 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆55Updated last year
- A collection of papers on discrete diffusion models☆145Updated 2 weeks ago
- Distributed Optimization Infra for learning CLIP models☆26Updated 8 months ago
- Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)☆75Updated last year
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr…☆76Updated 6 months ago
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024…☆45Updated 7 months ago
- Implementation of Infini-Transformer in PyTorch☆111Updated 5 months ago
- ☆33Updated 3 months ago
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need"☆52Updated 5 months ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts☆119Updated 8 months ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024☆60Updated 4 months ago
- ☆43Updated 5 months ago
- Implementation of a multimodal diffusion transformer in PyTorch☆102Updated last year
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆57Updated 6 months ago
- Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation☆80Updated 2 weeks ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent☆82Updated last week