jaisidhsingh / pytorch-mixtures
One-stop solutions for Mixture of Experts and Mixture of Depth modules in PyTorch.
☆24Updated 3 months ago
Alternatives and similar repositories for pytorch-mixtures
Users who are interested in pytorch-mixtures are comparing it to the libraries listed below.
- Sparse Autoencoders for Stable Diffusion XL models.☆69Updated last month
- [ICLR 2025] CAMEx: Curvature-Aware Merging of Experts☆22Updated 6 months ago
- Autoregressive Image Generation☆32Updated 2 months ago
- This is a PyTorch implementation of the paper ViP: A Differentially Private Foundation Model for Computer Vision☆36Updated 2 years ago
- Model Stock: All we need is just a few fine-tuned models☆122Updated 3 weeks ago
- Official implementation for Equivariant Architectures for Learning in Deep Weight Spaces [ICML 2023]☆89Updated 2 years ago
- Implementation of a multimodal diffusion transformer in Pytorch☆103Updated last year
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts☆120Updated 10 months ago
- An official PyTorch implementation for CLIPPR☆29Updated 2 years ago
- ☆23Updated 7 months ago
- This repository contains the code for our paper "Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguo…☆41Updated 2 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆77Updated last year
- A minimal implementation of a denoising diffusion model in PyTorch.☆127Updated 9 months ago
- Official code for the paper "Image generation with shortest path diffusion" accepted at ICML 2023.☆23Updated 2 years ago
- Implementation of Infini-Transformer in Pytorch☆111Updated 7 months ago
- Video descriptions of research papers relating to foundation models and scaling☆31Updated 2 years ago
- Visualizing representations with diffusion based conditional generative model.☆97Updated 2 years ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization"☆53Updated last year
- This repo contains the implementation of VQGAN, Taming Transformers for High-Resolution Image Synthesis in PyTorch from scratch. I have a…☆36Updated last year
- Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with Sparse Transformers"☆86Updated 3 weeks ago
- Reproduction of DDPO paper (RLHF for diffusion)☆90Updated last year
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆57Updated 8 months ago
- Implementation of Agent Attention in Pytorch☆91Updated last year
- ☆34Updated 5 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆96Updated 8 months ago
- [IJCAI'23] The official Github page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey".☆31Updated last year
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…☆86Updated this week
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group☆36Updated 11 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.☆93Updated 3 months ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts"☆19Updated 2 years ago