some mixture of experts architecture implementations
☆26Mar 22, 2024Updated 2 years ago
Alternatives and similar repositories for MoE
Users that are interested in MoE are comparing it to the libraries listed below.
- ☆15Apr 26, 2022Updated 3 years ago
- nanoGPT-like codebase for LLM training☆116Nov 7, 2025Updated 4 months ago
- Official repository for ECCV24 paper "Diffusion-Guided Weakly Supervised Semantic Segmentation"☆18Dec 17, 2024Updated last year
- Reimplementation of https://github.com/montemac/algebraic_value_editing in pure PyTorch for efficiency on large models☆11Jun 28, 2023Updated 2 years ago
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.☆10Feb 27, 2024Updated 2 years ago
- ☆12Mar 16, 2022Updated 4 years ago
- ☆24Jan 29, 2026Updated 2 months ago
- [ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation☆30Updated this week
- Storb is a distributed storage subnet on the Bittensor network☆13Jul 28, 2025Updated 8 months ago
- PyTorch Code for the Paper: "Exploiting Uncertainty of Loss Landscape for Stochastic Optimization" [Bhaskara et al. (2019)]☆16Dec 8, 2025Updated 3 months ago
- ☆19Jun 10, 2024Updated last year
- An extension to the GaLore paper, to perform Natural Gradient Descent in low rank subspace☆18Oct 21, 2024Updated last year
- ☆14Aug 28, 2019Updated 6 years ago
- Solidity library for implementing Rain compatible interpreters.☆15Updated this week
- Federated Learning - PyTorch☆15Jun 27, 2021Updated 4 years ago
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch)☆83Feb 10, 2026Updated last month
- A Rust library for creating solvers in the OP Stack's dispute protocol☆19Jan 15, 2024Updated 2 years ago
- Beating OpenZeppelin's Ethernaut in Pure Assembly. Masochists Only.☆24May 9, 2023Updated 2 years ago
- ☆21Jan 23, 2024Updated 2 years ago
- Project showing how to develop NKI kernels for Llama 3.2 1B inference☆21May 29, 2025Updated 9 months ago
- Dynamic Telegram Trading Bot☆18Feb 21, 2025Updated last year
- General Matrix Multiplication using NVIDIA Tensor Cores☆28Jan 25, 2025Updated last year
- Starlight: A Kernel Optimizer for GPU Processing☆16Jan 10, 2024Updated 2 years ago
- Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation, ECCV2022☆31Jul 8, 2024Updated last year
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (arXiv 2401.01335)☆29Mar 1, 2024Updated 2 years ago
- An efficient implementation of learned optimizers in PyTorch☆45Dec 2, 2025Updated 3 months ago
- A collection of different ways to implement accessing and modifying internal model activations for LLMs☆20Oct 18, 2024Updated last year
- A collection of optimizers, some arcane others well known, for Flax.☆29Aug 6, 2021Updated 4 years ago
- Compression schema for gradients of activations in backward pass☆45Jul 26, 2023Updated 2 years ago
- ⛪ Sacred Compute - decentralized compute network☆39Mar 22, 2026Updated last week
- Official repository for CVPR 2023 paper: WSSS via Adversarial Learning of Classifier and Reconstructor☆29Jul 8, 2024Updated last year
- Federated learning using a mixture of experts☆17Feb 16, 2021Updated 5 years ago
- Anima Machina☆36Updated this week
- The C++ Standard Library for your entire system.☆27Mar 20, 2026Updated last week
- ☆17Dec 11, 2022Updated 3 years ago
- Parallel framework for training and fine-tuning deep neural networks☆72Nov 10, 2025Updated 4 months ago
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression.☆38Aug 29, 2025Updated 6 months ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated 2 years ago
- ☆25Mar 15, 2023Updated 3 years ago