tilde-research/momoe-release

tilde-research / momoe-release

Memory optimized Mixture of Experts

☆78

Alternatives and similar repositories for momoe-release

Users who are interested in momoe-release are comparing it to the repositories listed below.

tilde-research / nsa-release
An efficient implementation of the NSA (Native Sparse Attention) kernel
☆133 · Jun 24, 2025 · Updated last year
tilde-research / activault
Engine for collecting, uploading, and downloading model activations
☆30 · Apr 2, 2025 · Updated last year
RadicalNumerics / spear
Structured Primitives for Efficient Architecture Research
☆20 · Dec 22, 2025 · Updated 6 months ago
lodestone-rock / integrate-the-integrator
a simple exploratory repo for low step flow model
☆17 · Aug 7, 2025 · Updated 11 months ago
lcy-seso / DLFrameworkTest
My tests and experiments with some popular dl frameworks.
☆17 · Sep 11, 2025 · Updated 10 months ago
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆336 · Mar 31, 2026 · Updated 3 months ago
tilde-research / aurora-release
Aurora optimizer release
☆150 · Updated this week
goodevening13 / aquakv
☆21 · Apr 27, 2026 · Updated 2 months ago
XuezheMax / NeuroNLP
Deep neural models for core NLP tasks
☆13 · Nov 9, 2017 · Updated 8 years ago
eth-easl / mixtera
A lightweight, user-friendly data-plane for LLM training.
☆40 · Sep 10, 2025 · Updated 10 months ago
XuezheMax / gecko-llm
Gecko Architecture
☆16 · Jan 13, 2026 · Updated 6 months ago
EricLBuehler / candle_graphs
Graph model execution API for Candle
☆18 · Jul 27, 2025 · Updated 11 months ago
thinking-machines-lab / manifolds
Supporting code for the blog post on modular manifolds.
☆125 · Sep 26, 2025 · Updated 9 months ago
Dao-AILab / sonic-moe
Accelerating MoE with IO and Tile-aware Optimizations
☆731 · Jul 4, 2026 · Updated 2 weeks ago
tilde-research / one-layer-deeper
☆29 · Updated this week
Dao-AILab / quack
A Quirky Assortment of CuTe Kernels
☆1,060 · Updated this week
cherichy / tilecute
☆32 · Jul 2, 2025 · Updated last year
nikhilvyas / SOAP_MUON
Combining SOAP and MUON
☆22 · Feb 11, 2025 · Updated last year
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆112 · Mar 7, 2025 · Updated last year
tile-ai / tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
☆19 · Jul 13, 2026 · Updated last week
ckkissane / sae-transfer
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated 2 years ago
kaistAI / GAP
[ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization
☆29 · Sep 12, 2024 · Updated last year
vaguenebula / AlpacaDataReflect
An experiment to see if chatgpt can improve the output of the stanford alpaca dataset
☆12 · Mar 29, 2023 · Updated 3 years ago
tim-lawson / mlsae
Multi-Layer Sparse Autoencoders (ICLR 2025)
☆30 · Feb 6, 2026 · Updated 5 months ago
shangshang-wang / Resa
Resa: Transparent Reasoning Models via SAEs
☆50 · Sep 23, 2025 · Updated 9 months ago
smedegaard / hip_rs
A rust wrapper for HIP
☆13 · Jun 10, 2025 · Updated last year
dayal-kalra / low-memory-adam
☆14 · Mar 2, 2025 · Updated last year
wenhaochai / claude-plugins
Personal Claude Code plugin marketplace
☆16 · Jul 4, 2026 · Updated 2 weeks ago
Infini-AI-Lab / vortex_torch
Vortex: Programmable Sparse Attention for Agents as Algorithm Designers
☆67 · Jun 24, 2026 · Updated 3 weeks ago
kubernetes-bad / reward-composer
Lego for GRPO
☆30 · May 27, 2025 · Updated last year
duykhuongnguyen / MAT-Steer
☆21 · Aug 19, 2025 · Updated 11 months ago
tile-ai / tilelang-benchmark
☆22 · Jun 10, 2026 · Updated last month
peytontolbert / simple-moe
Simple MoE - Day 17 of 365 Days of Repos
☆20 · Jun 2, 2026 · Updated last month
chenwanqq / candle-llava
implement llava using candle
☆15 · Jun 9, 2024 · Updated 2 years ago
ljt019 / candle-pipelines
Candle Pipelines provides a simple, intuitive interface for Rust developers who want to work with Large Language Models locally, powered …
☆23 · Jan 5, 2026 · Updated 6 months ago
fla-org / flame
🔥 A minimal training framework for scaling FLA models
☆403 · Apr 22, 2026 · Updated 2 months ago
Apsu / flue
Fast, Lightweight, Unified Engine for Text2Image Diffusion Models
☆20 · Apr 13, 2025 · Updated last year
OliverSieberling / dynamic-conv1d
Triton kernels for dynamic causal short convolutions.
☆24 · Jun 4, 2026 · Updated last month
infinigence / FUSCO
High-performance distributed data shuffling (all-to-all) library for MoE training and inference
☆123 · Mar 7, 2026 · Updated 4 months ago