MoE training for Me and You and maybe other people
☆392Mar 15, 2026Updated 3 months ago
Alternatives and similar repositories for nmoe
Users that are interested in nmoe are comparing it to the libraries listed below.
- Ludic – an LLM-RL library for the era of experience☆66Jan 9, 2026Updated 5 months ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- Shaping capabilities with token-level pretraining data filtering☆94Jan 28, 2026Updated 5 months ago
- Accelerating MoE with IO and Tile-aware Optimizations☆720Updated this week
- Agentic RL Training at Scale☆1,533Jun 24, 2026Updated last week
- Benchmarking Goal-Oriented Software Engineering☆175Updated this week
- Entropy Based Sampling and Parallel CoT Decoding☆17Oct 9, 2024Updated last year
- ☆120Apr 7, 2026Updated 2 months ago
- Karpathy's llama2.c transpiled to MLX for Apple Silicon☆14Dec 28, 2023Updated 2 years ago
- Triton-based Symmetric Memory operators and examples☆103May 15, 2026Updated last month
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse …☆1,311Jun 22, 2026Updated last week
- ☆17May 8, 2024Updated 2 years ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆205Jun 1, 2025Updated last year
- PyTorch-native post-training at scale☆688Jun 23, 2026Updated last week
- ☆29Jan 17, 2025Updated last year
- Minimalistic large language model 3D-parallelism training☆2,729May 26, 2026Updated last month
- H-Net Dynamic Hierarchical Architecture☆81Sep 11, 2025Updated 9 months ago
- Entropy Based Sampling and Parallel CoT Decoding☆3,433Nov 13, 2024Updated last year
- supporting pytorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- A graph visualization of attention☆56May 20, 2025Updated last year
- AI eXplainable Inference & Search. Open Sourcing on-premise, ultra-fast latency intelligence to all.☆37Feb 28, 2025Updated last year
- A simple, performant and scalable Jax LLM!☆2,338Updated this week
- ☆14Apr 16, 2025Updated last year
- ☆93Jul 5, 2024Updated last year
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers)☆35Mar 20, 2026Updated 3 months ago
- Educational WIP☆72Feb 16, 2026Updated 4 months ago
- Repository for "Training Language Models To Explain Their Own Computations"☆22Dec 22, 2025Updated 6 months ago
- ☆129Jun 11, 2025Updated last year
- Expand -> Retrieve -> Rerank - simple method with strong results on BRIGHT benchmark☆22Aug 22, 2025Updated 10 months ago
- Our library for RL environments + evals☆4,233Updated this week
- Open Character Training☆89Apr 4, 2026Updated 2 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers☆32Mar 1, 2025Updated last year
- Official code release for "SuperBPE: Space Travel for Language Models"☆93May 28, 2026Updated last month
- ☆10Nov 6, 2024Updated last year
- FlexAttention w/ FlashAttention3 Support☆27Oct 5, 2024Updated last year
- Letting Claude Code develop his own MCP tools :)☆121Mar 8, 2025Updated last year
- Triton Implementation of HyperAttention Algorithm☆48Dec 11, 2023Updated 2 years ago
- mHC kernels implemented in CUDA☆267Mar 9, 2026Updated 3 months ago
- Manage ML configuration with pydantic☆16Mar 18, 2026Updated 3 months ago