MoE training for Me and You and maybe other people
☆361 · Feb 7, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for nmoe
Users who are interested in nmoe are comparing it to the libraries listed below
- Shaping capabilities with token-level pretraining data filtering ☆83 · Jan 28, 2026 · Updated last month
- Latent Large Language Models ☆19 · Aug 24, 2024 · Updated last year
- Ludic – an LLM-RL library for the era of experience ☆60 · Jan 9, 2026 · Updated last month
- https://hf.co/hexgrad/Kokoro-82M ☆14 · Jan 14, 2026 · Updated last month
- Entropy Based Sampling and Parallel CoT Decoding ☆17 · Oct 9, 2024 · Updated last year
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆858 · Feb 20, 2026 · Updated last week
- Accelerating MoE with IO and Tile-aware Optimizations ☆591 · Updated this week
- ☆17 · May 8, 2024 · Updated last year
- AI eXplainable Inference & Search. Open Sourcing on-premise, ultra-fast latency intelligence to all. ☆36 · Feb 28, 2025 · Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆201 · Jun 1, 2025 · Updated 8 months ago
- This repository contains the Parasol processor, which enables next-generation privacy preserving applications. Users can run arbitrary co… ☆11 · Updated this week
- Async RL Training at Scale ☆1,096 · Updated this week
- Entropy Based Sampling and Parallel CoT Decoding ☆3,434 · Nov 13, 2024 · Updated last year
- Official code release for "SuperBPE: Space Travel for Language Models" ☆89 · Jan 9, 2026 · Updated last month
- Triton-based Symmetric Memory operators and examples ☆85 · Jan 15, 2026 · Updated last month
- A minimal implementation of Drifting Models for 2D toy data. Unlike diffusion/flow models that iterate at inference, drifting models evo… ☆63 · Feb 13, 2026 · Updated 2 weeks ago
- H-Net Dynamic Hierarchical Architecture ☆81 · Sep 11, 2025 · Updated 5 months ago
- ☆10 · Nov 6, 2024 · Updated last year
- Synthetic Data Generation with Execution-Based Verification and Grounding for LLM Training. ☆19 · Feb 7, 2025 · Updated last year
- Minimalistic large language model 3D-parallelism training ☆2,569 · Feb 19, 2026 · Updated last week
- A graph visualization of attention ☆57 · May 20, 2025 · Updated 9 months ago
- A simple, performant and scalable Jax LLM! ☆2,148 · Updated this week
- A lattice QCD library. ☆16 · Feb 10, 2026 · Updated 2 weeks ago
- Interactive Article Explaining Isomap ☆44 · Jan 6, 2026 · Updated last month
- ☆28 · Jan 17, 2025 · Updated last year
- Benchmarking Goal-Oriented Software Engineering ☆114 · Jan 7, 2026 · Updated last month
- Evaluating the Mamba architecture on the Othello game ☆49 · Apr 25, 2024 · Updated last year
- ☆29 · Oct 24, 2025 · Updated 4 months ago
- ☆156 · Updated this week
- mHC kernels implemented in CUDA ☆252 · Jan 14, 2026 · Updated last month
- Evaluate coding agents. Like a sniff test, but it's a benchmark. ☆26 · Feb 16, 2026 · Updated last week
- watch your screen while doing sales and fill your crm automatically ☆17 · Jun 2, 2024 · Updated last year
- This repository contains code for the paper: S Bergsma, T Zeyl, JR Anaraki, L Guo, C2FAR: Coarse-to-Fine Autoregressive Networks for Prec… ☆13 · Dec 7, 2023 · Updated 2 years ago
- It's a baby compiler. (Lean btw.) ☆16 · May 19, 2025 · Updated 9 months ago
- Gauge Link Utility (GLU) is a lattice field theory library. ☆15 · May 29, 2025 · Updated 9 months ago
- ☆16 · Mar 27, 2023 · Updated 2 years ago
- Support Continual pre-training & Instruction Tuning forked from llama-recipes ☆34 · Feb 17, 2024 · Updated 2 years ago
- ☆59 · Nov 18, 2025 · Updated 3 months ago
- ☆93 · Jul 5, 2024 · Updated last year