☆180Apr 24, 2026Updated this week
Alternatives and similar repositories for Emerging-Optimizers
Users that are interested in Emerging-Optimizers are comparing it to the libraries listed below.
- 4-bit Shampoo for Memory-Efficient Network Training (NeurIPS 2024)☆13Feb 13, 2025Updated last year
- Official implementation for the paper "Controlled Sparsity via Constrained Optimization"☆12Aug 10, 2022Updated 3 years ago
- Combining SOAP and MUON☆20Feb 11, 2025Updated last year
- Physics, mathematics and computer science notes, mostly in Mandarin Chinese, some in English.☆12Updated this week
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆19Jul 24, 2025Updated 9 months ago
- ☆62Apr 8, 2026Updated 3 weeks ago
- Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality☆339Jan 5, 2026Updated 3 months ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation☆98Apr 7, 2026Updated 3 weeks ago
- Implementation of 2-simplicial attention proposed by Clift et al. (2019) and the recent attempt to make practical in Fast and Simplex, Ro…☆47Sep 2, 2025Updated 7 months ago
- Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.☆186Mar 17, 2026Updated last month
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective☆65Mar 11, 2025Updated last year
- Minimal pretraining script for language modeling in PyTorch. Supporting torch compilation and DDP. It includes a model implementation and…☆47Mar 16, 2026Updated last month
- Dion optimizer algorithm☆467Updated this week
- VeLO optimizer in PyTorch☆20Feb 6, 2023Updated 3 years ago
- A library for exporting models including NeMo and Hugging Face to optimized inference backends, and deploying them for efficient querying☆33Apr 23, 2026Updated last week
- Efficient optimizers☆314Updated this week
- ☆24Mar 1, 2022Updated 4 years ago
- ☆265Dec 2, 2024Updated last year
- 💻 Terminal-Agent with Human-in-the-Loop Learning☆39Jan 16, 2026Updated 3 months ago
- ☆29Apr 6, 2026Updated 3 weeks ago
- Materials for HSE course "Applied Statistics in Machine Learning" taught during 2018.☆21Mar 21, 2024Updated 2 years ago
- ☆28Mar 30, 2026Updated last month
- [NeurIPS 2023] and [ICLR 2024] for robustness certification.☆10Nov 30, 2024Updated last year
- All-in-one benchmarking platform for evaluating LLM.☆15Nov 12, 2025Updated 5 months ago
- The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1☆68Apr 8, 2026Updated 3 weeks ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)☆81Aug 30, 2023Updated 2 years ago
- Minimal but scalable implementation of large language models in JAX☆35Nov 28, 2025Updated 5 months ago
- Accelerated Bregman Proximal Gradient Methods☆29Jun 12, 2023Updated 2 years ago
- ☆27Mar 15, 2023Updated 3 years ago
- implementation of https://arxiv.org/pdf/2312.09299☆21Jul 3, 2024Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆704Jan 26, 2026Updated 3 months ago
- This repository gathers learning resources about performance estimation problems☆15Mar 18, 2026Updated last month
- Quantized LLM training in pure CUDA/C++.☆243Mar 6, 2026Updated last month
- Flash-Muon: An Efficient Implementation of Muon Optimizer☆248Jun 15, 2025Updated 10 months ago
- My fork of Allen AI's OLMo for educational purposes.☆28Dec 5, 2024Updated last year
- Supporting PyTorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- ☆23Oct 10, 2025Updated 6 months ago
- A repo based on XiLin Li's PSGD repo that extends some of the experiments.☆14Oct 7, 2024Updated last year
- Libraries for efficient and scalable group-structured dataset pipelines.☆25Jun 18, 2025Updated 10 months ago