A zero-to-one guide on scaling modern transformers with n-dimensional parallelism.
☆122Dec 29, 2025Updated 3 months ago
Alternatives and similar repositories for JAXformer
Users that are interested in JAXformer are comparing it to the libraries listed below.
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation☆14Jan 2, 2026Updated 3 months ago
- Minimal, predictable, footgun-free config library.☆42Updated this week
- ☆52Mar 14, 2025Updated last year
- A collection of open-source large language model (LLM) implementations in JAX & Flax☆24Apr 1, 2025Updated last year
- ☆78Feb 18, 2026Updated 2 months ago
- ☆14Dec 11, 2018Updated 7 years ago
- Fast, simple, cryptographically strong random numbers in C++. Experimental.☆19Dec 12, 2013Updated 12 years ago
- llms can learn their own context compression via RL☆42Nov 26, 2025Updated 4 months ago
- Paper implementation☆52Apr 8, 2025Updated last year
- Generic library for neural collapse and several derivative works on the phenomenon.☆18Apr 14, 2025Updated last year
- Website for CSE 234, Winter 2025☆13Mar 24, 2025Updated last year
- Frechet inception distance (FID) evaluation in JAX☆14May 28, 2024Updated last year
- A JAX implementation of stochastic addition.☆14Aug 15, 2022Updated 3 years ago
- Tensor Parallelism with JAX + Shard Map☆11Sep 29, 2023Updated 2 years ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.☆19Jul 24, 2025Updated 8 months ago
- ☆19Mar 16, 2025Updated last year
- Machine learning algorithms implemented in JAX for production use on large-scale datasets.☆15Apr 8, 2026Updated last week
- Codebase for Linguistic Collapse: Neural Collapse in (Large) Language Models [NeurIPS 2024] [arXiv:2405.17767]☆18Apr 14, 2025Updated last year
- An efficient implementation of the NSA (Native Sparse Attention) kernel☆133Jun 24, 2025Updated 9 months ago
- ☆29Dec 15, 2025Updated 4 months ago
- A list of active Hack Club open source repositories with available issues on GitHub☆14Sep 10, 2025Updated 7 months ago
- 100 days of building GPU kernels!☆593Apr 27, 2025Updated 11 months ago
- Apple's Cut Cross Entropy☆30Jan 19, 2025Updated last year
- Code and report for APMA136 Final Project☆19May 6, 2015Updated 10 years ago
- [ICML 2025] Repository for M3-JEPA: Multimodal Alignment via Multi-gate MoE based on the Joint-Predictive Embedding Architecture☆26Mar 13, 2026Updated last month
- Community Eventing and Scripting examples☆19Aug 11, 2025Updated 8 months ago
- Classes and methods for Geometric Deep Learning to support Substack, LinkedIn newsletters and tutorials☆25Mar 21, 2026Updated 3 weeks ago
- ☆44Updated this week
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆37May 18, 2025Updated 11 months ago
- Analyze and model weekly calendar distributions using latent components☆21Apr 6, 2026Updated last week
- Utility to use ElevenLabs streaming in the command line☆11Aug 8, 2023Updated 2 years ago
- A simple molecular dynamics code in python☆16Nov 14, 2025Updated 5 months ago
- TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods.☆21Sep 24, 2025Updated 6 months ago
- A reliable leaderboard algorithm for machine learning competitions☆17May 19, 2015Updated 10 years ago
- ☆10Oct 22, 2024Updated last year
- Neural ODE Transformers (ICLR 2025)☆18Sep 6, 2025Updated 7 months ago
- ☆13Jul 12, 2024Updated last year
- Numerical relativity surrogate waveform in Jax☆19Aug 14, 2025Updated 8 months ago
- Python interface for COSMO.jl convex optimisation solver.☆17Sep 27, 2021Updated 4 years ago