A zero-to-one guide on scaling modern transformers with n-dimensional parallelism.
☆123 · Dec 29, 2025 · Updated 4 months ago
Alternatives and similar repositories for JAXformer
Users that are interested in JAXformer are comparing it to the libraries listed below.
- ☆14 · May 26, 2025 · Updated 11 months ago
- Tidy autoregressive inference in JAX ☆15 · Sep 1, 2025 · Updated 8 months ago
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation ☆14 · Jan 2, 2026 · Updated 4 months ago
- Custom triton kernels for training Karpathy's nanoGPT. ☆19 · Oct 21, 2024 · Updated last year
- An implementation of the Llama architecture, to instruct and delight ☆21 · May 31, 2025 · Updated 11 months ago
- ☆577 · Jul 11, 2024 · Updated last year
- LLMs can learn their own context compression via RL ☆42 · Nov 26, 2025 · Updated 5 months ago
- Paper implementation ☆52 · Apr 8, 2025 · Updated last year
- Minimal but scalable implementation of large language models in JAX ☆35 · Nov 28, 2025 · Updated 5 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆203 · Jun 1, 2025 · Updated 11 months ago
- A distributed end-to-end image classification system using Kubernetes ☆14 · Dec 31, 2024 · Updated last year
- A JAX implementation of stochastic addition. ☆14 · Aug 15, 2022 · Updated 3 years ago
- Reinforcement Learning example in Nim, playing tic tac toe. Based off original C version from the great Antirez ☆15 · Apr 2, 2025 · Updated last year
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 · Jul 24, 2025 · Updated 9 months ago
- ☆19 · Mar 16, 2025 · Updated last year
- Codebase for Linguistic Collapse: Neural Collapse in (Large) Language Models [NeurIPS 2024] [arXiv:2405.17767] ☆18 · Apr 14, 2025 · Updated last year
- An efficient implementation of the NSA (Native Sparse Attention) kernel ☆133 · Jun 24, 2025 · Updated 10 months ago
- Contains JAX implementation of algorithms for inverse reinforcement learning ☆78 · Aug 18, 2024 · Updated last year
- Optimize the construction of earthquake-resistant buildings ☆10 · Jul 7, 2024 · Updated last year
- ☆29 · Dec 15, 2025 · Updated 4 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆17 · Apr 22, 2025 · Updated last year
- Apple's Cut Cross Entropy ☆31 · Jan 19, 2025 · Updated last year
- Hack Club Bank CLI ☆10 · Jul 25, 2022 · Updated 3 years ago
- Community Eventing and Scripting examples ☆19 · Aug 11, 2025 · Updated 8 months ago
- ☆44 · Updated this week
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆37 · May 18, 2025 · Updated 11 months ago
- ☆10 · Apr 21, 2025 · Updated last year
- ☆10 · Feb 27, 2026 · Updated 2 months ago
- A simple molecular dynamics code in Python ☆16 · Nov 14, 2025 · Updated 5 months ago
- Utility to use ElevenLabs streaming in the command line ☆11 · Aug 8, 2023 · Updated 2 years ago
- TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods. ☆20 · Sep 24, 2025 · Updated 7 months ago
- A reliable leaderboard algorithm for machine learning competitions ☆17 · May 19, 2015 · Updated 10 years ago
- Classes and methods for Geometric Deep Learning to support Substack, LinkedIn newsletters and tutorials ☆26 · Apr 30, 2026 · Updated last week
- Neural ODE Transformers (ICLR 2025) ☆19 · Sep 6, 2025 · Updated 8 months ago
- An AI-driven resume manager that takes a human-written verbose resume and crafts it to fit a specific job role ☆14 · Jan 29, 2025 · Updated last year
- Exploitability calculation for imperfect-information game benchmarks ☆34 · Apr 5, 2025 · Updated last year
- ☆10 · May 11, 2024 · Updated last year
- Bayesian Deep Ensembles via MILE: easy to use, scikit-learn compatible and fast (JAX powered) ☆42 · Apr 1, 2026 · Updated last month
- ☆14 · Jul 12, 2024 · Updated last year