Supporting code for the blog post on modular manifolds.
☆121 · Sep 26, 2025 · Updated 7 months ago
Alternatives and similar repositories for manifolds
Users who are interested in manifolds are comparing it to the libraries listed below.
- Code for implementing central flows ☆44 · Sep 5, 2025 · Updated 7 months ago
- Code for the paper "Function-Space Learning Rates" ☆25 · Jun 3, 2025 · Updated 10 months ago
- Code for "What really matters in matrix-whitening optimizers?" ☆23 · Oct 31, 2025 · Updated 5 months ago
- ☆19 · Dec 4, 2025 · Updated 4 months ago
- ☆33 · Oct 4, 2024 · Updated last year
- Schedule free optimiser implemented in JAX using Optimistix ☆15 · May 29, 2024 · Updated last year
- ☆36 · Feb 26, 2024 · Updated 2 years ago
- Parallel Associative Scan for Language Models ☆18 · Jan 8, 2024 · Updated 2 years ago
- ☆124 · May 28, 2024 · Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆25 · Jun 6, 2024 · Updated last year
- 🧱 Modula software package ☆327 · Aug 18, 2025 · Updated 8 months ago
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- train with kittens! ☆64 · Oct 25, 2024 · Updated last year
- diffusers with search engine ☆12 · Jan 13, 2026 · Updated 3 months ago
- [Poster; ICLR 2026] [Oral; Neurips OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers ☆16 · Apr 15, 2026 · Updated 2 weeks ago
- ☆69 · Mar 21, 2025 · Updated last year
- ☆26 · Feb 20, 2026 · Updated 2 months ago
- [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models ☆36 · Nov 4, 2025 · Updated 5 months ago
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton. ☆75 · Aug 2, 2024 · Updated last year
- Accelerated First Order Parallel Associative Scan ☆197 · Jan 7, 2026 · Updated 3 months ago
- Code release for "FlashBias: Fast Computation of Attention with Bias" (NeurIPS 2025), https://arxiv.org/abs/2505.12044 ☆28 · Nov 17, 2025 · Updated 5 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆110 · Oct 11, 2025 · Updated 6 months ago
- JAX Scalify: end-to-end scaled arithmetics ☆18 · Oct 30, 2024 · Updated last year
- Train toy models using multi-token prediction objective ☆14 · Apr 18, 2026 · Updated last week
- Grokking on modular arithmetic in less than 150 epochs in MLX ☆15 · Oct 24, 2024 · Updated last year
- Experiment of using Tangent to autodiff triton ☆82 · Jan 22, 2024 · Updated 2 years ago
- ☆20 · May 30, 2024 · Updated last year
- A library for unit scaling in PyTorch ☆133 · Jul 11, 2025 · Updated 9 months ago
- Unofficial implementation of the paper "Exploring the Space of Key-Value-Query Models with Intention" ☆12 · May 24, 2023 · Updated 2 years ago
- Odysseus: Playground of LLM Sequence Parallelism ☆78 · Jun 17, 2024 · Updated last year
- ☆22 · Dec 15, 2023 · Updated 2 years ago
- Triton-based implementation of Sparse Mixture of Experts. ☆273 · Oct 3, 2025 · Updated 6 months ago
- Combining SOAP and MUON ☆20 · Feb 11, 2025 · Updated last year
- ☆45 · Nov 1, 2025 · Updated 5 months ago
- ☆13 · Jun 3, 2024 · Updated last year
- ☆15 · Dec 5, 2019 · Updated 6 years ago
- ☆114 · Aug 26, 2024 · Updated last year
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 · Mar 15, 2024 · Updated 2 years ago
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258) ☆48 · Jan 6, 2026 · Updated 3 months ago