PyTorch interface for TrueGrad Optimizers
☆43Aug 8, 2023Updated 2 years ago
Alternatives and similar repositories for TrueGrad
Users that are interested in TrueGrad are comparing it to the libraries listed below
- Parallel Associative Scan for Language Models☆18Jan 8, 2024Updated 2 years ago
- ☆18Aug 24, 2024Updated last year
- Test pytorch code with minimal computational overhead☆26Jun 8, 2023Updated 2 years ago
- Automatically take good care of your preemptible TPUs☆37May 15, 2023Updated 2 years ago
- ☆13Nov 27, 2025Updated 3 months ago
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit☆63Jun 21, 2023Updated 2 years ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training☆51Jan 20, 2024Updated 2 years ago
- Code repo for the paper "Semantic Correspondence via 2D-3D-2D Cycle"☆12Jan 28, 2021Updated 5 years ago
- ☆13Jan 15, 2025Updated last year
- Awesome Triton Resources☆39Apr 27, 2025Updated 10 months ago
- A toolkit for scaling law research ⚖☆57Jan 27, 2025Updated last year
- ☆33Nov 4, 2024Updated last year
- ☆14Jan 10, 2024Updated 2 years ago
- LoRA for arbitrary JAX models and functions☆145Feb 26, 2024Updated 2 years ago
- Implementation of papers in 101 lines of code.☆18Nov 12, 2023Updated 2 years ago
- reproduces experiments from "Grounding inductive biases in natural images: invariance stems from variations in data"☆17Sep 25, 2024Updated last year
- Training hybrid models for dummies.☆29Nov 1, 2025Updated 3 months ago
- This is an implementation of Image2StyleGAN embedding algorithm and various experiments using StyleGAN2-ADA as backbone.☆17Sep 2, 2021Updated 4 years ago
- AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning (Published in TMLR)☆23Oct 15, 2024Updated last year
- ☆19Dec 4, 2025Updated 2 months ago
- High performance pytorch modules☆17Jan 14, 2023Updated 3 years ago
- ☆55Nov 5, 2024Updated last year
- Code for recreating the HoS benchmark of VISOR☆22Jul 2, 2023Updated 2 years ago
- Portfolio REgret for Confidence SEquences☆20Jan 6, 2026Updated last month
- ☆23Jun 18, 2024Updated last year
- Pure C implementation of e3nn☆24Mar 17, 2025Updated 11 months ago
- Implementation of Diffusion Transformers and Rectified Flow in Jax☆27Jul 9, 2024Updated last year
- research impl of Native Sparse Attention (2502.11089)☆63Feb 19, 2025Updated last year
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization☆29Sep 12, 2024Updated last year
- JAX implementation of Learning to learn by gradient descent by gradient descent☆28Aug 5, 2025Updated 6 months ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- WIP☆94Aug 13, 2024Updated last year
- [3DV 2025] Official Implementation of paper "PIR: Photometric Inverse Rendering with Shading Cues Modeling and Surface Reflectance Regula…☆29Mar 18, 2025Updated 11 months ago
- ☆29Jul 9, 2024Updated last year
- A collection of optimizers, some arcane others well known, for Flax.☆29Aug 6, 2021Updated 4 years ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆34Feb 27, 2025Updated last year
- AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks☆29Oct 26, 2022Updated 3 years ago
- MIO: A Foundation Model on Multimodal Tokens☆34Dec 13, 2024Updated last year
- An annotated implementation of the Hyena Hierarchy paper☆34May 28, 2023Updated 2 years ago