TorchJD / torchjd
Library for Jacobian descent with PyTorch. It enables optimization of neural networks with multiple losses (e.g. multi-task learning).
☆207 · Updated this week
Alternatives and similar repositories for torchjd:
Users who are interested in torchjd are comparing it to the libraries listed below:
- Efficient optimizers ☆169 · Updated this week
- ☆159 · Updated 2 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆205 · Updated this week
- Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate" ☆417 · Updated 2 months ago
- supporting pytorch FSDP for optimizers ☆76 · Updated 2 months ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆178 · Updated 5 months ago
- ☆149 · Updated 6 months ago
- D-Adaptation for SGD, Adam and AdaGrad ☆513 · Updated 3 weeks ago
- TensorHue is a Python library that allows you to visualize tensors right in your console, making understanding and debugging tensor conte… ☆115 · Updated this week
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 · Updated 2 years ago
- For optimization algorithm research and development. ☆491 · Updated this week
- 🧱 Modula software package ☆146 · Updated this week
- When it comes to optimizers, it's always better to be safe than sorry ☆179 · Updated 3 weeks ago
- Muon optimizer: +~30% sample efficiency with <3% wallclock overhead ☆254 · Updated last week
- optimizer & lr scheduler & loss function collections in PyTorch ☆270 · Updated this week
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆83 · Updated last week
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam ☆73 · Updated 6 months ago
- ☆163 · Updated 2 years ago
- Easy Hypernetworks in Pytorch and Jax ☆97 · Updated 2 years ago
- ☆52 · Updated 4 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆122 · Updated last year
- ☆36 · Updated last year
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆120 · Updated 6 months ago
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆85 · Updated last year
- Implementation of the proposed minGRU in Pytorch ☆279 · Updated last week
- Code for our NeurIPS 2022 paper ☆366 · Updated 2 years ago
- [ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule ☆131 · Updated last week
- Accelerated First Order Parallel Associative Scan ☆171 · Updated 6 months ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆102 · Updated 2 months ago
- Scalable and Performant Data Loading ☆219 · Updated this week