ReiherGroup / CoRe_optimizer
Continual Resilient (CoRe) Optimizer for PyTorch
☆11 · Updated 9 months ago
Alternatives and similar repositories for CoRe_optimizer:
Users interested in CoRe_optimizer are comparing it to the libraries listed below.
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data. ☆17 · Updated 4 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 · Updated last year
- Unofficial PyTorch implementation of https://arxiv.org/abs/2112.05682 for linear memory cost in attention ☆12 · Updated 3 years ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 · Updated 2 years ago
- Code for "Journey to the BAOAB-limit: finding effective MCMC samplers for score-based models". See more at https://ajayj.com/journey. ☆12 · Updated 2 years ago
- Implementation of papers in 101 lines of code. ☆18 · Updated last year
- A dashboard for exploring timm learning rate schedulers ☆19 · Updated 3 months ago
- Implementation of a Light Recurrent Unit in PyTorch ☆47 · Updated 5 months ago
- Utilities for PyTorch distributed ☆23 · Updated 3 weeks ago
- JAX implementation of Learning to learn by gradient descent by gradient descent ☆27 · Updated 5 months ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 2 years ago
- Implementation of Spectral State Space Models ☆16 · Updated last year
- A JAX nn library ☆21 · Updated 3 weeks ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆57 · Updated last year
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single-machine microbatches, in PyTorch ☆23 · Updated 2 months ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 · Updated last year
- Describe the format of image/text datasets ☆11 · Updated 2 years ago
- Local Attention - Flax module for JAX ☆20 · Updated 3 years ago
- ☆8 · Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 · Updated 9 months ago
- [NeurIPS 2024, spotlight] Multivariate Learned Adaptive Noise for Diffusion Models ☆18 · Updated 3 months ago
- Implementation for ACProp (Momentum centering and asynchronous update for adaptive gradient methods, NeurIPS 2021) ☆15 · Updated 3 years ago
- ☆33 · Updated 6 months ago
- Repository for the PopulAtion Parameter Averaging (PAPA) paper ☆26 · Updated 11 months ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 · Updated last year
- Official code for the paper: "Metadata Archaeology" ☆19 · Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆24 · Updated 9 months ago