ReiherGroup / CoRe_optimizer
Continual Resilient (CoRe) Optimizer for PyTorch
☆10 Updated 11 months ago
Alternatives and similar repositories for CoRe_optimizer
Users who are interested in CoRe_optimizer are comparing it to the libraries listed below.
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data. ☆18 Updated 7 months ago
- Implementation of papers in 101 lines of code. ☆18 Updated last year
- Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch ☆12 Updated 3 years ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated 2 years ago
- FLOPS counter for all your GPU benchmarking needs ☆13 Updated 10 months ago
- Utilities for PyTorch distributed ☆24 Updated 3 months ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆50 Updated 3 years ago
- ☆16 Updated 2 years ago
- Code for "Journey to the BAOAB-limit: finding effective MCMC samplers for score-based models". See more at https://ajayj.com/journey. ☆12 Updated 2 years ago
- Explorations into adversarial losses on top of autoregressive loss for language modeling ☆36 Updated 3 months ago
- A dashboard for exploring timm learning rate schedulers ☆19 Updated 6 months ago
- ☆44 Updated last year
- Implementation of a Light Recurrent Unit in PyTorch ☆47 Updated 8 months ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 Updated last year
- JAX implementation of "Learning to learn by gradient descent by gradient descent" ☆27 Updated 7 months ago
- ☆34 Updated 8 months ago
- ☆15 Updated 6 months ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR. ☆26 Updated 2 years ago
- Describe the format of image/text datasets ☆11 Updated 3 years ago
- Generative Equilibrium Transformer ☆18 Updated last year
- Modified Score-Entropy-Discrete-Diffusion to do a character-level ML model and integrate with Oxen ☆14 Updated last year
- Code for the paper "Cottention: Linear Transformers With Cosine Attention" ☆17 Updated 7 months ago
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec" ☆12 Updated 4 months ago
- Memory-efficient optimum einsum using opt_einsum planning and PyTorch kernels. ☆15 Updated 2 years ago
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk ☆46 Updated last year
- ☆8 Updated last year
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 Updated 3 years ago
- A JAX nn library ☆21 Updated 3 months ago
- Local Attention - Flax module for Jax ☆22 Updated 4 years ago
- ☆32 Updated last year