OATML / RHO-Loss
☆182 Updated last year
Related projects:
- Implementation of Estimating Training Data Influence by Tracing Gradient Descent (NeurIPS 2020) ☆214 Updated 2 years ago
- FFCV-SSL: Fast Forward Computer Vision for Self-Supervised Learning. ☆199 Updated last year
- Framework code with wandb, checkpointing, logging, configs, experimental protocols. Useful for fine-tuning models or training from scratc… ☆146 Updated last year
- Train ImageNet *fast* in 500 lines of code with FFCV ☆135 Updated 4 months ago
- Named tensors with first-class dimensions for PyTorch ☆321 Updated last year
- A fast, effective data attribution method for neural networks in PyTorch ☆170 Updated this week
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in PyTorch ☆247 Updated 2 years ago
- Code release for "Dropout Reduces Underfitting" ☆311 Updated last year
- Sequence modeling with Mega. ☆296 Updated last year
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆203 Updated last year
- An active learning library for PyTorch based on Lightning-Fabric. ☆78 Updated 4 months ago
- Unofficial JAX implementations of deep learning research papers ☆150 Updated 2 years ago
- Learning to Initialize Neural Networks for Stable and Efficient Training ☆134 Updated 2 years ago
- This repository contains the results for the paper: "Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers" ☆176 Updated 3 years ago
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆123 Updated 2 years ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆204 Updated last year
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint ☆342 Updated 5 months ago
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions ☆257 Updated 10 months ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload ☆123 Updated 2 years ago
- JAX Synergistic Memory Inspector ☆161 Updated 2 months ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in PyTorch ☆94 Updated last year
- Optimal Transport Dataset Distance ☆151 Updated 2 years ago
- A centralized place for deep thinking code and experiments ☆73 Updated last year
- Convert scikit-learn models to PyTorch modules ☆157 Updated 4 months ago
- An alternative to convolution in neural networks ☆248 Updated 5 months ago
- Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory ☆419 Updated 3 weeks ago