facebookresearch / adaptive_scheduling
Experimental scripts for researching data adaptive learning rate scheduling.
☆23Updated last year
Related projects
Alternatives and complementary repositories for adaptive_scheduling
- Implementation of some personal helper functions for Einops, my favorite tensor manipulation library ❤️☆52Updated last year
- Implementation of a holodeck, written in PyTorch☆17Updated last year
- code for paper "Accessing higher dimensions for unsupervised word translation"☆21Updated last year
- Source-to-Source Debuggable Derivatives in Pure Python☆14Updated 10 months ago
- Official code for the paper "Attention as a Hypernetwork"☆23Updated 5 months ago
- ☆29Updated 2 years ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings☆43Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …☆16Updated last year
- DPO, but faster 🚀☆23Updated 3 weeks ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…☆18Updated last year
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk☆46Updated last year
- Code for "Don't trust your eyes: on the (un)reliability of feature visualizations"☆31Updated last year
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene…☆30Updated 2 years ago
- Utilities for Training Very Large Models☆56Updated last month
- Implementation of the proposed Spline-Based Transformer from Disney Research☆77Updated 2 weeks ago
- FlexAttention w/ FlashAttention3 Support☆27Updated last month
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"☆37Updated last year
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.☆30Updated 4 months ago
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …☆14Updated 8 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆19Updated 3 months ago
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data.☆17Updated 3 weeks ago
- ☆31Updated 2 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]☆52Updated last month
- Using FlexAttention to compute attention with different masking patterns☆40Updated 2 months ago
- Utilities for PyTorch distributed☆23Updated last year
- Triton Implementation of HyperAttention Algorithm☆46Updated 11 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"☆53Updated this week
- SSL Video Representation Learning project☆10Updated last year
- FID computation in Jax/Flax.☆24Updated 4 months ago
- Hacks for PyTorch☆17Updated last year