facebookresearch / adaptive_scheduling

Experimental scripts for researching data adaptive learning rate scheduling.

☆23

Alternatives and similar repositories for adaptive_scheduling

Users who are interested in adaptive_scheduling are comparing it to the repositories listed below.

eth-easl / fmengine
Utilities for Training Very Large Models
☆58 Updated 10 months ago
crypdick / timm-lr-scheduler-explorer
A dashboard for exploring timm learning rate schedulers
☆19 Updated 8 months ago
facebookresearch / coocmap
code for paper "Accessing higher dimensions for unsupervised word translation"
☆21 Updated 2 years ago
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆14 Updated 2 weeks ago
simonsanvil / DALL-E-Explained
Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene…
☆33 Updated 2 years ago
RobertCsordas / linear_layer_as_attention
The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …
☆16 Updated last month
kiddyboots216 / lottery-ticket-adaptation
Lottery Ticket Adaptation
☆39 Updated 8 months ago
smonsays / hypernetwork-attention
Official code for the paper "Attention as a Hypernetwork"
☆40 Updated last year
shreyansh26 / Attention-Mask-Patterns
Using FlexAttention to compute attention with different masking patterns
☆44 Updated 10 months ago
amirzandieh / HyperAttention
Triton Implementation of HyperAttention Algorithm
☆48 Updated last year
lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆44 Updated 2 years ago
lucidrains / sinkhorn-router-pytorch
Self contained pytorch implementation of a sinkhorn based router, for mixture of experts or otherwise
☆37 Updated 11 months ago
huggingface / pixparse
Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data
☆21 Updated last year
gregorbachmann / scaling_mlps
☆51 Updated last year
GenRobo / MatMamba
Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"
☆60 Updated 8 months ago
lucidrains / einops-exts
Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️
☆55 Updated 2 years ago
lucidrains / transformer-lm-gan
Explorations into adversarial losses on top of autoregressive loss for language modeling
☆37 Updated 5 months ago
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15 Updated last year
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆55 Updated 11 months ago
facebookresearch / ViP-MAE
This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"
☆36 Updated 2 years ago
lucidrains / self-reasoning-tokens-pytorch
Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto
☆56 Updated last year
cloneofsimo / zeroshampoo
☆34 Updated 10 months ago
prateeky2806 / ComPEFT
☆26 Updated last year
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆100 Updated 11 months ago
BlinkDL / LinearAttentionArena
Here we will test various linear attention designs.
☆62 Updated last year
huggingface / peft-pytorch-conference
Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…
☆14 Updated last year
graphcore-research / jax-scalify
JAX Scalify: end-to-end scaled arithmetics
☆16 Updated 9 months ago
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 Updated last year
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆18 Updated 6 months ago
lucidrains / tableformer-pytorch
Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch
☆39 Updated 3 years ago