naver-ai / burn
Official PyTorch Implementation of Unsupervised Representation Learning for Binary Networks by Joint Classifier Training (CVPR 2022)
☆11 · Updated 3 years ago
Alternatives and similar repositories for burn
Users interested in burn are comparing it to the libraries listed below.
- ☆18 · Updated 2 years ago
- Learning Features with Parameter-Free Layers, ICLR 2022 ☆84 · Updated 2 years ago
- Official repository for Automated Learning Rate Scheduler for Large-Batch Training (8th ICML Workshop on AutoML) ☆40 · Updated 3 years ago
- Official repository for Fourier model that can generate periodic signals ☆10 · Updated 3 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning ☆12 · Updated last week
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP … ☆16 · Updated last year
- ☆19 · Updated 2 years ago
- [ICLR 2023] RC-MAE ☆53 · Updated last year
- ☆14 · Updated 3 years ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts" ☆19 · Updated 2 years ago
- Official PyTorch implementation for GeNAS: Neural Architecture Search with Better Generalization ☆16 · Updated 2 years ago
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT ☆55 · Updated last year
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp… ☆21 · Updated last year
- A variant of Transformer-XL where the memory is updated not with a queue, but with attention ☆49 · Updated 5 years ago
- [ICLR 2021] "Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning" by Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, S… ☆25 · Updated 3 years ago
- ImageNet-12k subset of ImageNet-21k (fall11) ☆21 · Updated 2 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 3 years ago
- ☆21 · Updated 2 years ago
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha… ☆69 · Updated 3 years ago
- DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training (ICLR 2023) ☆31 · Updated 2 years ago
- Official PyTorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR … ☆53 · Updated 2 years ago
- Code for the paper "What Makes Better Augmentation Strategies? Augment Difficult but Not too Different" (ICLR 22) ☆12 · Updated 2 years ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆31 · Updated 2 years ago
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in PyTorch ☆39 · Updated 3 years ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 · Updated 2 months ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling ☆40 · Updated last year
- ☆45 · Updated last year
- [COLM 2024] Early Weight Averaging meets High Learning Rates for LLM Pre-training ☆17 · Updated 10 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆81 · Updated 2 years ago
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models ☆21 · Updated 3 years ago