princeton-nlp / DinkyTrain
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration
★114 · Updated 2 years ago
Alternatives and similar repositories for DinkyTrain:
- ★48 · Updated 11 months ago
- DEMix Layers for Modular Language Modeling ★53 · Updated 3 years ago
- ★85 · Updated 2 years ago
- The original Backpack Language Model implementation, a fork of FlashAttention ★66 · Updated last year
- ★30 · Updated last year
- reStructured Pre-training ★98 · Updated 2 years ago
- Code for the ACL 2023 paper "Pre-Training to Learn in Context" ★108 · Updated 8 months ago
- ★61 · Updated 2 years ago
- [NeurIPS'22 Spotlight] Data and code for our paper "CoNT: Contrastive Neural Text Generation" ★152 · Updated last year
- [NeurIPS 2022] "A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models", Yuanxin Liu, Fandong Meng, Zheng Lin, Jiangnan Li… ★21 · Updated last year
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning" ★64 · Updated last year
- ★116 · Updated 2 years ago
- A unified benchmark for math reasoning ★87 · Updated 2 years ago
- The official repository for "Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts" (EMNLP 2022) ★100 · Updated 2 years ago
- Code for the ACL 2023 paper "Lifting the Curse of Capacity Gap in Distilling Language Models" ★28 · Updated last year
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ★98 · Updated last year
- Code for "Editing Factual Knowledge in Language Models" ★136 · Updated 3 years ago
- Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge" ★77 · Updated last year
- Code for the ACL 2022 paper "StableMoE: Stable Routing Strategy for Mixture of Experts" ★45 · Updated 2 years ago
- FairSeq repo with the Apollo optimizer ★112 · Updated last year
- ★78 · Updated 2 years ago
- Contrastive decoding ★197 · Updated 2 years ago
- A framework for few-shot evaluation of autoregressive language models ★103 · Updated last year
- [NeurIPS 2022] Generating Training Data with Language Models: Towards Zero-Shot Language Understanding ★64 · Updated 2 years ago
- The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences" ★69 · Updated last year
- EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction (https://arxiv.org/abs/2210.11560) ★58 · Updated last month
- Code for the AAAI 2021 paper "A Theoretical Analysis of the Repetition Problem in Text Generation" ★52 · Updated 2 years ago
- Implementation of the ICML 2023 paper "Specializing Smaller Language Models towards Multi-Step Reasoning" ★130 · Updated last year
- Automatic metrics for GEM tasks ★65 · Updated 2 years ago
- Code for the paper "Data-Efficient FineTuning" ★29 · Updated last year