jason9693 / FROZEN
☆14 · Updated 3 years ago
Alternatives and similar repositories for FROZEN
Users who are interested in FROZEN are comparing it to the libraries listed below
- EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets ☆9 · Updated last year
- ☆13 · Updated 3 years ago
- The official code repository for MetricMT - a reward optimization method for NMT with learned metrics ☆25 · Updated 4 years ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings ☆16 · Updated 2 years ago
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch ☆76 · Updated 2 years ago
- NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021) ☆36 · Updated 4 years ago
- Official code repository of the paper Learning Associative Inference Using Fast Weight Memory by Schlag et al. ☆28 · Updated 4 years ago
- Calculating Expected Time for training LLM. ☆38 · Updated 2 years ago
- ☆29 · Updated 3 years ago
- A PyTorch Implementation of the Luna: Linear Unified Nested Attention ☆41 · Updated 4 years ago
- ☆30 · Updated last year
- ☆23 · Updated last year
- [COLM 2024] Early Weight Averaging meets High Learning Rates for LLM Pre-training ☆17 · Updated 9 months ago
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆33 · Updated 2 years ago
- This repository contains the code for paper Prompting ELECTRA Few-Shot Learning with Discriminative Pre-Trained Models. ☆48 · Updated 3 years ago
- KETOD Knowledge-Enriched Task-Oriented Dialogue ☆32 · Updated 2 years ago
- exBERT on Transformers🤗 ☆10 · Updated 4 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆99 · Updated 2 years ago
- Adding new tasks to T0 without catastrophic forgetting ☆33 · Updated 2 years ago
- Code for text augmentation method leveraging large-scale language models ☆62 · Updated 3 years ago
- DEMix Layers for Modular Language Modeling ☆53 · Updated 3 years ago
- Official repository for Automated Learning Rate Scheduler for Large-Batch Training (8th ICML Workshop on AutoML) ☆40 · Updated 3 years ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆29 · Updated 10 months ago
- Pretraining summarization models using a corpus of nonsense ☆13 · Updated 3 years ago
- ☆44 · Updated 4 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆25 · Updated 2 years ago
- Bilinear Attention Networks for Korean Visual Question Answering ☆24 · Updated last year
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆33 · Updated 2 months ago
- "Why do I feel offended?" - Korean Dataset for Offensive Language Identification (EACL2023 Findings) ☆15 · Updated 2 years ago
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 · Updated 3 years ago