libeineu / ODE-Transformer
This is a code repository for the ACL 2022 paper "ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generation", which redesigns the Transformer architecture from the ODE perspective via using high-order ODE solvers to enhance the residual connections.
☆28Updated last year
Related projects: ⓘ
- Implementation for Variational Information Bottleneck for Effective Low-resource Fine-tuning, ICLR 2021☆36Updated 3 years ago
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022.☆19Updated last year
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643☆68Updated last year
- Code for Residual Energy-Based Models for Text Generation in PyTorch.☆22Updated 3 years ago
- Learning to Encode Position for Transformer with Continuous Dynamical Model☆59Updated 4 years ago
- Code for the ACL-2022 paper "StableMoE: Stable Routing Strategy for Mixture of Experts"☆41Updated 2 years ago
- [EMNLP 2022] Code for our paper “ZeroGen: Efficient Zero-shot Learning via Dataset Generation”.☆46Updated 2 years ago
- ☆28Updated last year
- ☆48Updated last year
- ☆65Updated last year
- Methods and evaluation for aligning language models temporally☆24Updated 6 months ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)☆29Updated 2 years ago
- [EMNLP 2021] Dataset and PyTorch Code for ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning☆10Updated last year
- ☆13Updated 2 years ago
- ☆25Updated 2 years ago
- ☆36Updated 5 months ago
- ☆32Updated 3 years ago
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing☆15Updated last year
- EMNLP'2022: BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation☆38Updated last year
- Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data☆56Updated 3 years ago
- [ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning☆21Updated last year
- Source code for SIGIR 2022 paper.☆15Updated 2 years ago
- Language modeling via stochastic processes. Oral @ ICLR 2022.☆134Updated last year
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling☆31Updated 2 years ago
- ☆22Updated 3 years ago
- Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)☆71Updated 2 years ago
- Repository for ACL 2022 paper Mix and Match: Learning-free Controllable Text Generation using Energy Language Models☆39Updated 2 years ago
- Implementation of QKVAE☆11Updated last year
- Code for Paper: Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data☆35Updated 3 years ago
- [EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations☆30Updated 2 years ago