cybertronai / bflm
☆16Updated 5 years ago
Related projects:
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆56Updated last year
- ☆14Updated last year
- Performance Prediction for NLP Tasks☆16Updated 4 years ago
- ☆13Updated 5 years ago
- ☆13Updated 3 years ago
- ☆16Updated 10 months ago
- Standalone pre-training recipe with JAX+Flax☆31Updated last year
- Helper scripts and notes that were used while porting various nlp models☆44Updated 2 years ago
- ☆47Updated 4 years ago
- Staged Training for Transformer Language Models☆28Updated 2 years ago
- Using FlexAttention to compute attention with different masking patterns☆28Updated last week
- Pretraining summarization models using a corpus of nonsense☆13Updated 2 years ago
- ☆12Updated 2 years ago
- ☆13Updated this week
- This repository contains some of the code used in the paper "Training Language Models with Language Feedback at Scale"☆26Updated last year
- ☆46Updated 2 years ago
- Learning to Model Editing Processes☆26Updated 2 years ago
- Transformers at any scale☆39Updated 8 months ago
- lanmt ebm☆11Updated 4 years ago
- ☆19Updated last year
- Exploring Few-Shot Adaptation of Language Models with Tables☆23Updated 2 years ago
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"☆28Updated 2 years ago
- Code repo for "Transformer on a Diet" paper☆31Updated 4 years ago
- ☆92Updated last year
- Improving Neural Text Generation with Reinforcement Learning☆21Updated 3 years ago
- ☆75Updated 9 months ago
- ☆11Updated 2 years ago
- ☆31Updated last year
- A diff tool for language models☆42Updated 8 months ago
- ☆42Updated 3 years ago