srush / transformers-bet
☆12 Updated 3 years ago
Alternatives and similar repositories for transformers-bet:
Users who are interested in transformers-bet are comparing it to the libraries listed below.
- Exploring Few-Shot Adaptation of Language Models with Tables ☆23 Updated 2 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆11 Updated 2 years ago
- Standalone pre-training recipe with JAX+Flax ☆31 Updated 2 years ago
- interactive explorer for language models ☆9 Updated 5 years ago
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆15 Updated 3 years ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆56 Updated 2 years ago
- Suite of 500 procedurally-generated NLP tasks to study language model adaptability ☆21 Updated 2 years ago
- Variable-order CRFs with structure learning ☆16 Updated 8 months ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 Updated 2 years ago
- Embedding Recycling for Language models ☆38 Updated last year
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆49 Updated 3 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 Updated 2 years ago
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 Updated 2 years ago
- ☆38 Updated last year
- Unifew: Unified Fewshot Learning Model ☆18 Updated 3 years ago
- Code repo for "Transformer on a Diet" paper ☆31 Updated 4 years ago
- Few-shot Learning with Auxiliary Data ☆27 Updated last year
- ☆16 Updated 11 months ago
- Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch ☆12 Updated 3 years ago
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 Updated 2 years ago
- ☆38 Updated 4 years ago
- Learning to Model Editing Processes ☆26 Updated 2 years ago
- ☆13 Updated 4 years ago
- Combining encoder-based language models ☆11 Updated 3 years ago
- ☆22 Updated 3 years ago
- Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OF… ☆29 Updated 3 years ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling ☆36 Updated last year
- This repo contains code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model" ☆28 Updated 2 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆33 Updated 3 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Updated last year