TalSchuster / CATs

Confident Adaptive Transformers

☆12

Alternatives and similar repositories for CATs

Users who are interested in CATs are comparing it to the repositories listed below.

frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆58 Updated 2 years ago
microsoft / AMOS
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
☆24 Updated last year
srush / transformers-bet
☆12 Updated 3 years ago
timvieira / vocrf
Variable-order CRFs with structure learning
☆16 Updated 11 months ago
GChrysostomou / ood_faith
☆13 Updated last year
allenai / staged-training
Staged Training for Transformer Language Models
☆32 Updated 3 years ago
jungokasai / deep-shallow
☆44 Updated 4 years ago
RobertCsordas / ndr
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".
☆33 Updated last month
acmi-lab / pretraining-with-nonsense
Pretraining summarization models using a corpus of nonsense
☆13 Updated 3 years ago
xiamengzhou / NLPerf
Performance Prediction for NLP Tasks
☆16 Updated 5 years ago
sustcsonglin / gated_linear_attention_layer
☆32 Updated last year
machelreid / editpro
Learning to Model Editing Processes
☆26 Updated 3 years ago
harvardnlp / hmm-lm
☆41 Updated 4 years ago
jenni-ai / T2FW
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆19 Updated 2 years ago
frankaging / Causal-Distill
The Codebase for Causal Distillation for Language Models (NAACL '22)
☆25 Updated 3 years ago
da03 / criticize_text_generation
A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …
☆11 Updated 2 years ago
JunShern / few-shot-adaptation
Exploring Few-Shot Adaptation of Language Models with Tables
☆24 Updated 2 years ago
zomux / lanmt-ebm
lanmt ebm
☆12 Updated 5 years ago
seraphlabs-ca / SentenceMIM-demo
This repo contains code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model"
☆28 Updated 3 years ago
jxhe / sparse-text-prototype
PyTorch Implementation of NeurIPS 2020 paper "Learning Sparse Prototypes for Text Generation"
☆22 Updated 4 years ago
cliang1453 / CAMERO
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing (ACL 2022)
☆10 Updated 3 years ago
jungokasai / twist_decoding
☆29 Updated 3 years ago
Yuanhy1997 / HyPe
HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]
☆14 Updated 2 years ago
ThomasScialom / T0_continual_learning
Adding new tasks to T0 without catastrophic forgetting
☆33 Updated 2 years ago
jungokasai / beam_with_patience
☆46 Updated 3 years ago
allenai / dream
☆24 Updated 10 months ago
renll / SparseLT
[EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing
☆14 Updated 2 years ago
cliang1453 / SAGE
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)
☆30 Updated 3 years ago
Ankush7890 / ssfinetuning
A package for fine tuning of pretrained NLP transformers using Semi Supervised Learning
☆14 Updated 3 years ago
RUCAIBox / MPOP
☆13 Updated 4 years ago