The Codebase for Causal Distillation for Language Models (NAACL '22)
☆26May 1, 2022Updated 4 years ago
Alternatives and similar repositories for Causal-Distill
Users that are interested in Causal-Distill are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing☆14Feb 10, 2023Updated 3 years ago
- This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper☆14Aug 9, 2021Updated 4 years ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"