yang-zhang / lightning-language-modeling
Language Modeling Example with Transformers and PyTorch Lightning
☆65Updated 4 years ago
Related projects
Alternatives and complementary repositories for lightning-language-modeling
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines☆132Updated last year
- A library to conduct ranking experiments with transformers.☆161Updated last year
- Shared code for training sentence embeddings with Flax / JAX☆27Updated 3 years ago
- [NAACL 2021] This is the code for our paper `Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self…☆201Updated 2 years ago
- LM Pretraining with PyTorch/TPU☆132Updated 5 years ago
- Implementation of Marge, Pre-training via Paraphrasing, in Pytorch☆75Updated 3 years ago
- A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in Pytorch☆222Updated last year
- Fine-tune transformers with pytorch-lightning☆44Updated 2 years ago
- State of the art Semantic Sentence Embeddings☆98Updated 2 years ago
- Implementation of the GBST block from the Charformer paper, in Pytorch☆117Updated 3 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models".☆187Updated 3 years ago
- [EMNLP 2021] Improving and Simplifying Pattern Exploiting Training☆153Updated 2 years ago
- Distillation of BERT model with catalyst framework☆75Updated last year
- Hyperparameter Search for AllenNLP☆134Updated 4 years ago
- Materials for the EMNLP 2020 Tutorial on "Interpreting Predictions of NLP Models"☆198Updated 3 years ago
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.☆145Updated 3 years ago
- A benchmark for understanding and evaluating rationales: http://www.eraserbenchmark.com/☆97Updated 2 years ago
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them"☆202Updated 3 years ago
- A 🤗-style implementation of BERT using lambda layers instead of self-attention☆70Updated 4 years ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale☆153Updated 11 months ago
- Efficient Attention for Long Sequence Processing☆89Updated 11 months ago
- Implementation of Mixout with PyTorch☆74Updated last year
- Selections from EMNLP 2020☆59Updated 3 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆92Updated last year
- Viewer for the 🤗 datasets library.☆83Updated 3 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…☆136Updated last year
- Code for the paper "True Few-Shot Learning in Language Models" (https://arxiv.org/abs/2105.11447)☆142Updated 3 years ago