pytorch-tpu / transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
★17 · Updated last month
Alternatives and similar repositories for transformers
Users interested in transformers are comparing it to the libraries listed below
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ★17 · Updated last year
- ★100 · Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should work with any Hugging Face text dataset. ★93 · Updated 2 years ago
- Calculating the expected time for training an LLM. ★38 · Updated 2 years ago
- A minimal PyTorch Lightning OpenAI GPT with DeepSpeed training! ★112 · Updated 2 years ago
- PyTorch implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks ★63 · Updated 3 years ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022). ★44 · Updated 2 years ago
- This repository contains the code for the paper "Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models". ★48 · Updated 3 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ★58 · Updated 2 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ★98 · Updated 2 years ago
- [TMLR'23] Contrastive Search Is What You Need For Neural Text Generation (see the decoding sketch after this list) ★119 · Updated 2 years ago
- Long-context pretrained encoder-decoder models ★95 · Updated 2 years ago
- PyTorch reimplementation of REALM and ORQA ★22 · Updated 3 years ago
- Train Dense Passage Retriever (DPR) with a single GPU ★133 · Updated 4 years ago
- Script to pre-train Hugging Face Transformers BART with TensorFlow 2 ★33 · Updated 2 years ago
- Implementation of a stop sequencer for Hugging Face Transformers (see the sketch after this list) ★16 · Updated 2 years ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ★116 · Updated 3 weeks ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ★82 · Updated 10 months ago
- ★72 · Updated 2 years ago
- Train 🤗 Transformers with DeepSpeed: ZeRO-2, ZeRO-3 (a config sketch follows this list) ★23 · Updated 4 years ago
- ★10 · Updated 2 years ago
- Megatron-LM 11B on Hugging Face Transformers ★28 · Updated 4 years ago
- ★19 · Updated 2 years ago
- Tools for managing datasets for governance and training. ★85 · Updated last month
- KETOD: Knowledge-Enriched Task-Oriented Dialogue ★32 · Updated 2 years ago
- The official code of "SCROLLS: Standardized CompaRison Over Long Language Sequences" (EMNLP 2022). ★70 · Updated last year
- The official repository for the paper Efficient Long-Text Understanding Using Short-Text Models (Ivgi et al., 2022) ★69 · Updated 2 years ago
- Pre-training BART in Flax on The Pile dataset ★21 · Updated 3 years ago
- ★97 · Updated 2 years ago
- DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization (ACL 2022) ★50 · Updated 2 years ago
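
Contrastive search (the TMLR'23 entry above) is supported natively by 🤗 Transformers' `generate()`: passing `penalty_alpha` together with a small `top_k` switches it on. A minimal sketch, with `gpt2` and the hyperparameter values chosen for illustration rather than taken from that repository:

```python
# Minimal contrastive-search sketch; gpt2 and the penalty_alpha/top_k values
# are illustrative choices, not taken from the repository above.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("DeepMind Company is", return_tensors="pt")
# penalty_alpha > 0 plus a small top_k activates contrastive search in generate().
out = model.generate(**inputs, penalty_alpha=0.6, top_k=4, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```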
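
The stop-sequencer entry above halts generation once a given stop text appears in the output. A minimal sketch of the same idea built on the public `StoppingCriteria` API; the `StopOnText` class, model, and stop string are placeholders, not that repository's actual interface:

```python
# Sketch of a stop sequencer via StoppingCriteria; StopOnText, gpt2, and the
# stop string are hypothetical placeholders, not the repository's own API.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          StoppingCriteria, StoppingCriteriaList)

class StopOnText(StoppingCriteria):
    def __init__(self, tokenizer, stop_text, prompt_len):
        self.tokenizer = tokenizer
        self.stop_text = stop_text
        self.prompt_len = prompt_len  # skip the prompt when scanning for the stop text

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        generated = self.tokenizer.decode(input_ids[0, self.prompt_len:])
        return self.stop_text in generated

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Q: What is a TPU?\nA:", return_tensors="pt")
criteria = StoppingCriteriaList([StopOnText(tokenizer, "\nQ:", inputs.input_ids.shape[1])])
out = model.generate(**inputs, max_new_tokens=64, stopping_criteria=criteria)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

A complete stop sequencer would also trim the stop text from the returned string; this sketch only halts generation.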
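
The DeepSpeed entry above trains 🤗 models under ZeRO. As a rough illustration of how ZeRO-2 is typically wired into the stock `Trainer` — the model, dataset, and all config values below are assumptions for the sketch, not taken from that repo:

```python
# Illustrative ZeRO-2 setup with the stock 🤗 Trainer; model, dataset, and all
# config values are example choices, not the repository's. Launch with:
#   deepspeed train.py
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

ds_config = {
    "zero_optimization": {"stage": 2},         # ZeRO-2: shard optimizer state and gradients
    "train_micro_batch_size_per_gpu": "auto",  # "auto" defers to TrainingArguments
    "gradient_accumulation_steps": "auto",
    "fp16": {"enabled": True},
}

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128, padding="max_length")

train_set = load_dataset("wikitext", "wikitext-2-raw-v1", split="train").map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    fp16=True,
    deepspeed=ds_config,  # accepts a dict or a path to a JSON config file
)
# For causal LM training the collator copies input_ids into labels.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
Trainer(model=model, args=args, train_dataset=train_set, data_collator=collator).train()
```

ZeRO-3 would additionally shard the parameters themselves (`"stage": 3`), at the cost of more communication.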