pytorch-tpu / transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
☆17 Updated last month
Alternatives and similar repositories for transformers:
Users who are interested in transformers compare it to the repositories listed below.
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆93 Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Updated 2 years ago
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆16 Updated last year
- ☆97 Updated 2 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆30 Updated 2 years ago
- ☆72 Updated last year
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆80 Updated 7 months ago
- Exploring finetuning public checkpoints on filter 8K sequences on Pile ☆115 Updated 2 years ago
- Calculating Expected Time for training LLM. ☆38 Updated 2 years ago
- PyTorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks ☆63 Updated 3 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆71 Updated last year
- ☆72 Updated last year
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆75 Updated 3 years ago
- Code for Zero-Shot Tokenizer Transfer ☆127 Updated 3 months ago
- Anh - LAION's multilingual assistant datasets and models ☆27 Updated 2 years ago
- Transformers at any scale ☆41 Updated last year
- ☆38 Updated last year
- ☆24 Updated 2 years ago
- ☆77 Updated last year
- Megatron LM 11B on Huggingface Transformers ☆27 Updated 3 years ago
- ☆21 Updated 2 years ago
- [TMLR'23] Contrastive Search Is What You Need For Neural Text Generation ☆119 Updated 2 years ago
- Official Code for M-RewardBench: Evaluating Reward Models in Multilingual Settings ☆28 Updated 2 months ago
- ☆44 Updated 4 years ago
- Pre-training BART in Flax on The Pile dataset ☆21 Updated 3 years ago
- SILO Language Models code repository ☆81 Updated last year
- Evaluation pipeline for the BabyLM Challenge 2023. ☆75 Updated last year
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆30 Updated 6 months ago
- Data-related codebase for the Polyglot project ☆19 Updated 2 years ago
- PyTorch/XLA SPMD test code on Google TPU ☆23 Updated last year