elianap / divexplorer
☆11 · Updated 2 years ago
Alternatives and similar repositories for divexplorer:
Users interested in divexplorer are comparing it to the repositories listed below.
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 · Updated 2 years ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆48 · Updated 2 years ago
- Do Multilingual Language Models Think Better in English? ☆41 · Updated last year
- RATransformers 🐭 - Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware! ☆41 · Updated 2 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆80 · Updated 7 months ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆55 · Updated 2 years ago
- Pre-training BART model for the Italian Language ☆15 · Updated 2 years ago
- ☆72 · Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆57 · Updated 10 months ago
- Official code for Wav2Seq ☆96 · Updated 2 years ago
- A PyTorch implementation of the paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021 ☆47 · Updated 3 years ago
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022) ☆30 · Updated 3 years ago
- PyTorch reimplementation of REALM and ORQA ☆22 · Updated 3 years ago
- Code for "Tracing Knowledge in Language Models Back to the Training Data" ☆37 · Updated 2 years ago
- A curated list of research papers and resources on Cultural LLM. ☆41 · Updated 7 months ago
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ☆82 · Updated 3 years ago
- [NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers ☆21 · Updated last year
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆26 · Updated last year
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings. ☆71 · Updated 8 months ago
- ☆72 · Updated 11 months ago
- Measuring the Mixing of Contextual Information in the Transformer ☆29 · Updated last year
- ☆128 · Updated 2 years ago
- ☆38 · Updated last year
- My explorations into editing the knowledge and memories of an attention network ☆34 · Updated 2 years ago
- The InterScript dataset contains interactive user feedback on scripts generated by a T5-XXL model. ☆11 · Updated 3 years ago
- ITALIC: An ITALian Intent Classification Dataset ☆12 · Updated last year
- ☆51 · Updated last year
- Randomized Positional Encodings Boost Length Generalization of Transformers ☆80 · Updated last year
- Evaluation pipeline for the BabyLM Challenge 2023. ☆75 · Updated last year
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆108 · Updated last year