elianap / divexplorer
☆11 · Updated 2 years ago
Alternatives and similar repositories for divexplorer:
Users who are interested in divexplorer are comparing it to the libraries listed below:
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆77 · Updated 4 months ago
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ☆82 · Updated 2 years ago
- ☆127 · Updated 2 years ago
- PyTorch reimplementation of REALM and ORQA ☆22 · Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆93 · Updated last year
- ☆20 · Updated 2 years ago
- ☆73 · Updated last year
- Evaluation pipeline for the BabyLM Challenge 2023. ☆75 · Updated last year
- Ensembling Hugging Face transformers made easy ☆63 · Updated 2 years ago
- Code for "Tracing Knowledge in Language Models Back to the Training Data" ☆37 · Updated 2 years ago
- ☆11 · Updated last year
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆47 · Updated 2 years ago
- babyLM WhisBERT code ☆18 · Updated 8 months ago
- My explorations into editing the knowledge and memories of an attention network ☆34 · Updated 2 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆97 · Updated last year
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings. ☆71 · Updated 5 months ago
- ☆10 · Updated last month
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆97 · Updated last year
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022) ☆30 · Updated 2 years ago
- Measuring the Mixing of Contextual Information in the Transformer ☆27 · Updated last year
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆55 · Updated 2 years ago
- ITALIC: An ITALian Intent Classification Dataset ☆11 · Updated last year
- Randomized Positional Encodings Boost Length Generalization of Transformers ☆79 · Updated 10 months ago
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ☆24 · Updated last month
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 · Updated 2 years ago
- Apps built using Inspired Cognition's Critique. ☆58 · Updated last year
- Interpretable unified language safety checking with large language models ☆30 · Updated last year
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 · Updated 2 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆56 · Updated 7 months ago