Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆96Feb 9, 2023Updated 3 years ago
Alternatives and similar repositories for olm-training
Users that are interested in olm-training are comparing it to the libraries listed below
Sorting:
- Pipeline for pulling and processing online language model pretraining data from the web☆177Jul 31, 2023Updated 2 years ago
- LTG-Bert☆34Jan 8, 2024Updated 2 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆98Apr 26, 2023Updated 2 years ago
- **ARCHIVED** Filesystem interface to 🤗 Hub☆59Apr 6, 2023Updated 2 years ago
- An open collection of implementation tips, tricks and resources for training large language models☆498Mar 8, 2023Updated 2 years ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆59Jan 12, 2023Updated 3 years ago
- A tiny BERT for low-resource monolingual models☆31Dec 24, 2025Updated 2 months ago
- decontamination☆26Dec 3, 2025Updated 3 months ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆28Apr 17, 2024Updated last year
- Local emulator for Hugging Face Inference Endpoints customer handlers☆27Jul 25, 2023Updated 2 years ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆23Mar 12, 2024Updated last year
- Experiments for XLM-V Transformers Integeration☆13Feb 8, 2023Updated 3 years ago
- ☆16Mar 3, 2024Updated 2 years ago
- A Streamlit app to add structured tags to a dataset card☆22Jun 30, 2022Updated 3 years ago
- Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard fea…☆15Jun 12, 2023Updated 2 years ago
- A package for fine tuning of pretrained NLP transformers using Semi Supervised Learning☆14Oct 27, 2021Updated 4 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…☆137Aug 2, 2023Updated 2 years ago
- ☆30Sep 27, 2021Updated 4 years ago
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 3 years ago
- This repository contains the code for paper Prompting ELECTRA Few-Shot Learning with Discriminative Pre-Trained Models.☆48Jun 7, 2022Updated 3 years ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Aug 5, 2023Updated 2 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Sep 19, 2025Updated 5 months ago
- Experiments for efforts to train a new and improved t5☆76Apr 15, 2024Updated last year
- Adding new tasks to T0 without catastrophic forgetting☆33Oct 20, 2022Updated 3 years ago
- Long-context pretrained encoder-decoder models☆96Oct 28, 2022Updated 3 years ago
- Hugging Face and Pyserini interoperability☆19May 18, 2023Updated 2 years ago
- Scaling Data-Constrained Language Models☆342Jun 28, 2025Updated 8 months ago
- Lite weight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆34Aug 24, 2024Updated last year
- Official repository for "Scaling Retrieval-Based Langauge Models with a Trillion-Token Datastore".☆224Dec 16, 2025Updated 2 months ago
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)☆22Nov 1, 2023Updated 2 years ago
- Efficient few-shot learning with Sentence Transformers☆2,688Dec 11, 2025Updated 2 months ago
- ☆50Mar 14, 2024Updated last year
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆68Feb 8, 2023Updated 3 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering☆175Jun 6, 2021Updated 4 years ago
- Prune a model while finetuning or training.☆406Jun 21, 2022Updated 3 years ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning☆33Jan 9, 2025Updated last year
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆88Sep 12, 2024Updated last year
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀☆1,687Oct 23, 2024Updated last year