Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆97Feb 9, 2023Updated 3 years ago
Alternatives and similar repositories for olm-training
Users that are interested in olm-training are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Pipeline for pulling and processing online language model pretraining data from the web☆179Jul 31, 2023Updated 2 years ago
- LTG-Bert☆34Jan 8, 2024Updated 2 years ago
- A tiny BERT for low-resource monolingual models☆31Dec 24, 2025Updated 4 months ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆59Jan 12, 2023Updated 3 years ago
- Experiments for XLM-V Transformers Integeration☆13Feb 8, 2023Updated 3 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- **ARCHIVED** Filesystem interface to 🤗 Hub☆59Apr 6, 2023Updated 3 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆99Apr 26, 2023Updated 3 years ago
- decontamination☆33Mar 4, 2026Updated 2 months ago
- Experiments for efforts to train a new and improved t5☆76Apr 15, 2024Updated 2 years ago
- ☆16Mar 3, 2024Updated 2 years ago
- An open collection of implementation tips, tricks and resources for training large language models☆499Mar 8, 2023Updated 3 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆29Apr 17, 2024Updated 2 years ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Aug 5, 2023Updated 2 years ago
- LARCH: Large Language Model-based Automatic Readme Creation with Heuristics☆17Jul 1, 2023Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)☆22Nov 1, 2023Updated 2 years ago
- Local emulator for Hugging Face Inference Endpoints customer handlers☆27Apr 3, 2026Updated last month
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆23Mar 12, 2024Updated 2 years ago
- ☆30Sep 27, 2021Updated 4 years ago
- A Streamlit app to add structured tags to a dataset card☆22Jun 30, 2022Updated 3 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering☆174Jun 6, 2021Updated 4 years ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆23Aug 18, 2024Updated last year
- A repository to get acquainted with basic training tasks in natural language processing and machine learning☆11Dec 27, 2023Updated 2 years ago
- ☆13Mar 27, 2020Updated 6 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Long-context pretrained encoder-decoder models☆96Oct 28, 2022Updated 3 years ago
- ☆102Dec 17, 2022Updated 3 years ago
- A package for fine tuning of pretrained NLP transformers using Semi Supervised Learning☆14Oct 27, 2021Updated 4 years ago
- Staged Training for Transformer Language Models☆33Mar 31, 2022Updated 4 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…☆139Aug 2, 2023Updated 2 years ago
- Hugging Face and Pyserini interoperability☆19May 18, 2023Updated 3 years ago
- Adding new tasks to T0 without catastrophic forgetting☆33Oct 20, 2022Updated 3 years ago
- Fast & Simple repository for pre-training and fine-tuning T5-style models☆1,019Aug 21, 2024Updated last year
- This repository contains the code for paper Prompting ELECTRA Few-Shot Learning with Discriminative Pre-Trained Models.☆48Jun 7, 2022Updated 3 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder☆10Mar 16, 2023Updated 3 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆90Sep 12, 2024Updated last year
- PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an…☆286Oct 20, 2022Updated 3 years ago
- Efficient few-shot learning with Sentence Transformers☆2,741Apr 17, 2026Updated last month
- Scaling Data-Constrained Language Models☆342Jun 28, 2025Updated 10 months ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning☆33Jan 9, 2025Updated last year
- ☆50Mar 14, 2024Updated 2 years ago