huggingface / datablations
Scaling Data-Constrained Language Models
☆334 Updated 8 months ago
Alternatives and similar repositories for datablations
Users who are interested in datablations are comparing it to the repositories listed below.
- DSIR large-scale data selection framework for language model training ☆249 Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆254 Updated last year
- Simple next-token-prediction for RLHF ☆226 Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters ☆257 Updated 10 months ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆188 Updated 9 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024) ☆202 Updated last year
- ☆159 Updated 2 years ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ☆178 Updated 8 months ago
- Self-Alignment with Principle-Following Reward Models ☆161 Updated 3 weeks ago
- A repository for research on medium-sized language models. ☆495 Updated 3 weeks ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆150 Updated last year
- Pre-training code for Amber 7B LLM ☆166 Updated last year
- Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467 ☆285 Updated 3 months ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆302 Updated last year
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆461 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆97 Updated last year
- ☆258 Updated last year
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning ☆396 Updated last year
- Code and data for "Lost in the Middle: How Language Models Use Long Contexts" ☆344 Updated last year
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆218 Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆241 Updated last year
- Reproducible, flexible LLM evaluations ☆203 Updated 3 weeks ago
- Code repository for the c-BTM paper ☆106 Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆241 Updated 6 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆417 Updated last year
- Recurrent Memory Transformer ☆148 Updated last year
- PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets ☆328 Updated last year
- Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d… ☆202 Updated 9 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆102 Updated 2 years ago
- RLHF implementation details of OAI's 2019 codebase ☆187 Updated last year