An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library.
☆13Jan 9, 2024Updated 2 years ago
Alternatives and similar repositories for online-data-mixing
Users who are interested in online-data-mixing are comparing it to the libraries listed below.
- The repository contains code for Adaptive Data Optimization☆32Dec 9, 2024Updated last year
- ☆13Dec 12, 2025Updated 2 months ago
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆13Aug 8, 2025Updated 7 months ago
- Forcing Diffuse Distributions out of Language Models☆18Sep 10, 2024Updated last year
- ☆109Jul 15, 2025Updated 7 months ago
- Code for "Tracing Knowledge in Language Models Back to the Training Data"☆39Dec 27, 2022Updated 3 years ago
- Exploration of automated dataset selection approaches at large scales.☆52Mar 4, 2025Updated last year
- Learning to route instances for Human vs AI Feedback (ACL Main '25)☆27Jul 23, 2025Updated 7 months ago
- This is the official implementation for our ACL 2024 paper: "Causal Estimation of Memorisation Profiles".☆24Mar 25, 2025Updated 11 months ago
- ☆14Jun 19, 2024Updated last year
- Few-shot Learning with Auxiliary Data☆31Dec 8, 2023Updated 2 years ago
- [NeurIPS 2023] GitHub repository for "Composing Parameter-Efficient Modules with Arithmetic Operations"☆61Nov 26, 2023Updated 2 years ago
- A Survey on Data Selection for Language Models☆254Apr 29, 2025Updated 10 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]☆79Nov 14, 2024Updated last year
- The LM Contamination Index is a manually created database of contamination evidence for LMs.☆82Apr 11, 2024Updated last year
- A Data-Driven Approach to Predict the Success of Bank Telemarketing☆10Apr 27, 2021Updated 4 years ago
- ☆52Oct 23, 2023Updated 2 years ago
- Implementation of Beyond Neural Scaling beating power laws for deep models and prototype-based models☆34Oct 30, 2025Updated 4 months ago
- Debiasing Through Data Attribution☆12May 23, 2024Updated last year
- ☆10Jul 16, 2023Updated 2 years ago
- Code for COLING 2022 accepted paper titled "MuCDN: Mutual Conversational Detachment Network for Emotion Recognition in Multi-Party Conver…☆10Jul 21, 2023Updated 2 years ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- ☆14Aug 28, 2024Updated last year
- Guide to interviewing for industry machine learning roles (data/applied/research scientist, ML engineer, etc).☆11Dec 28, 2022Updated 3 years ago
- A collection of demos and utilities prepared ahead of the Vector Institute Privacy Enhancing Techniques (PETs) Bootcamp.☆15Sep 22, 2022Updated 3 years ago
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024]☆39May 28, 2024Updated last year
- Official Repository for Dataset Inference for LLMs☆42Jul 25, 2024Updated last year
- f-PO: Generalizing Preference Optimization with f-divergence Minimization☆13Apr 2, 2025Updated 11 months ago
- [AAAI26] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie…☆10Feb 7, 2026Updated last month
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 7 months ago
- Tiny evaluation of leading LLMs on competitive programming problems☆14Nov 28, 2024Updated last year
- [Ongoing Project] Codebase for network quantization study.☆12May 20, 2020Updated 5 years ago
- A MATLAB package for analyzing chaotic properties of time series data☆11Jun 29, 2018Updated 7 years ago
- Repo containing a few notebooks on fine-tuning of Language Models☆13Apr 29, 2024Updated last year
- Graphical user interface for text-guided face editing☆11Jan 18, 2023Updated 3 years ago
- ☆30Jan 8, 2026Updated 2 months ago
- List Flower resources☆12Feb 4, 2022Updated 4 years ago
- 【ICME2025 Oral】Official PyTorch Code for "Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition"☆11Mar 21, 2025Updated 11 months ago
- A JAX benchmark for ad hoc teamwork☆19Updated this week