locuslab / scaling_laws_data_filtering
☆64, updated last year
Alternatives and similar repositories for scaling_laws_data_filtering:
Users interested in scaling_laws_data_filtering are comparing it to the repositories listed below:
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆47, updated last year
- Codebase for Instruction Following without Instruction Tuning ☆34, updated 6 months ago
- [NeurIPS-2024] Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆82, updated 6 months ago
- Exploration of automated dataset selection approaches at large scales. ☆37, updated last month
- ☆98, updated 6 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆47, updated 3 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆49, updated 2 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆80, updated 8 months ago
- Long Context Extension and Generalization in LLMs ☆53, updated 6 months ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆36, updated 10 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42, updated last year
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models ☆46, updated last month
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆74, updated 10 months ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆76, updated last year
- Code for preprint "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)" ☆37, updated 3 weeks ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆31, updated last month
- DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails ☆20, updated last month
- ☆45, updated last month
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆25, updated 4 months ago
- Reproduction of "RLCD Reinforcement Learning from Contrast Distillation for Language Model Alignment" ☆68, updated last year
- Official Repository of Are Your LLMs Capable of Stable Reasoning? ☆25, updated 3 weeks ago
- Automatic prompt optimization framework for multi-step agent tasks. ☆29, updated 5 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆50, updated 2 weeks ago
- The code and data for the paper JiuZhang3.0 ☆43, updated 10 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58, updated last year
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆27, updated 6 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆48, updated last month
- ☆59, updated 7 months ago
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆90, updated this week
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆28, updated 3 weeks ago