☆17 · Mar 23, 2025 · Updated last year
Alternatives and similar repositories for GREATS
Users that are interested in GREATS are comparing it to the libraries listed below.
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*… ☆13 · Aug 8, 2025 · Updated 7 months ago
- Less is More: High-value Data Selection for Visual Instruction Tuning ☆17 · Jan 18, 2025 · Updated last year
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆80 · May 2, 2025 · Updated 10 months ago
- This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (… ☆14 · Oct 26, 2023 · Updated 2 years ago
- ☆10 · Oct 20, 2023 · Updated 2 years ago
- Introduction about AWESOME_ENTROPY+LRM_PAPERS ☆30 · Dec 16, 2025 · Updated 3 months ago
- ☆33 · Feb 11, 2025 · Updated last year
- Codebase for ICML submission "DOGE: Domain Reweighting with Generalization Estimation" ☆21 · Feb 29, 2024 · Updated 2 years ago
- ☆52 · Jan 24, 2024 · Updated 2 years ago
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models ☆48 · Oct 31, 2023 · Updated 2 years ago
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆120 · Updated this week
- AI Logging for Interpretability and Explainability🔬 ☆140 · Jun 7, 2024 · Updated last year
- Exploration of automated dataset selection approaches at large scales. ☆53 · Mar 4, 2025 · Updated last year