bespokelabsai / curator
Synthetic data curation for post-training and structured data extraction
☆1,594 Updated last week
Alternatives and similar repositories for curator
Users who are interested in curator compare it to the libraries listed below.
- Recipes to scale inference-time compute of open models ☆1,123 Updated 7 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,233 Updated last week
- Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification. ☆679 Updated 9 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆3,039 Updated 3 weeks ago
- Automatic evals for LLMs ☆574 Updated 3 weeks ago
- An Open Source Toolkit For LLM Distillation ☆819 Updated 3 weeks ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,304 Updated 3 weeks ago
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆1,029 Updated 8 months ago
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,496 Updated 7 months ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,535 Updated 7 months ago
- Optimizing inference proxy for LLMs ☆3,266 Updated 2 weeks ago
- Fully open data curation for reasoning models ☆2,183 Updated last month
- ☆1,032 Updated last year
- Our library for RL environments + evals ☆3,730 Updated this week
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆423 Updated 2 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,449 Updated 5 months ago
- AllenAI's post-training codebase ☆3,515 Updated this week
- ☆1,173 Updated 3 weeks ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆2,029 Updated last month
- ☆1,344 Updated last year
- Tool for generating high quality Synthetic datasets ☆1,455 Updated 2 months ago
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,061 Updated 5 months ago
- A library for advanced large language model reasoning ☆2,319 Updated 7 months ago
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆1,087 Updated 11 months ago
- OLMoE: Open Mixture-of-Experts Language Models ☆945 Updated 3 months ago
- Arena-Hard-Auto: An automatic LLM benchmark. ☆978 Updated 6 months ago
- An agent benchmark with tasks in a simulated software company. ☆619 Updated last month
- [NeurIPS 2025] Atom of Thoughts for Markov LLM Test-Time Scaling ☆631 Updated last month
- Sky-T1: Train your own O1 preview model within $450 ☆3,367 Updated 6 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,812 Updated this week