bespokelabsai / curator
Synthetic data curation for post-training and structured data extraction
☆1,618 · Updated last week
Alternatives and similar repositories for curator
Users interested in curator are comparing it to the libraries listed below.
- Recipes to scale inference-time compute of open models ☆1,124 · Updated 8 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆3,074 · Updated last week
- Optimizing inference proxy for LLMs ☆3,299 · Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,279 · Updated last week
- Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification. ☆682 · Updated 10 months ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,326 · Updated 2 weeks ago
- ☆1,033 · Updated last year
- Automatic evals for LLMs ☆578 · Updated last month
- An Open Source Toolkit for LLM Distillation ☆846 · Updated last month
- A reading list on LLM-based Synthetic Data Generation 🔥 ☆1,512 · Updated 7 months ago
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆1,088 · Updated last year
- An Open Large Reasoning Model for Real-World Solutions ☆1,533 · Updated 8 months ago
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,061 · Updated 6 months ago
- Our library for RL environments + evals ☆3,791 · Updated this week
- Tool for generating high-quality synthetic datasets ☆1,476 · Updated 3 months ago
- Training Large Language Models to Reason in a Continuous Latent Space ☆1,491 · Updated 5 months ago
- AllenAI's post-training codebase ☆3,551 · Updated this week
- ☆1,385 · Updated 4 months ago
- ☆1,189 · Updated last month
- Code and Data for Tau-Bench ☆1,079 · Updated 5 months ago
- Arena-Hard-Auto: An automatic LLM benchmark. ☆991 · Updated 7 months ago
- Evaluate your LLM's response with Prometheus and GPT-4 💯 ☆1,038 · Updated 9 months ago
- A library for advanced large language model reasoning ☆2,326 · Updated 7 months ago
- Fully open data curation for reasoning models ☆2,200 · Updated 2 months ago
- AIDE: AI-Driven Exploration in the Space of Code. The machine learning engineering agent that automates AI R&D. ☆1,123 · Updated 2 months ago
- [NeurIPS 2025] Atom of Thoughts for Markov LLM Test-Time Scaling ☆640 · Updated 2 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆1,295 · Updated 2 weeks ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆2,044 · Updated last month
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆843 · Updated last week
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆425 · Updated last month