bespokelabsai / curator
Synthetic data curation for post-training and structured data extraction
☆1,209 · Updated this week
Alternatives and similar repositories for curator:
Users who are interested in curator are comparing it to the libraries listed below:
- ☆1,014 · Updated 3 months ago
- Optimizing inference proxy for LLMs · ☆2,150 · Updated last week
- Automatic evals for LLMs · ☆361 · Updated this week
- Verifiers for LLM Reinforcement Learning · ☆790 · Updated last week
- A reading list on LLM based Synthetic Data Generation 🔥 · ☆1,237 · Updated last month
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends · ☆1,400 · Updated this week
- ☆541 · Updated last week
- Recipes to scale inference-time compute of open models · ☆1,051 · Updated last month
- Fully open data curation for reasoning models · ☆1,697 · Updated last week
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… · ☆1,130 · Updated 3 months ago
- Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification. · ☆571 · Updated 3 weeks ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL · ☆1,847 · Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… · ☆2,620 · Updated last week
- ☆849 · Updated 7 months ago
- procedural reasoning datasets · ☆559 · Updated this week
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" · ☆495 · Updated 3 weeks ago
- Agentless🐱: an agentless approach to automatically solve software development problems · ☆1,616 · Updated 3 months ago
- OLMoE: Open Mixture-of-Experts Language Models · ☆704 · Updated last month
- Search-o1: Agentic Search-Enhanced Large Reasoning Models · ☆784 · Updated 2 weeks ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. · ☆768 · Updated 2 months ago
- [ICLR 2025] Automated Design of Agentic Systems · ☆1,250 · Updated 2 months ago
- An Open Large Reasoning Model for Real-World Solutions · ☆1,482 · Updated last month
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … · ☆673 · Updated 3 weeks ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) · ☆428 · Updated 6 months ago
- System 2 Reasoning Link Collection · ☆824 · Updated 3 weeks ago
- Official Implementation of "KBLaM: Knowledge Base augmented Language Model" · ☆1,229 · Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space · ☆1,051 · Updated 2 months ago
- A collection of notebooks/recipes showcasing usecases of open-source models with Together AI. · ☆850 · Updated last week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. · ☆1,375 · Updated 2 weeks ago
- Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas" · ☆1,114 · Updated last month