bespokelabsai / curator
Synthetic Data curation for post-training and structured data extraction
☆816 Updated this week
Alternatives and similar repositories for curator:
Users who are interested in curator compare it to the libraries listed below:
- Recipes to scale inference-time compute of open models ☆1,002 Updated last month
- Automatic Evals for LLMs ☆266 Updated this week
- ☆1,006 Updated 2 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆411 Updated 4 months ago
- Search-o1: Agentic Search-Enhanced Large Reasoning Models ☆628 Updated last week
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆732 Updated 3 weeks ago
- System 2 Reasoning Link Collection ☆794 Updated 2 weeks ago
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,141 Updated 3 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,448 Updated this week
- Code and Data for Tau-Bench ☆272 Updated 3 weeks ago
- OO for LLMs ☆634 Updated last week
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ☆336 Updated 8 months ago
- ☆438 Updated 4 months ago
- ☆1,326 Updated 3 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆616 Updated last month
- AIDE: the state-of-the-art machine learning engineer agent, generating machine learning solution code from natural language descriptions. ☆740 Updated last week
- 🤠 Agent-as-a-Judge and DevAI dataset ☆322 Updated last month
- Optimizing inference proxy for LLMs ☆2,040 Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space ☆877 Updated 3 weeks ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆235 Updated this week
- Code for Quiet-STaR ☆713 Updated 5 months ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,019 Updated last month
- ☆806 Updated 5 months ago
- ☆496 Updated 3 months ago
- An Open Source Toolkit For LLM Distillation ☆496 Updated last month
- Fast State-of-the-Art Static Embeddings ☆1,060 Updated this week
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆969 Updated 2 weeks ago