argilla-io / synthetic-data-generator
Build datasets using natural language
☆305 Updated last week
Alternatives and similar repositories for synthetic-data-generator:
Users who are interested in synthetic-data-generator compare it to the libraries listed below.
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆251 Updated last month
- Solving data for LLMs - Create quality synthetic datasets! ☆144 Updated last week
- 📝 Automatically annotate papers using LLMs ☆281 Updated last month
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆426 Updated 2 weeks ago
- WIP - Allows you to create DSPy pipelines using ComfyUI ☆188 Updated last month
- Flexible and powerful multi-agent AI framework ☆339 Updated last week
- Generate large synthetic data using an LLM ☆368 Updated this week
- This is the official repository for Auto-RAG. ☆185 Updated 2 weeks ago
- Structured information extraction from documents ☆299 Updated 4 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆269 Updated 6 months ago
- awesome synthetic (text) datasets ☆256 Updated 3 months ago
- Simple examples using Argilla tools to build AI ☆52 Updated 2 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆705 Updated this week
- A Lightweight Library for AI Observability ☆231 Updated this week
- An Open Source Toolkit For LLM Distillation ☆442 Updated 3 weeks ago
- Official repository for "DynaSaur: Large Language Agents Beyond Predefined Actions" ☆300 Updated last month
- A user interface for DSPy ☆135 Updated 3 months ago
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ☆332 Updated 7 months ago
- ☆90 Updated last month
- ☆243 Updated last month
- Official code of the paper "SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation" ☆87 Updated last month
- This project showcases an LLMOps pipeline that fine-tunes a small-size LLM model to prepare for the outage of the service LLM. ☆294 Updated last month
- Synthetic Data curation for post-training and structured data extraction ☆575 Updated this week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆394 Updated 4 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 Updated 9 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆314 Updated last month
- Fast parallel LLM inference for MLX ☆153 Updated 6 months ago
- Serverless Modal + FastAPI + React + ColPali + Qdrant + GPT4o Vision RAG (V-RAG) Demo ☆340 Updated 2 months ago
- Framework for building, orchestrating and deploying multi-agent systems. Managed by OpenAI Solutions team. Experimental framework. ☆86 Updated 3 months ago