argilla-io / synthetic-data-generator
Build datasets using natural language
☆527 Updated 4 months ago
Alternatives and similar repositories for synthetic-data-generator
Users who are interested in synthetic-data-generator compare it to the libraries listed below.
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆391 Updated last week
- Create large-scale synthetic training data for model distillation and evaluation ☆444 Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆331 Updated 3 months ago
- Automatically annotate papers using LLMs ☆353 Updated 4 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆447 Updated 8 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 Updated last year
- Automatically evaluate your LLMs in Google Colab ☆658 Updated last year
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆437 Updated 3 weeks ago
- A Lightweight Library for AI Observability ☆251 Updated 6 months ago
- ☆155 Updated 4 months ago
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆748 Updated 3 months ago
- awesome synthetic (text) datasets ☆296 Updated 2 months ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆480 Updated last month
- Tool for generating high quality Synthetic datasets ☆1,183 Updated last month
- An open-source tool for general prompt optimization. ☆616 Updated 3 weeks ago
- ☆262 Updated 2 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆178 Updated 11 months ago
- [EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge! ☆445 Updated last year
- Together Open Deep Research ☆346 Updated 5 months ago
- ☆680 Updated 4 months ago
- ☆231 Updated 2 months ago
- A flexible, adaptive classification system for dynamic text classification ☆442 Updated 3 weeks ago
- LettuceDetect is a hallucination detection framework for RAG applications. ☆487 Updated last week
- An Open Source Toolkit For LLM Distillation ☆724 Updated 2 months ago
- Solving data for LLMs - Create quality synthetic datasets! ☆150 Updated 7 months ago
- II-Researcher: a new open-source framework designed to aid building search / research agents ☆470 Updated last month
- Simple UI for debugging correlations of text embeddings ☆291 Updated 3 months ago
- Tutorial for building LLM router ☆226 Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆240 Updated 10 months ago
- Fast Semantic Text Deduplication & Filtering ☆803 Updated last week