argilla-io / synthetic-data-generator
Build datasets using natural language
★558 · Updated 3 months ago
Alternatives and similar repositories for synthetic-data-generator
Users who are interested in synthetic-data-generator are comparing it to the libraries listed below.
- 🤗 Benchmark Large Language Models Reliably On Your Data ★423 · Updated 2 weeks ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ★352 · Updated 7 months ago
- awesome synthetic (text) datasets ★321 · Updated last week
- Automatically evaluate your LLMs in Google Colab ★682 · Updated last year
- Automatically annotate papers using LLMs ★398 · Updated last month
- Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline ★811 · Updated this week
- A Lightweight Library for AI Observability ★253 · Updated 10 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization ★276 · Updated last year
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ★491 · Updated 5 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ★495 · Updated 4 months ago
- [EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge! ★461 · Updated last year
- Code for explaining and evaluating late chunking (chunked pooling) ★482 · Updated last year
- Simple UI for debugging correlations of text embeddings ★305 · Updated 7 months ago
- An open-source tool for LLM prompt optimization. ★742 · Updated last week
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ★183 · Updated last year
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ★838 · Updated 11 months ago
- ★695 · Updated 8 months ago
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ★764 · Updated last month
- A flexible, adaptive classification system for dynamic text classification ★522 · Updated 3 months ago
- Structured information extraction from documents ★320 · Updated last year
- An Open Source Toolkit For LLM Distillation ★823 · Updated 3 weeks ago
- ★269 · Updated 6 months ago
- Together Open Deep Research ★354 · Updated 9 months ago
- ★236 · Updated last month
- Tool for generating high quality Synthetic datasets ★1,463 · Updated 2 months ago
- Fast Semantic Text Deduplication & Filtering ★866 · Updated this week
- Tutorial for building LLM router ★242 · Updated last year
- One click templates for inferencing Language Models ★224 · Updated last month
- Solving data for LLMs - Create quality synthetic datasets! ★151 · Updated 11 months ago
- A collection of notebooks/recipes showcasing usecases of open-source models with Together AI. ★1,089 · Updated last week