argilla-io / synthetic-data-generator
Build datasets using natural language
☆534 Updated last month
Alternatives and similar repositories for synthetic-data-generator
Users who are interested in synthetic-data-generator are comparing it to the libraries listed below.
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆406 Updated 3 weeks ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆336 Updated 4 months ago
- awesome synthetic (text) datasets ☆302 Updated 3 months ago
- A Lightweight Library for AI Observability ☆251 Updated 8 months ago
- Automatically evaluate your LLMs in Google Colab ☆660 Updated last year
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆275 Updated last year
- Automatically annotate papers using LLMs ☆357 Updated 6 months ago
- Tutorial for building LLM router ☆231 Updated last year
- Complete pipeline for Training Model Behavior in Agentic Systems ☆635 Updated this week
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆179 Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆240 Updated 11 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆463 Updated 2 months ago
- ☆687 Updated 5 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆453 Updated 10 months ago
- [EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge! ☆453 Updated last year
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆486 Updated 3 months ago
- An Open Source Toolkit For LLM Distillation ☆744 Updated 3 months ago
- ☆158 Updated 6 months ago
- Ranking LLMs on agentic tasks ☆194 Updated last month
- ☆264 Updated 4 months ago
- Simple UI for debugging correlations of text embeddings ☆296 Updated 4 months ago
- An open-source tool for LLM prompt optimization. ☆666 Updated 3 weeks ago
- ☆232 Updated 3 months ago
- [ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs ☆314 Updated 3 months ago
- This is the official repository for Auto-RAG. ☆226 Updated 3 months ago
- A compact LLM pretrained in 9 days by using high quality data ☆330 Updated 6 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆244 Updated 2 months ago
- One click templates for inferencing Language Models ☆215 Updated 2 months ago
- A library for easily merging multiple LLM experts, and efficiently train the merged LLM. ☆494 Updated last year
- Solving data for LLMs - Create quality synthetic datasets! ☆151 Updated 9 months ago