argilla-io / synthetic-data-generator
Build datasets using natural language
☆493 Updated last month
Alternatives and similar repositories for synthetic-data-generator
Users who are interested in synthetic-data-generator are comparing it to the libraries listed below:
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆332 Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆314 Updated 3 weeks ago
- awesome synthetic (text) datasets ☆282 Updated 7 months ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆467 Updated 5 months ago
- A flexible, adaptive classification system for dynamic text classification ☆240 Updated last week
- Code for explaining and evaluating late chunking (chunked pooling) ☆403 Updated 6 months ago
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆741 Updated last month
- Fast Semantic Text Deduplication & Filtering ☆738 Updated 3 weeks ago
- Tool for generating high quality Synthetic datasets ☆958 Updated 2 weeks ago
- ☆668 Updated last month
- Generate large synthetic data using an LLM ☆426 Updated this week
- A Lightweight Library for AI Observability ☆246 Updated 4 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆797 Updated 4 months ago
- An Open Source Toolkit For LLM Distillation ☆651 Updated 3 weeks ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆225 Updated 7 months ago
- Automatically evaluate your LLMs in Google Colab ☆643 Updated last year
- Synthetic data curation for post-training and structured data extraction ☆1,414 Updated last week
- ☆149 Updated 2 months ago
- An Awesome list of curated DSPy resources. ☆346 Updated 4 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 Updated 11 months ago
- Make any LLM to think like OpenAI o1 and deepseek R1 ☆490 Updated 4 months ago
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,310 Updated 3 weeks ago
- Late Interaction Models Training & Retrieval ☆452 Updated 2 weeks ago
- 📝 Automatically annotate papers using LLMs ☆324 Updated 2 months ago
- An open-source tool for seamless migration from other LLMs to Llama, and for general prompt optimization. ☆493 Updated this week
- Fast State-of-the-Art Static Embeddings ☆1,740 Updated 3 weeks ago
- Together Open Deep Research ☆309 Updated 2 months ago
- Solving data for LLMs - Create quality synthetic datasets! ☆149 Updated 5 months ago
- Simple examples using Argilla tools to build AI ☆53 Updated 7 months ago
- Let's build better datasets, together! ☆259 Updated 6 months ago