argilla-io / synthetic-data-generator
Build datasets using natural language
☆396 Updated last week
Alternatives and similar repositories for synthetic-data-generator:
Users who are interested in synthetic-data-generator are comparing it to the libraries listed below.
- Generate large synthetic data using an LLM ☆386 Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆259 Updated 2 months ago
- Solving data for LLMs - Create quality synthetic datasets! ☆145 Updated last month
- An Awesome list of curated DSPy resources. ☆288 Updated last week
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆205 Updated this week
- awesome synthetic (text) datasets ☆263 Updated 4 months ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆431 Updated last month
- A Lightweight Library for AI Observability ☆233 Updated last week
- 📝 Automatically annotate papers using LLMs ☆297 Updated 2 months ago
- An Open Source Toolkit For LLM Distillation ☆516 Updated last month
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆741 Updated last month
- Automatically evaluate your LLMs in Google Colab ☆593 Updated 9 months ago
- This project showcases an LLMOps pipeline that fine-tunes a small LLM to prepare for an outage of the service LLM. ☆301 Updated last week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆421 Updated 5 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆203 Updated 4 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆331 Updated 2 months ago
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆705 Updated last week
- Automatic evals for LLMs ☆304 Updated this week
- Fast State-of-the-Art Static Embeddings ☆1,077 Updated this week
- ☆618 Updated 2 months ago
- This is the official repository for Auto-RAG. ☆197 Updated last month
- Structured information extraction from documents ☆310 Updated 5 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆67 Updated last year
- Fast Semantic Text Deduplication ☆545 Updated this week
- A library for easily merging multiple LLM experts and efficiently training the merged LLM. ☆445 Updated 6 months ago
- ☆210 Updated 2 months ago
- [EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge! ☆370 Updated 5 months ago
- ☆335 Updated 3 weeks ago
- An example of multi-agent orchestration with llama-index ☆400 Updated last month