argilla-io / synthetic-data-generator

Build datasets using natural language

☆507

Alternatives and similar repositories for synthetic-data-generator

Users who are interested in synthetic-data-generator are comparing it to the libraries listed below.

huggingface / yourbench
🤗 Benchmark Large Language Models Reliably On Your Data
☆381 Updated this week
StacklokLabs / promptwright
Generate large synthetic data using an LLM
☆438 Updated this week
tonywu71 / colpali-cookbooks
Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳
☆318 Updated 2 months ago
mlabonne / llm-autoeval
Automatically evaluate your LLMs in Google Colab
☆649 Updated last year
neuml / annotateai
📝 Automatically annotate papers using LLMs
☆332 Updated 3 months ago
cfahlgren1 / observers
A Lightweight Library for AI Observability
☆250 Updated 5 months ago
lamini-ai / Lamini-Memory-Tuning
Banishing LLM Hallucinations Requires Rethinking Generalization
☆276 Updated last year
IntelLabs / RAG-FiT
Framework for enhancing LLMs for RAG tasks using fine-tuning.
☆747 Updated 2 months ago
adithya-s-k / VARAG
Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine
☆477 Updated 2 weeks ago
huggingface / huggingface-gemma-recipes
Inference, Fine Tuning and many more recipes with Gemma family of models
☆262 Updated 2 weeks ago
SqueezeAILab / TinyAgent
[EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge!
☆429 Updated 11 months ago
jina-ai / late-chunking
Code for explaining and evaluating late chunking (chunked pooling)
☆427 Updated 7 months ago
jina-ai / correlations
Simple UI for debugging correlations of text embeddings
☆288 Updated 2 months ago
togethercomputer / open_deep_research
Together Open Deep Research
☆331 Updated 3 months ago
huggingface / huggingface-llama-recipes
☆677 Updated 3 months ago
rungalileo / agent-leaderboard
Ranking LLMs on agentic tasks
☆176 Updated 3 weeks ago
cvs-health / uqlm
UQLM: Uncertainty Quantification for Language Models, is a Python package for UQ-based LLM hallucination detection
☆824 Updated this week
arcee-ai / DistillKit
An Open Source Toolkit For LLM Distillation
☆698 Updated 3 weeks ago
ibm-granite / granite-3.0-language-models
☆260 Updated last month
menloresearch / ReZero
☆155 Updated 3 months ago
meta-llama / synthetic-data-kit
Tool for generating high quality Synthetic datasets
☆1,100 Updated this week
davanstrien / awesome-synthetic-datasets
awesome synthetic (text) datasets
☆291 Updated last month
ictnlp / Auto-RAG
This is the official repository for Auto-RAG.
☆218 Updated 2 weeks ago
google / lmeval
☆222 Updated last month
meta-llama / llama-prompt-ops
An open-source tool for general prompt optimization.
☆590 Updated this week
anyscale / llm-router
Tutorial for building LLM router
☆221 Updated last year
BhabhaAI / dataformer
Solving data for LLMs - Create quality synthetic datasets!
☆150 Updated 6 months ago
codelion / adaptive-classifier
A flexible, adaptive classification system for dynamic text classification
☆351 Updated 2 weeks ago
arcee-ai / EvolKit
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…
☆230 Updated 9 months ago
vndee / llm-sandbox
Lightweight and portable LLM sandbox runtime (code interpreter) Python library.
☆421 Updated this week