meta-llama / synthetic-data-kit

Tool for generating high-quality synthetic datasets

☆1,081

Alternatives and similar repositories for synthetic-data-kit

Users who are interested in synthetic-data-kit are comparing it to the libraries listed below.

meta-llama / llama-prompt-ops
An open-source tool for general prompt optimization.
☆576 Updated this week
huggingface / huggingface-llama-recipes
☆677 Updated 3 months ago
huggingface / evaluation-guidebook
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a…
☆1,498 Updated 6 months ago
argilla-io / synthetic-data-generator
Build datasets using natural language
☆505 Updated 2 months ago
huggingface / yourbench
🤗 Benchmark Large Language Models Reliably On Your Data
☆367 Updated this week
cvs-health / uqlm
UQLM: Uncertainty Quantification for Language Models, is a Python package for UQ-based LLM hallucination detection
☆824 Updated this week
bespokelabsai / curator
Synthetic data curation for post-training and structured data extraction
☆1,464 Updated 3 weeks ago
MinishLab / model2vec
Fast State-of-the-Art Static Embeddings
☆1,782 Updated this week
togethercomputer / together-cookbook
A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.
☆987 Updated last week
merveenoyan / smol-vision
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
☆1,540 Updated last week
StacklokLabs / promptwright
Generate large synthetic data using an LLM
☆438 Updated this week
mistralai / cookbook
☆1,927 Updated this week
AnswerDotAI / byaldi
Use late-interaction multi-modal models such as ColPali in just a few lines of code.
☆806 Updated 6 months ago
Thytu / Agentarium
open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig…
☆925 Updated 6 months ago
data-prep-kit / data-prep-kit
Open source project for data preparation for GenAI applications
☆754 Updated this week
NVIDIA-AI-Blueprints / multimodal-pdf-data-extraction
NVIDIA AI Blueprint for multimodal PDF data extraction for enterprise RAG
☆342 Updated 4 months ago
NVIDIA / NeMo-Agent-Toolkit
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
☆1,149 Updated this week
neural-maze / agentic-patterns-course
Implementing the 4 agentic patterns from scratch
☆1,463 Updated 4 months ago
tjmlabs / ColiVara
Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st…
☆1,183 Updated 3 months ago
weaviate / recipes
This repository shares end-to-end notebooks on how to use various Weaviate features and integrations!
☆814 Updated this week
hhhuang / CAG
Cache-Augmented Generation: A Simple, Efficient Alternative to RAG
☆1,351 Updated 2 months ago
willccbb / verifiers
Verifiers for LLM Reinforcement Learning
☆1,621 Updated this week
neuml / annotateai
📝 Automatically annotate papers using LLMs
☆332 Updated 3 months ago
plurai-ai / intellagent
A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions
☆1,100 Updated last month
HazyResearch / minions
Big & Small LLMs working together
☆1,088 Updated this week
unslothai / notebooks
100+ Fine-tuning LLM Notebooks on Google Colab, Kaggle, and more.
☆2,697 Updated last week
SakanaAI / self-adaptive-llms
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
☆1,131 Updated 6 months ago
huggingface / lighteval
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
☆1,766 Updated this week
KruxAI / ragbuilder
A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data
☆1,451 Updated 2 months ago
huggingface / huggingface-gemma-recipes
Inference, Fine Tuning and many more recipes with Gemma family of models
☆260 Updated 2 weeks ago