meta-llama / synthetic-data-kit
Tool for generating high-quality synthetic datasets
☆1,476Updated 3 months ago
Alternatives and similar repositories for synthetic-data-kit
Users who are interested in synthetic-data-kit are comparing it to the libraries listed below.
- An open-source tool for LLM prompt optimization.☆754Updated 3 weeks ago
- UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection☆1,101Updated last week
- 🤗 Benchmark Large Language Models Reliably On Your Data☆425Updated last month
- Synthetic data curation for post-training and structured data extraction☆1,618Updated last week
- Build datasets using natural language☆559Updated 4 months ago
- ☆695Updated 9 months ago
- An interface library for RL post training with environments.☆1,090Updated last week
- Fast State-of-the-Art Static Embeddings☆1,990Updated last month
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a…☆2,044Updated last month
- 🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data.☆674Updated this week
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.☆1,097Updated this week
- Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline☆827Updated this week
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG☆1,464Updated 8 months ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!☆1,185Updated last year
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi…☆3,068Updated 2 weeks ago
- Recipes for shrinking, optimizing, and customizing cutting-edge vision models. 💜☆1,865Updated 3 weeks ago
- Open source project for data preparation for GenAI applications☆891Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends☆2,279Updated last week
- The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.☆1,781Updated this week
- 📝 Automatically annotate papers using LLMs☆401Updated 2 months ago
- ☆2,150Updated 3 weeks ago
- A lightweight, local-first, and 🆓 experiment tracking library from Hugging Face 🤗☆1,234Updated last week
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models☆496Updated 5 months ago
- An Open Source Toolkit For LLM Distillation☆846Updated last month
- Use late-interaction multi-modal models such as ColPali in just a few lines of code.☆841Updated last year
- A system for agentic LLM-powered data processing and ETL☆3,501Updated 2 weeks ago
- A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions☆1,164Updated last month
- Fast Multimodal Semantic Deduplication & Filtering☆879Updated last week
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations!☆933Updated this week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.☆1,592Updated last month