meta-llama / synthetic-data-kit
Tool for generating high-quality synthetic datasets
☆1,306 Updated 3 weeks ago
Alternatives and similar repositories for synthetic-data-kit
Users who are interested in synthetic-data-kit are comparing it to the libraries listed below.
- An open-source tool for LLM prompt optimization. ☆666 Updated 3 weeks ago
- Build datasets using natural language ☆532 Updated last month
- UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection ☆1,056 Updated this week
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆404 Updated 2 weeks ago
- Synthetic data curation for post-training and structured data extraction ☆1,535 Updated 2 months ago
- ☆1,141 Updated 2 weeks ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,673 Updated 2 weeks ago
- Fast State-of-the-Art Static Embeddings ☆1,863 Updated last week
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆1,061 Updated 2 months ago
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG ☆1,414 Updated 4 months ago
- ☆684 Updated 5 months ago
- Recipes for shrinking, optimizing, customizing cutting-edge vision models. 💜 ☆1,635 Updated last month
- Create large-scale synthetic training data for model distillation and evaluation ☆619 Updated last week
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,342 Updated 5 months ago
- Open source project for data preparation for GenAI applications ☆822 Updated last week
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆824 Updated 8 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,903 Updated this week
- The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents. ☆1,440 Updated this week
- A system for agentic LLM-powered data processing and ETL ☆3,001 Updated 2 weeks ago
- Optimizing inference proxy for LLMs ☆3,018 Updated last week
- Implementing the 4 agentic patterns from scratch ☆1,597 Updated 7 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,021 Updated this week
- 📝 Automatically annotate papers using LLMs ☆355 Updated 6 months ago
- Kickstart your LLMOps initiative with a flexible, robust, and productive Python package. ☆883 Updated 8 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆3,518 Updated 5 months ago
- GenAI Processors is a lightweight Python library that enables efficient, parallel content processing. ☆1,979 Updated 2 weeks ago
- A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions ☆1,139 Updated 3 months ago
- Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing, that… ☆1,323 Updated last week
- Fast Semantic Text Deduplication & Filtering ☆816 Updated 2 weeks ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,560 Updated 4 months ago