meta-llama / synthetic-data-kit
Tool for generating high-quality synthetic datasets
☆1,183 Updated last month
Alternatives and similar repositories for synthetic-data-kit
Users who are interested in synthetic-data-kit are comparing it to the libraries listed below.
- An open-source tool for general prompt optimization. ☆616 Updated 3 weeks ago
- Benchmark Large Language Models Reliably On Your Data ☆391 Updated last week
- UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection ☆997 Updated last week
- Build datasets using natural language ☆523 Updated 4 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,495 Updated last month
- Fast State-of-the-Art Static Embeddings ☆1,833 Updated this week
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,564 Updated 8 months ago
- ☆679 Updated 4 months ago
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆1,046 Updated 3 weeks ago
- Implementing the 4 agentic patterns from scratch ☆1,538 Updated 5 months ago
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG ☆1,369 Updated 3 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆820 Updated 7 months ago
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆860 Updated last week
- Generate large synthetic data ☆441 Updated last week
- Pixeltable: AI Data infrastructure providing a declarative, incremental approach for multimodal workloads. ☆786 Updated last week
- A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions ☆1,124 Updated 2 months ago
- Colivara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,260 Updated 4 months ago
- The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents. ☆1,338 Updated this week
- Readymade evaluators for your LLM apps ☆716 Updated last week
- Recipes for shrinking, optimizing, customizing cutting edge vision models. ☆1,586 Updated 3 weeks ago
- Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured … ☆1,334 Updated this week
- Automatically annotate papers using LLMs ☆349 Updated 4 months ago
- On the Theoretical Limitations of Embedding-Based Retrieval ☆490 Updated last week
- ☆1,967 Updated last week
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆428 Updated 2 weeks ago
- A system for agentic LLM-powered data processing and ETL ☆2,812 Updated last week
- Large Concept Models: Language modeling in a sentence representation space ☆2,277 Updated 7 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,878 Updated this week
- AdalFlow: The library to build & auto-optimize LLM applications. ☆3,672 Updated 3 weeks ago
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆267 Updated last month