meta-llama / synthetic-data-kit
Tool for generating high-quality synthetic datasets
☆1,238 Updated this week
Alternatives and similar repositories for synthetic-data-kit
Users who are interested in synthetic-data-kit are comparing it to the libraries listed below.
- An open-source tool for general prompt optimization. ☆637 Updated this week
- UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection ☆1,044 Updated this week
- Synthetic data curation for post-training and structured data extraction ☆1,511 Updated 2 months ago
- Build datasets using natural language ☆529 Updated last week
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆398 Updated this week
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆1,056 Updated last month
- Fast State-of-the-Art Static Embeddings ☆1,846 Updated 3 weeks ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,647 Updated this week
- ☆682 Updated 5 months ago
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG ☆1,380 Updated 4 months ago
- ☆1,076 Updated this week
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆826 Updated 8 months ago
- Create large-scale synthetic training data for model distillation and evaluation ☆531 Updated this week
- 📝 Automatically annotate papers using LLMs ☆355 Updated 5 months ago
- Implementing the 4 agentic patterns from scratch ☆1,574 Updated 6 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,973 Updated this week
- A lightweight, local-first, and free experiment tracking library from Hugging Face 🤗 ☆915 Updated this week
- Recipes for shrinking, optimizing, and customizing cutting-edge vision models. 💜 ☆1,624 Updated 2 weeks ago
- Fast Semantic Text Deduplication & Filtering ☆810 Updated 3 weeks ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only a textual task description as input ☆873 Updated 3 months ago
- ☆1,995 Updated last week
- Environments for LLM Reinforcement Learning ☆3,222 Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,899 Updated this week
- A reading list on LLM-based Synthetic Data Generation 🔥 ☆1,423 Updated 3 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆451 Updated last month
- Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for desig… ☆933 Updated 8 months ago
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,264 Updated 5 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,543 Updated 4 months ago
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆893 Updated last week
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real time! ☆1,151 Updated 8 months ago