meta-llama / synthetic-data-kit
Tool for generating high-quality synthetic datasets
☆948 · Updated last week
Alternatives and similar repositories for synthetic-data-kit
Users who are interested in synthetic-data-kit are comparing it to the libraries listed below.
- An open-source tool for seamless migration from other LLMs to Llama, and for general prompt optimization. ☆482 · Updated 2 weeks ago
- Benchmark Large Language Models Reliably On Your Data ☆329 · Updated this week
- Build datasets using natural language ☆492 · Updated last month
- Synthetic data curation for post-training and structured data extraction ☆1,404 · Updated this week
- UQLM: Uncertainty Quantification for Language Models is a Python package for UQ-based LLM hallucination detection ☆747 · Updated this week
- Recipes for shrinking, optimizing, customizing cutting-edge vision models. ☆1,495 · Updated 2 weeks ago
- Implementing the 4 agentic patterns from scratch ☆1,375 · Updated 3 months ago
- ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embedding. ColiVara has st… ☆1,136 · Updated last month
- ☆668 · Updated last month
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆797 · Updated 4 months ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,434 · Updated 5 months ago
- Fast State-of-the-Art Static Embeddings ☆1,740 · Updated 2 weeks ago
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆950 · Updated last week
- Cache-Augmented Generation: A Simple, Efficient Alternative to RAG ☆1,319 · Updated 3 weeks ago
- A toolkit to create an optimal production-ready Retrieval Augmented Generation (RAG) setup for your data ☆1,432 · Updated last month
- Generate large synthetic data using an LLM ☆426 · Updated this week
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆777 · Updated this week
- Automatically annotate papers using LLMs ☆324 · Updated 2 months ago
- This repository provides an advanced Retrieval-Augmented Generation (RAG) solution for complex question answering. It uses sophisticated … ☆1,270 · Updated this week
- ☆660 · Updated 2 weeks ago
- A production-ready FastAPI template for building AI agent applications with LangGraph integration. This template provides a robust founda… ☆849 · Updated last month
- ☆1,829 · Updated 2 weeks ago
- Large Concept Models: Language modeling in a sentence representation space ☆2,233 · Updated 4 months ago
- A reading list on LLM-based Synthetic Data Generation ☆1,306 · Updated 2 weeks ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,757 · Updated last week
- Fine-tune LLMs for free with 100+ Notebooks on Google Colab, Kaggle, and more. ☆2,303 · Updated last week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆313 · Updated 2 weeks ago
- 🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL ☆1,015 · Updated last week
- PageIndex: Document Index System for Reasoning-based RAG ☆1,056 · Updated last week
- AdalFlow: The library to build & auto-optimize LLM applications. ☆3,328 · Updated 2 months ago