data-prep-kit / data-prep-kit
Open source project for data preparation for GenAI applications
☆889 · Updated this week
Alternatives and similar repositories for data-prep-kit
Users who are interested in data-prep-kit are comparing it to the libraries listed below.
- Granite Snack Cookbook -- easily consumable recipes (python notebooks) that showcase the capabilities of the Granite models ☆335 · Updated last month
- An open-source tool for LLM prompt optimization. ☆742 · Updated last week
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆512 · Updated 11 months ago
- CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, … ☆637 · Updated last month
- 🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data. ☆620 · Updated last week
- Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline ☆811 · Updated this week
- Deploy and share agents with open infrastructure, free from vendor lock-in. ☆897 · Updated last week
- OpenTelemetry Instrumentation for AI Observability ☆809 · Updated this week
- Simple package to extract text with coordinates from programmatic PDFs ☆230 · Updated this week
- ☆178 · Updated last month
- Tool for generating high quality Synthetic datasets ☆1,463 · Updated 2 months ago
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆763 · Updated last month
- Open protocol for communication between AI agents, applications, and humans. ☆923 · Updated 4 months ago
- ☆269 · Updated 6 months ago
- Ranking LLMs on agentic tasks ☆208 · Updated 2 months ago
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆933 · Updated last month
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆799 · Updated this week
- 👩🏻🍳 A collection of example notebooks using Haystack ☆519 · Updated this week
- This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline. ☆431 · Updated this week
- Build datasets using natural language ☆558 · Updated 3 months ago
- Taxonomy tree that will allow you to create models tuned with your data ☆289 · Updated 4 months ago
- A python library to define and validate data types in Docling. ☆219 · Updated this week
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… ☆465 · Updated last month
- A Lightweight Library for AI Observability ☆253 · Updated 10 months ago
- Additional packages (components, document stores and the likes) to extend the capabilities of Haystack ☆178 · Updated this week
- Run the entire bee application stack using docker-compose ☆154 · Updated 9 months ago
- Python SDK for Llama Stack ☆192 · Updated this week
- An Awesome list of curated DSPy resources. ☆505 · Updated last month
- InstructLab Core package. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data… ☆1,399 · Updated this week
- Prompt Declaration Language (PDL) is a declarative prompt programming language. ☆271 · Updated this week