data-prep-kit / data-prep-kit
Open source project for data preparation for GenAI applications
☆897 Updated this week
Alternatives and similar repositories for data-prep-kit
Users who are interested in data-prep-kit compare it to the libraries listed below.
- Granite Snack Cookbook -- easily consumable recipes (python notebooks) that showcase the capabilities of the Granite models ☆343 Updated last week
- Deploy and share agents with open infrastructure, free from vendor lock-in. ☆944 Updated this week
- CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, … ☆665 Updated 2 weeks ago
- An open-source tool for LLM prompt optimization. ☆759 Updated last week
- Run the entire bee application stack using docker-compose ☆154 Updated 10 months ago
- Tool for generating high quality Synthetic datasets ☆1,484 Updated 3 months ago
- This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline. ☆460 Updated this week
- ☆185 Updated 2 weeks ago
- VectorHub is a free, open-source learning website for people (software developers to senior ML architects) interested in adding vector re… ☆511 Updated this week
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆93 Updated 2 months ago
- OpenTelemetry Instrumentation for AI Observability ☆843 Updated this week
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆518 Updated 11 months ago
- Simple package to extract text with coordinates from programmatic PDFs ☆236 Updated last week
- 👩🏻🍳 A collection of example notebooks using Haystack ☆523 Updated last week
- InstructLab Core package. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data… ☆1,406 Updated last week
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆426 Updated last month
- Docling core data types and transformations ☆225 Updated last week
- Open protocol for communication between AI agents, applications, and humans. ☆934 Updated 5 months ago
- AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, acc… ☆1,545 Updated this week
- Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline ☆830 Updated this week
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆934 Updated last week
- Build datasets using natural language ☆566 Updated 4 months ago
- TAG-Bench: A benchmark for table-augmented generation (TAG) ☆766 Updated 10 months ago
- UQLM: Uncertainty Quantification for Language Models, is a Python package for UQ-based LLM hallucination detection ☆1,106 Updated this week
- Python SDK for Llama Stack ☆192 Updated this week
- ☆270 Updated 7 months ago
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… ☆470 Updated last month
- 🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data. ☆585 Updated this week
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆764 Updated last month
- An Awesome list of curated DSPy resources. ☆511 Updated last month